4

In R, I have data where each person has multiple session dates, and the scores on some tests, but this is all in one row. I would like to change it so I have multiple rows with the persons info, but only one of the session dates and corresponding test scores, and do this for every person. Also, each person may have completed different number of sessions.

Ex:

ID  Name  Session1Date  Score  Score  Session2Date  Score  Score
23  sjfd  20150904      2      3      20150908      5      7
28  addf  20150905      3      4      20150910      6      8

To:

ID  Name  SessionDate  Score  Score
23  sjfd  20150904     2      3     
23  sjfd  20150908     5      7
28  addf  20150905     3      4
28  addf  20150910     6      8
Swanny
  • 189
  • 11

2 Answers2

6

You can use melt from the devel version of data.table ie. v1.9.5. It can take multiple 'measure' columns as a list. Instructions to install are here

library(data.table)#v1.9.5+
melt(setDT(df1), measure = patterns("Date$", "Score(\\.2)*$", "Score\\.[13]"))
#   ID Name variable   value1 value2 value3
#1: 23 sjfd        1 20150904      2      3
#2: 28 addf        1 20150905      3      4
#3: 23 sjfd        2 20150908      5      7
#4: 28 addf        2 20150910      6      8

Or using reshape from base R, we can specify the direction as 'long' and varying as a list of column index

 res <- reshape(df1, idvar=c('ID', 'Name'), varying=list(c(3,6), c(4,7), 
         c(5,8)), direction='long')
 res  
 #            ID Name time Session1Date Score Score.1
 #23.sjfd.1 23 sjfd    1     20150904     2       3
 #28.addf.1 28 addf    1     20150905     3       4
 #23.sjfd.2 23 sjfd    2     20150908     5       7
 #28.addf.2 28 addf    2     20150910     6       8

If needed, the rownames can be changed

 row.names(res) <- NULL 

Update

If the columns follow a specific order i.e. 3rd grouped with 6th, 4th with 7th, 5th with 8th, we can create a matrix of column index and then split to get the list for the varying argument in reshape.

 m1 <- matrix(3:8,ncol=2)
 lst <- split(m1, row(m1))
 reshape(df1, idvar=c('ID', 'Name'), varying=lst, direction='long')
Arun
  • 108,644
  • 21
  • 263
  • 366
akrun
  • 674,427
  • 24
  • 381
  • 486
  • Is there a way to do this without typing in all the variable names because I have 625 columns? – Swanny Jun 17 '15 at 16:21
  • @Michael You can use `merged.stack` from `splitstackshape` but we need to have some patterns for the common column names or we need to know if the column follow a specific sequence – akrun Jun 17 '15 at 16:23
  • @Michael I updated a possible option if the columns are following a specific order as in the example. We create a `list` after splitting the column index of the 'measure' columns created in a matrix. – akrun Jun 17 '15 at 16:28
  • I'm sorry I'm pretty new to R, but in your answer using reshape, what is varying=list specifying, I can't seem to figure out what those numbers correspond to? – Swanny Jun 17 '15 at 16:36
  • @Michael `lst` is `list` of column index. If you type `lst` on the R console, you will see that the first lst element is `lst[[1]]` corresponds to `3 6` ie, the columns we want to group together. Likewise, we group `4 7` and `5 8` as they are separate list elements – akrun Jun 17 '15 at 16:38
  • @Michael In your original dataset, the column grouping may be different. But, you can create the list similarly if you know which columns you need to group – akrun Jun 17 '15 at 16:42
  • 1
    @Swanny, see the updated solution. You can use regex in `melt` with the help of `patterns()` function. – Arun Jun 18 '15 at 14:35
  • @Arun the `patterns()` is new for me. Great addition! (looks like I have reinstall the devel version :-) – akrun Jun 18 '15 at 14:47
  • Thanks! Added it recently. Looks cleaner – Arun Jun 18 '15 at 16:01
0

If your data frame name is data

Use this

data1 <- data[1:5]
data2 <- data[c(1,2,6,7,8)]
newdata <- rbind(data1,data2)

This works for the example you've given. You might have to change column names appropriately in data1 and data2 for a proper rbind