Base R reshape() versus tidyverse

Question

Recently I was looking around to find a quick way to create a long data frame out of a wide data frame containing patient data with multiple measurements per patient (let's say a heart rate, blood pressure, and saturation score for up to five days). While googling around I had expected to find numerous solutions from the tidyverse and/or the reshape2 package. However, I actual found this gem from base-R.

Now I cannot let go of trying to find an equally nice "one-liner" using one or more of the packages in the tidyverse and/or reshape2. So far, nothing as nice comes up and I'd appreciate a hint. Note that this question is explicitly about a pretty/neat, clever solution rather than finding the actual solution.

Data from linked post:

id <- paste('x', "1.", 1:10, sep="")
set.seed(10)
DF <- data.frame(id, trt=sample(c('cnt', 'tr'), 10, T), work.T1=runif(10),
    play.T1=runif(10), talk.T1=runif(10), total.T1=runif(10),
    work.T2=runif(10), play.T2=runif(10), talk.T2=runif(10), 
    total.T2=runif(10))

score 2 · Accepted Answer · answered Jun 20 '17 at 08:13

I don't think there is a tidyverse solution with a single function call, but a good solution is not that complicated either. We need to gather first, then separate the time and keys, and then spread it back again.

DF %>% 
    gather(key, val, -id, -trt) %>% 
    separate(key, c('key', 'time')) %>% 
    spread(key, val)

      id trt time       play       talk      total       work
1   x1.1  tr   T1 0.86472123 0.53559704 0.27548386 0.65165567
2   x1.1  tr   T2 0.03188816 0.07557029 0.86138244 0.35432806
3  x1.10 cnt   T1 0.35589774 0.50050323 0.80154700 0.83613414
4  x1.10 cnt   T2 0.21913855 0.20795168 0.17015172 0.50528560
5   x1.2 cnt   T1 0.61535242 0.09308813 0.22890394 0.56773775
6   x1.2 cnt   T2 0.11446759 0.53442678 0.46439198 0.93643254
7   x1.3 cnt   T1 0.77510990 0.16980304 0.01443391 0.11350898
8   x1.3 cnt   T2 0.46893548 0.64135658 0.22286743 0.24586639
9   x1.4  tr   T1 0.35556869 0.89983245 0.72896456 0.59592531
10  x1.4  tr   T2 0.39698674 0.52573932 0.62354960 0.47314146
11  x1.5 cnt   T1 0.40584997 0.42263761 0.24988047 0.35804998
12  x1.5 cnt   T2 0.83361919 0.03928139 0.20364770 0.19156087
13  x1.6 cnt   T1 0.70664691 0.74774647 0.16118328 0.42880942
14  x1.6 cnt   T2 0.76112174 0.54585984 0.01967341 0.58322197
15  x1.7 cnt   T1 0.83828767 0.82265258 0.01704265 0.05190332
16  x1.7 cnt   T2 0.57335645 0.37276310 0.79799301 0.45947319
17  x1.8 cnt   T1 0.23958913 0.95465365 0.48610035 0.26417767
18  x1.8 cnt   T2 0.44750805 0.96130241 0.27431890 0.46743405
19  x1.9  tr   T1 0.77077153 0.68544451 0.10290017 0.39879073
20  x1.9  tr   T2 0.08380201 0.25734157 0.16660910 0.39983256

Ah, that's it, now I realize that by removing the vars that are ID "cruft", gather actually gathers the whole remaining lot under one heading. That can then be separated out. I was looking too much for a solution that already specified what columns would go under which new column. — Paul Lemmens, Jun 20 '17 at 09:30

Base R reshape() versus tidyverse

1 Answers1