0

I have a wide table with only 1 row. Each column has a different name. I would like to combine 3 of the columns to form 1 "date" column and then transform the data to create a long table. Also my data table names will vary. For example I might have one table that has only 2 "ernMvx" and another table that has 20 "ernMvx" so I am using grep.

#data
dput(x)
structure(list(ernDate1 = "1/29/2019", ernDate2 = "11/1/2018", 
    ernDate3 = "7/31/2018", ernMv1 = 6.8335, ernMv2 = -6.6331, 
    ernMv3 = 5.891, ernStraPct1 = 6.8304, ernStraPct2 = 7.074, 
    ernStraPct3 = 5.2632), row.names = c(NA, -1L), class = "data.frame")

print(x)
ernDate1  ernDate2  ernDate3 ernMv1  ernMv2 ernMv3 ernStraPct1 ernStraPct2 ernStraPct3
1 1/29/2019 11/1/2018 7/31/2018 6.8335 -6.6331  5.891      6.8304       7.074      5.2632

date = x %>% select(grep("ernDate", names(x)))
ernMv = x %>% select(grep("ernMv",names(x)))
ernStraPct = x%>% select(grep("ernStra",names(x)))

new.data = as.data.frame(cbind(unlist(date), unlist(ernMv), unlist(ernStraPct)))
names(new.data) = c("date", "ernMv", "ernStraPct")
rownames(new.data) = c(1:length(new.data$date))
print(new.data)
             date   ernMv ernStraPct
      1 1/29/2019  6.8335     6.8304
      2 11/1/2018 -6.6331      7.074
      3 7/31/2018   5.891     5.2632

This is my desired output, but it seems very tedious. Is there a way to do this quicker using a tidy verse function?I tried using reshape2::melt, but I seem to be having difficulties using it for a 1 row table. Thank you

Cettt
  • 9,336
  • 7
  • 25
  • 48
Jordan Wrong
  • 933
  • 5
  • 17

2 Answers2

3

Here is a fast data.table option making use of the patterns function to match column names

library(data.table)
melt(
    as.data.table(x),
    measure = patterns("ernDate", "ernMv", "ernStraPct"),
    value.name = c("date", "ernMv", "ernStraPct"))
#   variable      date   ernMv ernStraPct
#1:        1 1/29/2019  6.8335     6.8304
#2:        2 11/1/2018 -6.6331     7.0740
#3:        3 7/31/2018  5.8910     5.2632 

Or even more succinct (thanks @markus)

cols <- unique(sub("\\d$", "", names(x)))
melt(as.data.table(x), measure.vars = patterns(cols), value.name = cols)

Or a tidyverse option using a positive look-ahead to separate entries into a column name and number

library(tidyverse)
x %>%
    gather(k, v) %>%
    separate(k, c("col", "row"), sep = "(?=\\d)") %>%
    spread(col, v)
#  row   ernDate   ernMv ernStraPct
#1   1 1/29/2019  6.8335     6.8304
#2   2 11/1/2018 -6.6331      7.074
#3   3 7/31/2018   5.891     5.2632
Maurits Evers
  • 42,255
  • 4
  • 27
  • 51
1

I'm assuming each column name ends with a number that can be interpreted, as a record ID.

x %>%
  gather(name, value) %>%
  mutate(id = gsub('(.+)([0-9]+)', '\\2', name),
         name = gsub('(.+)([0-9]+)', '\\1', name)) %>%
  spread(name, value)
kwiscion
  • 486
  • 2
  • 6
  • thanks kwiscion. This worked perfectly, but Mauritis had a more indepth answer and i like the datatable approach so I had to give him the check mark. Thanks for your help – Jordan Wrong Mar 06 '19 at 22:52