0

I have a dataframe as follows:

dataDF <- data.frame(
          id = 1:5,
          to_choose = c('red', 'blue', 'red', 'green', 'yellow'),
          red_value = c(1,2,3,4,5),
          blue_value = c(6,7,8,9,10),
         yellow_value = c(11,12,13,14,15)
)

id to_choose red_value blue_value yellow_value
 1       red         1          6           11
 2      blue         2          7           12
 3       red         3          8           13
 4     green         4          9           14
 5    yellow         5         10           15

I want to create a new column value, which is the value from the appropriate column based on the to_choose column.

I could do this with an ifelse as follows

mutate(dataDF,
   value = ifelse(to_choose == 'red', red_value,
                  ifelse(to_choose == 'blue', blue_value,
                         ifelse(to_choose == 'yellow', yellow_value, NA))))

To give

id to_choose red_value blue_value yellow_value value
 1       red         1          6           11     1
 2      blue         2          7           12     7
 3       red         3          8           13     3
 4     green         4          9           14    NA
 5    yellow         5         10           15    15

But if there a simpler one line way of doing this along the lines of

mutate(dataDF, value = paste(to_choose, 'value', sep = '_'))
kath
  • 7,191
  • 15
  • 31
user1165199
  • 5,057
  • 12
  • 40
  • 57

2 Answers2

4
dataDF %>% 
  gather(var, value , 3:5) %>%   
  mutate(var = gsub('_value', '', var))  %>% 
  filter(to_choose == var)
kath
  • 7,191
  • 15
  • 31
Mouad_Seridi
  • 2,401
  • 12
  • 25
2

A base R approach using mapply

dataDF$value <- mapply(function(x, y) if(length(y) > 0)  dataDF[x, y] else NA, 
      1:nrow(dataDF), sapply(dataDF$to_choose, function(x) grep(x, names(dataDF))))


dataDF

#  id to_choose red_value blue_value yellow_value value
#1  1       red         1          6           11     1
#2  2      blue         2          7           12     7
#3  3       red         3          8           13     3
#4  4     green         4          9           14    NA
#5  5    yellow         5         10           15    15

The idea is to get the appropriate row and column indices to subset upon. Row indices we are already know that we need to get value for each row of the dataframe. As far as getting the appropriate column is concerned we use grep over to_choose to find the column index from where the value needs to be extracted.

Ronak Shah
  • 286,338
  • 16
  • 97
  • 143