dplyr: create new column with values from other specified columns

Question

I have a tibble:

library(tibble)
library(dplyr)

(
  data <- tibble(
    a = 1:3,
    b = 4:6,
    mycol = c('a', 'b', 'a')
  )
)
#> # A tibble: 3 x 3
#>       a     b mycol
#>   <int> <int> <chr>
#> 1     1     4 a    
#> 2     2     5 b    
#> 3     3     6 a

Using dplyr::mutate I'd like to create a new column called value which uses a value from either column a or b, depending on which column name is specified in the mycol column.

(
  desired <- tibble(
    a = 1:3,
    b = 4:6,
    mycol = c('a', 'b', 'a'),
    value = c(1, 5, 3)
  )
)
#> # A tibble: 3 x 4
#>       a     b mycol value
#>   <int> <int> <chr> <dbl>
#> 1     1     4 a         1
#> 2     2     5 b         5
#> 3     3     6 a         3

This attempt always takes the values from column a, regardless of what mycol says:

data %>%
  mutate(value = a)
#> # A tibble: 3 x 4
#>       a     b mycol value
#>   <int> <int> <chr> <int>
#> 1     1     4 a         1
#> 2     2     5 b         2
#> 3     3     6 a         3

This attempt copies the column names stored in mycol into the new column, rather than looking up the values in the column each name refers to:

data %>%
  mutate(value = mycol)
#> # A tibble: 3 x 4
#>       a     b mycol value
#>   <int> <int> <chr> <chr>
#> 1     1     4 a     a    
#> 2     2     5 b     b    
#> 3     3     6 a     a

I've tried various combinations of !!, quo(), etc., but I don't fully understand what's going on under the hood in terms of non-standard evaluation (NSE).

@Jaap has marked this as a duplicate but I'd still like to see a dplyr/tidyverse approach using NSE rather than using base R if possible.

@Jaap I'd like to see a dplyr/tidyverse approach using NSE if possible, rather than using base R. I've amended the question to reflect this. Would you be able to remove the duplicate status? — Greg, Sep 01 '18 at 11:49
At the moment I'm not going to do that as there is a tidyverse approach in the linked Q&A. Moreover, you said: *"I've tried various combinations of `!!`, `quo()`"*, but you fail to specify those attempts. See also the Q&A on how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). — Jaap, Sep 01 '18 at 12:26

missuse · Answer 1 · 2018-09-01T11:29:33.113

Here is one approach:

# `data` is the tibble from the question
data %>%
  mutate(value = ifelse(mycol == "a", a, b))
#> # A tibble: 3 x 4
#>       a     b mycol value
#>   <int> <int> <chr> <int>
#> 1     1     4 a         1
#> 2     2     5 b         5
#> 3     3     6 a         3
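
The `ifelse()` call above names both columns explicitly. For the dplyr-style lookup the question asks for, one possible sketch (not necessarily the approach from the linked duplicate) is to work row by row and fetch the column whose name is stored in `mycol` with base R's `get()`. It assumes that every entry of `mycol` names an existing column and that those columns share a type; `result` is an illustrative name:

```r
library(dplyr)

# Same data as in the question (dplyr re-exports tibble())
data <- tibble(
  a = 1:3,
  b = 4:6,
  mycol = c("a", "b", "a")
)

result <- data %>%
  rowwise() %>%                  # evaluate mutate() one row at a time
  mutate(value = get(mycol)) %>% # look up the column named in this row's mycol
  ungroup()                      # drop the row-wise grouping again

result$value
#> [1] 1 5 3
```

With `rowwise()`, `mycol` is a single string in each evaluation, so `get()` resolves it to that row's value of the named column. This is likely to be slower than a vectorised approach on large tables.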

And here is a more general way in base R:

data$value <- diag(as.matrix(data[, data$mycol]))

A more complex example:

df <- tibble(
  a = 1:4,
  b = 4:7,
  c = 5:8,
  mycol = c("a", "b", "a", "c")
)

df$value <- diag(as.matrix(df[, df$mycol]))
df
#> # A tibble: 4 x 5
#>       a     b     c mycol value
#>   <int> <int> <int> <chr> <int>
#> 1     1     4     5 a         1
#> 2     2     5     6 b         5
#> 3     3     6     7 a         3
#> 4     4     7     8 c         8
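
A note on why the `diag()` trick works: `df[, df$mycol]` selects one column per row, so the result is an n x n matrix whose diagonal holds the wanted values. That builds n² cells to keep n of them, which may matter for long tables. A minimal sketch of a lighter variant, assuming all candidate columns share a type, indexes a matrix with (row, column) pairs instead. The names `value_cols` and `idx` are only illustrative, and a plain data frame is used so that the block needs no packages:

```r
# Same data as the more complex example, as a base data frame
df <- data.frame(
  a = 1:4,
  b = 4:7,
  c = 5:8,
  mycol = c("a", "b", "a", "c"),
  stringsAsFactors = FALSE
)

# Columns that mycol may refer to (illustrative name)
value_cols <- c("a", "b", "c")

# One (row, column) pair per row of df
idx <- cbind(seq_len(nrow(df)), match(df$mycol, value_cols))

# Indexing a matrix with a two-column matrix picks one cell per pair
df$value <- as.matrix(df[value_cols])[idx]
df$value
#> [1] 1 5 3 8
```

If `mycol` contains a name that is not in `value_cols`, `match()` returns `NA` for that row and the corresponding `value` is `NA`.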

edited Sep 01 '18 at 11:29

answered Sep 01 '18 at 11:17

missuse

Thanks. This is just a simple example but I'd like to use the same method for a larger number of columns without specifying them manually. Is there a more automatic solution? – Greg Sep 01 '18 at 11:22
@Greg I have added a more general base R approach. – missuse Sep 01 '18 at 11:32
Thanks @missuse. It would still be great to see a tidyverse approach using NSE if possible but I've upvoted your answer. I've also amended the question to make this clearer. – Greg Sep 01 '18 at 11:48
