0

I have a data as follows:

 col1 <- c(0.1,0.2,0.0,0.5,0.6)
 col2 <- c(2,2,4,5,6)
 col3 <- c(1,4,3,4,5) 
 col4 <- c(2,3,4,4,6)
 col5 <- c(5,3,3,2,1)
 data.frame(col1,col2,col3,col4,col5)

   col1 col2 col3 col4 col5
 1  0.1    2    1    2    5
 2  0.2    2    4    3    3
 3  0.0    4    3    4    3
 4  0.5    5    4    4    2
 5  0.6    6    5    6    1

I would like to add a new column with "yes" value where in each row at least one column from col2 to column 5 is equal to 4 and "no" when the data does not meet the criteria.

So the output would look like as:

   col1 col2 col3 col4 col5 col6
 1  0.1    2    1    2    5  no
 2  0.2    2    4    3    3  yes
 3  0.0    4    3    4    3  yes
 4  0.5    5    4    4    2  yes
 5  0.6    6    5    6    1  no

here is my command:

new.df <- df %>% mutate(df, col6 = funs(ifelse(abs(vars(c(2:5) == 4),"yes", "no")

But I can not get the required output. do you have any idea how can I use dplyr, mutate and if else function to get the result?

say.ff
  • 285
  • 3
  • 13

3 Answers3

3

We can sum when cell ==4 using rowSums

library(dplyr)
df %>% mutate(col6 = ifelse(rowSums(.[,c(2:5)]==4)>0, 'yes', 'no'))

col1 col2 col3 col4 col5 col6
1  0.1    2    1    2    5   no
2  0.2    2    4    3    3  yes
3  0.0    4    3    4    3  yes
4  0.5    5    4    4    2  yes
5  0.6    6    5    6    1   no

Using @thelatemail suggestion, we can rewrite ifelse as below:

df %>% mutate(col6 = ifelse(Reduce(`|`, lapply(.[,c(2:5)], `==`, 4)), 'yes', 'no'))
A. Suliman
  • 11,665
  • 5
  • 16
  • 31
1

You can try with any

ifelse(apply(df[,-1]==4,1,any),'yes','no')
[1] "no"  "yes" "yes" "yes" "no" 
BENY
  • 258,262
  • 17
  • 121
  • 165
1

Here are some approaches:

1) rowSums Compare all but col1 to 4, sum the comparisons and for each one that is strictly positive choose yes else no.

library(dplyr)

dd %>% 
  mutate(col6 = if_else(rowSums(select(., -col1) == 4) > 0, "yes", "no"))

giving:

  col1 col2 col3 col4 col5 col6
1  0.1    2    1    2    5   no
2  0.2    2    4    3    3  yes
3  0.0    4    3    4    3  yes
4  0.5    5    4    4    2  yes
5  0.6    6    5    6    1   no

1a) Reduce A variation of this using Reduce would be:

dd %>% 
  mutate(col6 = 
    if_else(Reduce("|", (select(., -col1) == 4) %>% as.data.frame), "yes", "no"))

2) gathr Convert to long form, summarize and join back to the original data frame.

library(dplyr)
library(tidyr)

dd %>%
  mutate(seq = 1:n()) %>%
  left_join(gather(., key, value, -seq, -col1) %>%
    group_by(seq) %>%
    summarize(col6 = if_else(any(value == 4), "yes", "no")) %>%
    ungroup) %>%
  select(-seq)

If col1 has unique values, as it does in the sample data shown in the question, then it could be used in place of seq shortening the code to:

dd %>%
  left_join(gather(., key, value, -col1) %>%
    group_by(col1) %>%
    summarize(col6 = if_else(any(value == 4), "yes", "no")) %>%
    ungroup)
G. Grothendieck
  • 211,268
  • 15
  • 177
  • 297