0

I have a df that looks something like this:

id <- c(1:8)
born.swis <-  c(0, 1, NA, NA, NA, 2, NA, NA)
born2005 <- c(NA, NA, 2, NA, NA, NA, NA, NA)
born2006 <- c(NA, NA, NA, 1, NA, NA, NA, NA)
born2007 <- c(NA, NA, NA, NA, NA, NA, NA, 1)
born2008 <- c(NA, NA, NA, NA, NA, NA, 2, NA)
born2009 <- c(NA, NA, NA, NA, NA, NA, NA, NA)
df <- data.frame(id, born.swis, born2005, born2006, born2007, born2008, born2009)

I'm trying to mutate born.swis based on the values of the other variables. Basically, I want the value bornswis to be filled with the value one of the other variables IF born.id is NA and IF it is not NA for that variable. Something like this:

id <- c(1:8)
born.swis <-  c(0, 1, 2, 1, NA, 2, 2,1)
df.desired <- data.frame(id, born.swis)

I tried several things with mutate and ifelse, like this:

df <- df%>%
  mutate(born.swis = ifelse(is.na(born.swis), born2005, NA,
                            ifelse(is.na(born.swis), born2006, NA,
                                   ifelse(is.na(born.swis), born2007, NA,
                                          ifelse(is.na(born.swis), born2008, NA,
                                                 ifelse(is.na(born.swis), born2009, NA,)
         )))))

and similar things, but I'm not able to reach my desired outcome.

Any ideas?

Many thanks!

AntVal
  • 415
  • 1
  • 9

3 Answers3

3

One dplyr option could be:

df %>%
 mutate(born.swis_res = coalesce(!!!select(., starts_with("born"))))

  id born.swis born2005 born2006 born2007 born2008 born2009 born.swis_res
1  1         0       NA       NA       NA       NA       NA             0
2  2         1       NA       NA       NA       NA       NA             1
3  3        NA        2       NA       NA       NA       NA             2
4  4        NA       NA        1       NA       NA       NA             1
5  5        NA       NA       NA       NA       NA       NA            NA
6  6         2       NA       NA       NA       NA       NA             2
7  7        NA       NA       NA       NA        2       NA             2
8  8        NA       NA       NA        1       NA       NA             1

Or with dplyr 1.0.0:

df %>%
 mutate(born.swis_res = Reduce(coalesce, across(starts_with("born"))))
tmfmnk
  • 31,986
  • 3
  • 26
  • 41
1

In base R, you can use max.col :

df[cbind(1:nrow(df), max.col(!is.na(df[-1])) + 1 )]
#[1]  0  1  2  1 NA  2  2  1

max.col gives the column position of the first non-NA value in each row (exlcuding first column), we create a matrix with row-index and use it to subset df.

Ronak Shah
  • 286,338
  • 16
  • 97
  • 143
1

base R

df$born.swis <- apply(df[-1], 1, function(x) ifelse(all(is.na(x)), NA, sum(x, na.rm = T)))
Yuriy Saraykin
  • 4,182
  • 1
  • 4
  • 9