I have a large dataset for which I want to create 50 new variables whose values are conditional on values in existing columns, and whose names reflect those conditions. To make this more concrete, here is an example:
library(dplyr)
df <- tibble("a" = runif(10, 1990, 2000),
             "event" = 1995) %>%
  mutate("relative_event" = a - event)
Now, with this dataset, I would like to create dummy variables that code whether a given observation falls one year prior to the event, two years prior, etc., and likewise for the years after the event. One clumsy way to do this (which works) is:
df <- df %>%
  mutate("event_b1" = ifelse((relative_event <= 0) & (relative_event > -1), 1, 0)) %>%
  mutate("event_b2" = ifelse((relative_event <= -1) & (relative_event > -2), 1, 0)) %>% # etc. with more lags
  mutate("event_f1" = ifelse((relative_event > 0) & (relative_event <= 1), 1, 0)) %>%
  mutate("event_f2" = ifelse((relative_event > 1) & (relative_event <= 2), 1, 0)) # etc. with more leads
where b1 is for "one year before" and f2 is for "2 years forward". The result looks like this:
# A tibble: 10 x 7
       a event relative_event event_b1 event_b2 event_f1 event_f2
   <dbl> <dbl>          <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
 1 1993.  1995         -1.94         0        1        0        0
 2 1992.  1995         -2.59         0        0        0        0
 3 2000.  1995          4.75         0        0        0        0
 4 1998.  1995          3.25         0        0        0        0
 5 1991.  1995         -3.88         0        0        0        0
 6 1992.  1995         -3.02         0        0        0        0
 7 1996.  1995          1.08         0        0        0        1
 8 1994.  1995         -1.04         0        1        0        0
 9 1993.  1995         -2.22         0        0        0        0
10 1995.  1995         -0.302        1        0        0        0
Since I have more than 50 columns to create, I would like to know how to do this automatically, so that I don't have to copy-paste 49 times and manually change the condition and the variable name each time. I spent time looking on SO on this thread, this one and on CV as well, but I am still clueless. I tried the following code, which does not work:
for (i in 0:10) {
  if (i < 0) {
    event_bi <- paste0("event_b", i)
    df <- df %>%
      mutate(get(event_bi) = ifelse((relative_event <= -(i - 1)) & (relative_event > -i), 1, 0))
  }
}
Ideally, I'd like to learn how to do this with dplyr, but if there is an obvious base R solution, I'm happy to learn that as well.
Thanks!
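One possible approach, offered as a minimal sketch rather than a definitive answer: loop over the number of years and build each column name as a string, then inject it on the left-hand side of `mutate()` with `!!name :=`. The seed, the window size `n_years`, and the helper names `before_name` and `after_name` are assumptions introduced only for illustration. The forward bins are taken as half-open intervals (i - 1, i], which is also an assumption about the intended boundaries.

```r
library(dplyr)

set.seed(42)  # assumption: seed added only so the sketch is reproducible

df <- tibble(a = runif(10, 1990, 2000), event = 1995) %>%
  mutate(relative_event = a - event)

n_years <- 5  # hypothetical window; raise it to cover as many years as needed

for (i in seq_len(n_years)) {
  before_name <- paste0("event_b", i)  # e.g. "event_b1"
  after_name  <- paste0("event_f", i)  # e.g. "event_f1"

  df <- df %>%
    mutate(
      # i-th year before the event: relative_event in (-i, -(i - 1)]
      !!before_name := as.numeric(relative_event <= -(i - 1) & relative_event > -i),
      # i-th year after the event: relative_event in (i - 1, i]
      !!after_name := as.numeric(relative_event > (i - 1) & relative_event <= i)
    )
}

# Base R equivalent of one assignment inside the loop:
# df[[before_name]] <- as.numeric(df$relative_event <= -(i - 1) & df$relative_event > -i)
```

The `:=` operator is needed because `mutate()` does not accept an expression such as `get(event_bi)` on the left of `=`. With `n_years <- 5` and `relative_event` between -5 and 5, every row should end up with exactly one dummy set to 1.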