I would like to use dplyr to apply an operation to only part of my data, without dropping the rows that the operation is not applied to. My data looks as follows:
   country country-year year    a    b
1  France  France2000   2000   NA   NA
2  France  France2001   2001 1000 1000
3  France  France2002   2002   NA   NA
4  France  France2003   2003 1600 2200
5  France  France2004   2004   NA   NA
6  UK      UK2000       2000 1000 1000
7  UK      UK2001       2001   NA   NA
8  UK      UK2002       2002   NA   NA
9  UK      UK2003       2003   NA   NA
10 UK      UK2004       2004   NA   NA
11 Germany UK2000       2000   NA   NA
12 Germany UK2001       2001   NA   NA
13 Germany UK2002       2002   NA   NA
14 Germany UK2003       2003   NA   NA
15 Germany UK2004       2004   NA   NA
As an example:
library(dplyr)
library(zoo)  # provides na.fill()

# I first group the data by country
df <- df %>%
  group_by(country) %>%
Within each group, I want to interpolate (interpolate only, without extrapolating!) when the group has more than one non-missing observation, but I do not want to remove the groups that have one observation or fewer.
I was wondering whether I can select the countries where n > 1
and carry out the operation only for those groups:
  mutate_at(vars(a:b), ~ na.fill(.x, c(NA, "extend", NA)))
I also thought about the following, but I cannot get the syntax right:
mutate_if(is.numeric,~if(n()>1 NA else na.fill(.x,c(NA, "extend", NA)))
The desired result would be:
   country country-year year    a    b
1  France  France2000   2000   NA   NA
2  France  France2001   2001 1000 1000
3  France  France2002   2002 1300 1600
4  France  France2003   2003 1600 2200
5  France  France2004   2004   NA   NA
6  UK      UK2000       2000 1000 1000
7  UK      UK2001       2001   NA   NA
8  UK      UK2002       2002   NA   NA
9  UK      UK2003       2003   NA   NA
10 UK      UK2004       2004   NA   NA
11 Germany UK2000       2000   NA   NA
12 Germany UK2001       2001   NA   NA
13 Germany UK2002       2002   NA   NA
14 Germany UK2003       2003   NA   NA
15 Germany UK2004       2004   NA   NA
Any suggestions?
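One direction that might work is to keep every group and put the condition inside the function that is applied to each column, so that groups with too few values are simply returned unchanged. The sketch below is only an illustration under a few assumptions: it rebuilds the example data without the country-year column, it uses zoo's `na.approx(x, na.rm = FALSE)` in place of `na.fill()` to interpolate interior gaps while leaving leading and trailing NAs alone, it relies on `across()` from dplyr 1.0 or later, and `interp_if_enough` is a hypothetical helper name.

```r
library(dplyr)
library(zoo)  # assumption: na.approx() is swapped in for na.fill()

# Rebuild the example data (country-year is left out for brevity)
df <- data.frame(
  country = rep(c("France", "UK", "Germany"), each = 5),
  year    = rep(2000:2004, times = 3),
  a       = c(NA, 1000, NA, 1600, NA, 1000, rep(NA, 9)),
  b       = c(NA, 1000, NA, 2200, NA, 1000, rep(NA, 9))
)

# Hypothetical helper: interpolate interior NAs only when the vector
# holds at least two non-missing values; otherwise return it unchanged
interp_if_enough <- function(x) {
  if (sum(!is.na(x)) > 1) na.approx(x, na.rm = FALSE) else x
}

result <- df %>%
  group_by(country) %>%                     # the helper sees one country at a time
  mutate(across(a:b, interp_if_enough)) %>%
  ungroup()
```

Because the check counts non-missing values per column rather than rows per group (which is what `n()` would count), the UK group with a single value and the all-NA Germany group pass through untouched, and no rows are dropped.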