I am trying to split a large data frame into smaller data frames based on a number of conditions in R. I would like each of these data frames to be named according to the variables on which they were split, but as there are about 1660 new "sub" data frames, I cannot do this manually.
An example of the whole data frame:
ID  LENGTH  GRADE  CODE  DURATION      STATUS
1   1       A1     ABC   Less than 10  Y
2   2       A1     ABC   More than 10  Y
3   1       A1     DEF   Less than 10  Y
4   2       A2     ABC   Less than 10  Y
5   1       B1     ABC   More than 10  Y
6   3       B2     DEF   Less than 10  Y
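For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the six example rows as a data frame. Only the six columns shown above are included; the real `DF` has 65 columns and more grouping variables, so this is a toy stand-in rather than my actual data.

```r
# Toy version of DF, built from the six example rows shown above.
# The real data has over 900,000 rows and 65 columns.
DF <- data.frame(
  ID       = 1:6,
  LENGTH   = c(1, 2, 1, 2, 1, 3),
  GRADE    = c("A1", "A1", "A1", "A2", "B1", "B2"),
  CODE     = c("ABC", "ABC", "DEF", "ABC", "ABC", "DEF"),
  DURATION = c("Less than 10", "More than 10", "Less than 10",
               "Less than 10", "More than 10", "Less than 10"),
  STATUS   = "Y",  # a single value is recycled to all six rows
  stringsAsFactors = FALSE
)

str(DF)
```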
There are over 900,000 entries to be split by 7 variables into about 1660 non-empty groups. I found this number by creating a grouped data frame and counting its groups:
> Grouped_DF <- DF %>% group_by(LENGTH,GRADE,CODE,DURATION,STATUS,...)
> n_groups(Grouped_DF)
[1] 1660
which consists of the groups I desire, but now I want to create a new dataframe for each of these groups, with all of the entries that fall into each group. I have tried using the split function:
SplitGroups <- split(DF, with(DF, interaction(LENGTH,GRADE,CODE,DURATION,STATUS,...)))
This generates the following list:
> class(SplitGroups)
[1] "list"
> length(SplitGroups)
[1] 24480
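My guess is that the gap between 24480 and 1660 comes from `interaction` keeping every combination of factor levels, including combinations that have no rows. Below is a minimal sketch on toy data (the six example rows from above, not my real `DF`) that compares the default behaviour of `split` with `drop = TRUE`, which discards unused combinations. The names `all_combos` and `used_only` are just illustrative.

```r
# Toy data: the grouping columns from the six example rows
DF <- data.frame(
  ID       = 1:6,
  LENGTH   = c(1, 2, 1, 2, 1, 3),
  GRADE    = c("A1", "A1", "A1", "A2", "B1", "B2"),
  CODE     = c("ABC", "ABC", "DEF", "ABC", "ABC", "DEF"),
  DURATION = c("Less than 10", "More than 10", "Less than 10",
               "Less than 10", "More than 10", "Less than 10"),
  STATUS   = "Y",
  stringsAsFactors = FALSE
)

# One factor level per combination of the grouping variables
keys <- with(DF, interaction(LENGTH, GRADE, CODE, DURATION, STATUS))

# Default: one list element per possible combination, empty or not
all_combos <- split(DF, keys)

# drop = TRUE: only combinations that actually occur in the data
used_only <- split(DF, keys, drop = TRUE)

length(all_combos)
length(used_only)
```

On the toy data the first list has one element for every possible combination of levels, while the second has one element per combination that actually occurs.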
An example of the output:
> SplitGroups
$`1.A1.ABC.Less Than 10.N`
# A tibble: 10 x 65
# Groups: ID [10]
# ... with 65 variables:
Now I want to take the non-empty data frames, name each one after its group, for example '1.A1.ABC.Less Than 10.N' (or something similar), and store them in the global environment.
I am aware this could be done using subset, for example:
`1.A1.ABC.LessThan10.N` <- subset(DF, LENGTH==1 & GRADE=="A1" & CODE=="ABC" & .....)
and so on, but this is not practical for the number of subsets needed.
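To make the goal concrete, here is a rough sketch of the end result I am after, again on toy data. It assumes that `make.names` is an acceptable way to turn the group labels into valid R names and that `list2env` is a reasonable way to create one object per list element. The sketch writes into a scratch environment called `target` (a name I made up for illustration); passing `envir = .GlobalEnv` instead would put the objects in the global environment.

```r
# Toy data: the six example rows
DF <- data.frame(
  ID       = 1:6,
  LENGTH   = c(1, 2, 1, 2, 1, 3),
  GRADE    = c("A1", "A1", "A1", "A2", "B1", "B2"),
  CODE     = c("ABC", "ABC", "DEF", "ABC", "ABC", "DEF"),
  DURATION = c("Less than 10", "More than 10", "Less than 10",
               "Less than 10", "More than 10", "Less than 10"),
  STATUS   = "Y",
  stringsAsFactors = FALSE
)

# Split into non-empty groups only
groups <- split(
  DF,
  with(DF, interaction(LENGTH, GRADE, CODE, DURATION, STATUS)),
  drop = TRUE
)

# Labels such as "1.A1.ABC.Less than 10.Y" are not syntactic R names,
# so convert them (a leading digit gets an "X" prefix, spaces become dots)
names(groups) <- make.names(names(groups))

# Create one object per group in a scratch environment;
# use envir = .GlobalEnv to create them in the global environment instead
target <- new.env()
list2env(groups, envir = target)

ls(target)
```

Each object can then be retrieved by name, for example with `get("X1.A1.ABC.Less.than.10.Y", envir = target)`.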
Any help would be appreciated, thanks.