The issue of dropping unused factor levels when subsetting has come up before. Common solutions include using character vectors where possible by declaring
options(stringsAsFactors = FALSE)
Sometimes, though, ordered factors are necessary for plotting, in which case we can use convenience functions like `droplevels` to create a wrapper for `subset`:
subsetDrop <- function(...){droplevels(subset(...))}
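To make the behavior concrete, here is a small sketch of what the wrapper does. The data frame `df` and its columns are made up purely for illustration.

```r
# Hypothetical toy data, invented for this example
df <- data.frame(
  size  = factor(c("small", "medium", "large", "small"),
                 levels = c("small", "medium", "large"), ordered = TRUE),
  value = c(1, 2, 3, 4)
)

# Same wrapper as above, repeated so the example runs on its own
subsetDrop <- function(...) droplevels(subset(...))

# Plain subset(): the unused level "large" is still listed
plain <- subset(df, size != "large")
levels(plain$size)

# Wrapper: the unused level is dropped and the factor stays ordered
small_med <- subsetDrop(df, size != "large")
levels(small_med$size)
is.ordered(small_med$size)
```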
I realize that `subsetDrop` mostly solves this problem, but there are some situations where subsetting via `[` is more convenient (and less typing!).
My question is how much further, for the sake of convenience, we can push this toward being the 'default' behavior of R by overriding `[` for data frames so that it automatically drops unused factor levels. For instance, the Hmisc package contains `dropUnusedLevels`, which overrides `[.factor` for subsetting a single factor (this is no longer necessary, since the default `[.factor` has a `drop` argument for dropping unused levels). I'm looking for a similar solution that would let me subset data frames using `[` while automatically dropping unused factor levels (and, of course, preserving order in the case of ordered factors).
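To show the gap I mean, here is a short sketch contrasting a single factor with a data frame. The toy data are invented for illustration, and the last step only shows the result I am after, not a proposed implementation.

```r
# Hypothetical toy data, invented for this example
df <- data.frame(
  size  = factor(c("small", "medium", "large", "small"),
                 levels = c("small", "medium", "large"), ordered = TRUE),
  value = c(1, 2, 3, 4)
)

# Single factor: `[.factor` accepts drop = TRUE
f <- df$size
kept    <- f[f != "large"]               # unused level "large" is retained
dropped <- f[f != "large", drop = TRUE]  # unused level is removed
levels(kept)
levels(dropped)
is.ordered(dropped)                      # ordering is preserved

# Data frame: `[` keeps the unused level in the factor column
sub <- df[df$size != "large", ]
levels(sub$size)

# This is the result I would like `[` to return directly
wanted <- droplevels(sub)
levels(wanted$size)
```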