
I remember a comment on r-help in 2001 saying that drop = TRUE in [.data.frame was the worst design decision in R history.

dplyr corrects that and does not drop implicitly. When converting old code to dplyr style, this introduces some nasty bugs wherever d[, 1] is assumed to be a vector (d[1] returns a data frame in base R as well, so only the d[, 1] form changes behavior).

My current workaround uses unlist, as shown below, to turn the one-column result into a vector. Any better ideas?

library(dplyr)

d2 = data.frame(x = 1:5, y = (1:5) ^ 2)
str(d2[,1]) # implicit drop = TRUE
# int [1:5] 1 2 3 4 5

str(d2[,1, drop = FALSE])
# 'data.frame':  5 obs. of  1 variable:
#  $ x: int  1 2 3 4 5

# With dplyr functions
d1 = data_frame(x = 1:5, y = x ^ 2)  # data_frame() is deprecated in later dplyr versions; tibble() replaces it
str(d1[,1])
# Classes ‘tbl_df’ and 'data.frame':    5 obs. of  1 variable:
#  $ x: int  1 2 3 4 5

str(unlist(d1[,1]))
# The ugly construct on the next line gives the same result as str(d2[,1])
str(d1[,1][[1]])
josliber
Dieter Menne
  • Why not just use `d1[[1]]`? – shadow Jun 11 '15 at 13:46
  • Works too. Got lost in [[]] space. Please post as answer as a reference. I have posted this summary because I could not find it on SO and in the docs. If someone found a caveat in the `dplyr` docs, please add link here. – Dieter Menne Jun 11 '15 at 13:52

1 Answer


You can just use the [[ extract function instead of [.

d1[[1]]
## [1] 1 2 3 4 5
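
As a minimal sketch of the difference between these extraction forms, the following uses only base R. The data frame `d` and the variable names are made up for illustration and mirror `d2` from the question. It also shows a side effect of the `unlist` workaround: the result carries names derived from the column name.

```r
# Base R only; `d` is a hypothetical example frame mirroring d2 from the question.
d <- data.frame(x = 1:5, y = (1:5)^2)

v_bracket2 <- d[[1]]        # list-style extraction: returns the column itself
v_dollar   <- d$x           # the same column, selected by name
v_unlist   <- unlist(d[1])  # flattens the one-column frame, but adds names

str(v_bracket2)
#  int [1:5] 1 2 3 4 5
names(v_unlist)
# [1] "x1" "x2" "x3" "x4" "x5"
is.data.frame(d[1])         # single-bracket, one index: still a data frame
# [1] TRUE
```

Because `[[` never depends on `drop`, it should behave the same way on a plain data.frame and on a tbl_df.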

If you use a lot of piping with dplyr, you may also want to use the convenience functions extract and extract2 from the magrittr package:

d1 %>% magrittr::extract(1) %>% str
## Classes ‘tbl_df’ and 'data.frame':  5 obs. of  1 variable:
##   $ x: int  1 2 3 4 5
d1 %>% magrittr::extract2(1) %>% str
##  int [1:5] 1 2 3 4 5

Or if extract is too verbose for you, you can just use [ directly in the pipe:

d1 %>% `[`(1) %>% str
## Classes ‘tbl_df’ and 'data.frame':  5 obs. of  1 variable:
##   $ x: int  1 2 3 4 5
d1 %>% `[[`(1) %>% str
##  int [1:5] 1 2 3 4 5
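
The backtick form works in a pipe because `d1[[1]]` is just operator syntax for the function call `` `[[`(d1, 1) ``, and the pipe supplies its left-hand side as the first argument of the function on its right. The sketch below illustrates this in base R only; `pipe_into` is a hypothetical stand-in written for this example, not part of magrittr or dplyr.

```r
# Base R only; `d` mirrors the example data and `pipe_into` is a made-up helper.
d <- data.frame(x = 1:5, y = (1:5)^2)

a <- d[[1]]        # operator syntax
b <- `[[`(d, 1)    # the same call written as an ordinary function call

identical(a, b)
# [1] TRUE

# Hypothetical stand-in for what a pipe does: pass lhs as the first argument.
pipe_into <- function(lhs, f, ...) f(lhs, ...)

piped <- pipe_into(d, `[[`, 1)
str(piped)
#  int [1:5] 1 2 3 4 5
```

Later dplyr releases also provide `pull()` for extracting a single column as a vector, which may read more naturally in a pipeline than the backtick form.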
shadow
  • `magrittr::extract(1)` has been mentioned in another SO thread. It's a bit verbose and not very convenient when you need several columns. – Dieter Menne Jun 11 '15 at 13:59
  • `[[`(1): nice demonstration, but one of the reasons why many people think R is second to Perl as a write-only language. – Dieter Menne Jun 11 '15 at 14:26
  • Where do `[`(1) and `[[`(1) come from, i.e. where can I find some documentation on this? And why would I have to put a named vector in quotes when I want to use extract2, i.e. `extract2("x")`? – dpprdan Aug 12 '16 at 11:49
  • @dapperdan: They are the regular extraction functions for data.frames. You can get help as usual with ?`[` or ?`[[`, or better with ?`[.data.frame` to get the data.frame-specific help. – shadow Aug 12 '16 at 11:50
  • Thanks, that reply was faster than me fumbling with the formatting of my comment. I am somewhat familiar with the extraction functions, as in df["x"] and df[["x"]], (or rather am just learning this in the context of Hadley's world). What I cannot find in the docs is the use of `[[`(1) vs [[1]]. I.e. why one cannot use the latter in a pipe as opposed to the former, for example – dpprdan Aug 12 '16 at 12:21