I am analyzing a set of data with many columns (almost 30 columns). I want to group data based on two columns and apply sum and mean functions to all the columns except timestamp. How would I use summarise_each on all columns except timestamp?

This is the draft code I have, but it is obviously not correct. It also generates an error because it cannot apply sum to the POSIXt data type (Error: 'sum' not defined for "POSIXt" objects).

features <- dataset %>% 
  group_by(X, Y) %>% 
  summarise_each(funs(mean,sum)) %>%
  arrange(TIMESTAMP)
Uwe
Behrad3d

1 Answer

Try summarise_each(funs(mean,sum), -TIMESTAMP) to exclude TIMESTAMP from the summarisation.

Alex Ioannides
  • Why does this not work for the current function `summarise_all`? – HNSKD Jun 02 '18 at 10:55
  • Try -c(TIMESTAMP) @HNSKD – Union find Jun 06 '18 at 15:57
  • Unfortunately, I cannot add another answer. I think this question was closed for a bad reason; the answer you're looking for is not on the referenced page. Anyway, for the new `dplyr` (>= 0.8.0) you need to use `summarise_at(vars(-TIMESTAMP), mean)` to summarise on all but the TIMESTAMP variable. – MS Berends Dec 20 '19 at 08:54
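
For readers on a recent dplyr (>= 1.0.0), the same idea can be sketched with `across()`, which has superseded the scoped `summarise_*` verbs. This is a minimal sketch, not the asker's actual data: the toy `dataset` and its value columns `A` and `B` are hypothetical stand-ins, and only `X`, `Y`, and `TIMESTAMP` come from the question.

```r
library(dplyr)

# Hypothetical toy data standing in for the question's `dataset`;
# columns A and B are invented for illustration.
dataset <- tibble(
  X = c("a", "a", "b", "b"),
  Y = c("u", "u", "v", "v"),
  TIMESTAMP = as.POSIXct("2018-06-02 10:00:00", tz = "UTC") + 0:3,
  A = c(1, 2, 3, 4),
  B = c(10, 20, 30, 40)
)

features <- dataset %>%
  group_by(X, Y) %>%
  summarise(
    # Grouping columns are skipped automatically; -TIMESTAMP drops the
    # POSIXct column so sum() is never applied to it.
    across(-TIMESTAMP, list(mean = mean, sum = sum)),
    .groups = "drop"
  )

print(features)
```

The named list gives each result column a suffix, so `A` becomes `A_mean` and `A_sum`. Note that `TIMESTAMP` no longer exists after the summary, so the trailing `arrange(TIMESTAMP)` from the question has to be dropped or replaced, for example with `arrange(X, Y)`.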