Aggregate ‘sum’ not meaningful for factors in R

Question

I need to aggregate the rows that share the same values in col2 and col3, so I expect to receive the sums of col4 and col5:

df <- data.frame("col1"="a", "col2"=c("mi", "se", "mi", "se", "ty"), 
                 "col3"=c("re", "my", "re", "my", "my"), "col4"=c(1, 2, 3, 4, 5), 
                 "col5"=c(1, 2, 3, 4, 5))
agg <- aggregate(df, by=list(df$col1, df$col2), FUN=sum)

The result is an error, though:

Error in Summary.factor(c(1L, 1L), na.rm = FALSE) : ‘sum’ not meaningful for factors

My expected output is

  col1 col2 col3 col4 col5
1    a   mi   re    4    4
2    a   se   my    6    6
3    a   ty   my    5    5

you are applying it incorrectly. Read `?aggregate`, maybe you need `aggregate(col5~col1 + col3, df, sum)` — Ronak Shah, Apr 04 '19 at 07:23
@RonakShah sorry, I mean "aggregate the same values in col2 and col3" — petrov_petrovich, Apr 04 '19 at 07:29
What are you expecting? Add the expected output based on your example because words difficult. — MrGumble, Apr 04 '19 at 07:31
@petrov_petrovich Please add expected outputs into the question. This time I've helped you. — jay.sf, Apr 04 '19 at 08:08
Possible duplicate https://stackoverflow.com/questions/9723208/aggregate-summarize-multiple-variables-per-group-e-g-sum-mean — Ronak Shah, Apr 04 '19 at 08:28

Haezer · Accepted Answer · 2019-04-04T08:23:29.413

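Using dplyr:

library(dplyr)  # provides %>%, group_by() and summarise()

# Group by the two key columns, then sum each numeric column per group
agg <- df %>% 
  group_by(col2, col3) %>% 
  summarise(col4 = sum(col4),
            col5 = sum(col5))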

#   col2  col3   col4  col5
#   <fct> <fct> <dbl> <dbl>
# 1 mi    re        4     4
# 2 se    my        6     6
# 3 ty    my        5     5

Is that what you are looking for?

edited Apr 04 '19 at 08:23

answered Apr 04 '19 at 07:38

jay.sf · Answer 2 · 2019-04-04T07:52:17.630

aggregate(df, ...) applies FUN to every column of df, and sum is not defined for factors. Exclude the factor columns by aggregating only on list(col4, col5).

with(df, aggregate(list(col4, col5), by=list(col1, col2, col3), sum))
#   Group.1 Group.2 Group.3 c.1..2..3..4..5. c.1..2..3..4..5..1
# 1       a      se      my                6                  6
# 2       a      ty      my                5                  5
# 3       a      mi      re                4                  4
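
The failure itself can be reproduced in isolation. The following is a minimal sketch, assuming the pre-4.0.0 behaviour of data.frame(), where strings become factors; stringsAsFactors = TRUE is set explicitly so the example behaves the same on newer R versions. The names is_fac, msg and res are only illustrative.

```r
# Rebuild the example data; stringsAsFactors = TRUE was the default before R 4.0.0
df <- data.frame(col1 = "a",
                 col2 = c("mi", "se", "mi", "se", "ty"),
                 col3 = c("re", "my", "re", "my", "my"),
                 col4 = c(1, 2, 3, 4, 5),
                 col5 = c(1, 2, 3, 4, 5),
                 stringsAsFactors = TRUE)

# Flag the factor columns: these are the ones sum() cannot handle
is_fac <- sapply(df, is.factor)
print(is_fac)

# Calling sum() on a factor raises the error; capture its message instead of stopping
msg <- tryCatch(sum(df$col2), error = function(e) conditionMessage(e))
print(msg)

# Aggregating only the non-factor columns avoids the problem
res <- aggregate(df[!is_fac], by = df[c("col2", "col3")], FUN = sum)
print(res)
```

Selecting the columns with is_fac instead of naming them is only a convenience for wider data frames; the grouping columns still have to be chosen by hand.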

We can get a somewhat nicer output if we name the lists.

with(df, aggregate(list(col4=col4, col5=col5), by=list(col1=col1, col2=col2, col3=col3), sum))
#   col1 col2 col3 col4 col5
# 1    a   se   my    6    6
# 2    a   ty   my    5    5
# 3    a   mi   re    4    4

As suggested by @Ronak Shah, we could also use the formula interface:

aggregate(cbind(col4, col5) ~ col1 + col2 + col3, df, sum)

The list method is slightly faster, though.
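
The speed difference is easy to check on your own machine. The following is a rough sketch using system.time() from base R; the repetition count n_rep and the result names are arbitrary choices for illustration, and the timings will vary by machine and R version, so no figures are quoted here.

```r
# Example data, with factors as under the pre-4.0.0 default
df <- data.frame(col1 = "a",
                 col2 = c("mi", "se", "mi", "se", "ty"),
                 col3 = c("re", "my", "re", "my", "my"),
                 col4 = c(1, 2, 3, 4, 5),
                 col5 = c(1, 2, 3, 4, 5),
                 stringsAsFactors = TRUE)

n_rep <- 200  # arbitrary number of repetitions to get a measurable time

# Time the list interface
t_list <- system.time(
  for (i in seq_len(n_rep)) {
    res_list <- with(df, aggregate(list(col4 = col4, col5 = col5),
                                   by = list(col1 = col1, col2 = col2, col3 = col3),
                                   sum))
  }
)

# Time the formula interface
t_form <- system.time(
  for (i in seq_len(n_rep)) {
    res_form <- aggregate(cbind(col4, col5) ~ col1 + col2 + col3, df, sum)
  }
)

# Compare elapsed seconds for the two approaches
print(c(list = t_list[["elapsed"]], formula = t_form[["elapsed"]]))
```

For a more careful comparison, a dedicated benchmarking package would give more stable numbers than a single system.time() run.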
