I split some data by a factor like this:

 a <- factor(data$fact)
 b <- split(data,a)

Now I would like to add some of the resulting groups together, e.g.

tot <- b$A+b$B

but I get the following error:

"sum" not meaningful for factors

Any help would be great.

aaa val1 val2 ...
aaa
bbb
bbb
ccc
ccc

If I split by this factor I get three groups. But I want, for example, aaa and ccc to be treated as one group. This means that the values in the other columns should be summed.

Thanks

tonytonov
  • 5
    A factor indicates that you are working with categorical values. It makes no sense to add (sum) categorical variables. Can you describe (in natural language, not code) what it is you are trying to do? – Andrie Jul 10 '12 at 11:01
  • I simply split the data by factor, but I need to sum some of the groups together. Nothing else. –  Jul 10 '12 at 11:06
  • 2
    OK, I see. Now you have one list for `apples` and one list for `pears`. You still can't sum these together. Nothing else. – Andrie Jul 10 '12 at 11:11
  • OK, so if I have a list of apples, pears, bananas and strawberries, and at the end I want (apples+pears), bananas and strawberries, I can't do it. Clear. –  Jul 10 '12 at 11:21
  • What do you mean by (apples+pears)? Do you want to have one factor instead of these two? Or maybe your factors are numbers? Give us an example of `b$A`, `b$B` and `tot` value that you wish to get. – Julius Vainora Jul 10 '12 at 11:33
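
The point made in the comments can be reproduced with a minimal sketch. The factor `f` below is made up for illustration and is not the asker's data: calling `sum()` on a factor stops with an error, while `+` on factors only warns and returns `NA`.

```r
# Hypothetical factor, standing in for the grouping column
f <- factor(c("aaa", "bbb", "ccc"))

# sum() on a factor is an error; capture the message instead of stopping
msg <- tryCatch(sum(f), error = function(e) conditionMessage(e))
print(msg)

# Arithmetic on factors is not an error, but it yields NA (with a warning)
res <- suppressWarnings(f + f)
print(res)
```

So the factor column has to be left out of any arithmetic; only the numeric columns can be summed.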

2 Answers

Create a new factor variable before splitting:

# Make up some data
df = data.frame(Cases = sample(LETTERS[1:5], 10, replace=TRUE),
                Set1 = 1:10, Set2 = 11:20)
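# Duplicate your cases column as a factor
# (since R 4.0.0, data.frame() keeps strings as character by default)
df$Cases_2 = factor(df$Cases)
# Create a new set of factor levels: "A" and "B" both become "AB",
# and the duplicated label is merged into a single level
levels(df$Cases_2) <- ifelse(levels(df$Cases_2) %in% c("A","B"), 
                             "AB", levels(df$Cases_2))
# Drop both grouping columns (1 and 4) and split the rest by the new factor
temp = split(df[-c(1, 4)], df$Cases_2)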
temp
# $AB
#   Set1 Set2
# 3    3   13
# 5    5   15
# 6    6   16
# 8    8   18
# 
# $C
#   Set1 Set2
# 4    4   14
# 9    9   19
# 
# $D
#    Set1 Set2
# 2     2   12
# 7     7   17
# 10   10   20
# 
# $E
#   Set1 Set2
# 1    1   11

Then use lapply to calculate colSums:

lapply(temp, colSums)
# $AB
# Set1 Set2 
#   22   62 
# 
# $C
# Set1 Set2 
#   13   33 
# 
# $D
# Set1 Set2 
#   19   49 
# 
# $E
# Set1 Set2 
#    1   11
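
As a rough sketch of the same idea applied to data shaped like the question's table, the merged factor can also be passed straight to `aggregate()`, which skips the intermediate `split()`. The data frame `dat`, its values, and the label `"aaa+ccc"` are assumptions made up for illustration.

```r
# Hypothetical data shaped like the table in the question
dat <- data.frame(fact = c("aaa", "aaa", "bbb", "bbb", "ccc", "ccc"),
                  val1 = 1:6, val2 = 11:16)

# Build the grouping factor and merge the levels "aaa" and "ccc"
grp <- factor(dat$fact)
levels(grp)[levels(grp) %in% c("aaa", "ccc")] <- "aaa+ccc"

# Sum the numeric columns within each (merged) group
totals <- aggregate(dat[c("val1", "val2")], by = list(group = grp), FUN = sum)
print(totals)
```

This returns one data frame with a row per group rather than a list, which may be easier to work with afterwards.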
A5C1D2H2I1M1N2O1R2T1

You probably want to combine the resulting data frames using rbind:

tot <- rbind(b$A, b$B)
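
If the goal is the column totals for the combined groups, one possible follow-up is to sum only the numeric columns of the stacked data frame. This is a sketch on made-up data; `dat`, `num_cols` and `sums` are illustrative names, not from the question.

```r
# Hypothetical data with a factor column and two numeric columns
dat <- data.frame(fact = factor(c("A", "A", "B", "B", "C")),
                  val1 = 1:5, val2 = 6:10)
b <- split(dat, dat$fact)

# Stack the rows of groups A and B
tot <- rbind(b$A, b$B)

# Keep only numeric columns, then sum each of them
num_cols <- vapply(tot, is.numeric, logical(1))
sums <- colSums(tot[num_cols])
print(sums)
```

Filtering with `is.numeric` keeps the factor column out of `colSums()`, which would otherwise fail on non-numeric input.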
MvG