0

I want to create a column in my data frame that is the sum of two other columns of the data frame.

df <- data.frame(x = 1:10, y = 11:20)

For example, here I want to add a third column z containing:

z <- c(12, 14, 16,..., 30)

Thanks in advance.

Software Engineer
Benmoshe

5 Answers

7

The function rowSums will do the trick:

df$z <- rowSums(df)

The result:

    x  y  z
1   1 11 12
2   2 12 14
3   3 13 16
4   4 14 18
5   5 15 20
6   6 16 22
7   7 17 24
8   8 18 26
9   9 19 28
10 10 20 30
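Note that rowSums(df) adds up every column of the data frame, which works here only because x and y are the only columns. A minimal sketch of restricting the sum to named columns follows; the extra `label` column is hypothetical and added purely for illustration:

```r
# Hypothetical data frame with an extra non-numeric column (not in the question)
df <- data.frame(x = 1:10, y = 11:20, label = letters[1:10])

# rowSums(df) would fail here because `label` is not numeric,
# so select the columns to be summed explicitly
df$z <- rowSums(df[, c("x", "y")])

df$z
```

Selecting the columns by name also keeps the result stable if more numeric columns are added to the data frame later.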
Sven Hohenstein
6

Arithmetic in R is vectorized. That's a hugely important concept you should read up on. Columns in data frames are vectors, so your solution is simply:

df$z <- df$x + df$y
df$z
## [1] 12 14 16 18 20 22 24 26 28 30

This is the same as if x and y were standalone vectors:

x <- 1:10
y <- 11:20
x + y
## [1] 12 14 16 18 20 22 24 26 28 30
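To make "vectorized" concrete, here is a small sketch comparing an explicit element-by-element loop with the single vectorized expression. The names `z_loop` and `z_vec` are made up for this illustration:

```r
x <- 1:10
y <- 11:20

# Explicit loop: add the elements one pair at a time
z_loop <- integer(length(x))
for (i in seq_along(x)) {
  z_loop[i] <- x[i] + y[i]
}

# Vectorized form: `+` is applied to all pairs in one call
z_vec <- x + y

# Both approaches give the same vector
identical(z_loop, z_vec)
## [1] TRUE
```

The vectorized form is shorter and is the usual way to write this in R.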
hrbrmstr
2
df <- data.frame(x = 1:10, y = 11:20)
df$z <- df$x + df$y  # the sum is already a vector, so wrapping it in c() is unnecessary
df

    x  y  z
1   1 11 12
2   2 12 14
3   3 13 16
4   4 14 18
5   5 15 20
6   6 16 22
7   7 17 24
8   8 18 26
9   9 19 28
10 10 20 30
Olli J
2

Using data.table :

> library(data.table)
> setDT(df)[, z := x + y]
> df
     x  y  z
 1:  1 11 12
 2:  2 12 14
 3:  3 13 16
 4:  4 14 18
 5:  5 15 20
 6:  6 16 22
 7:  7 17 24
 8:  8 18 26
 9:  9 19 28
10: 10 20 30
rnso
1

Using dplyr:

library(dplyr)
df %>% group_by(x) %>% mutate(z = sum(x+y))

Two other options, which I learnt from an answer to "Sum across multiple columns with dplyr", avoid both the grouping and writing out the column names:

df %>% mutate(z = Reduce(`+`, .))
df %>% mutate(z = rowSums(.))

Output:

Source: local data frame [10 x 3]
Groups: x

    x  y  z
1   1 11 12
2   2 12 14
3   3 13 16
4   4 14 18
5   5 15 20
6   6 16 22
7   7 17 24
8   8 18 26
9   9 19 28
10 10 20 30
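The grouped version gives the expected result only because every value of x is unique, so each group holds exactly one row. Since arithmetic inside mutate() is vectorized, a plain `x + y` should be enough. The sketch below assumes dplyr is installed; the data frame `dup` with a repeated x value is hypothetical and only there to show where grouping goes wrong:

```r
library(dplyr)

df <- data.frame(x = 1:10, y = 11:20)

# Row-wise sum without grouping: `+` works element by element
res <- df %>% mutate(z = x + y)

# Hypothetical data with a repeated x value (not from the question)
dup <- data.frame(x = c(1, 1, 2), y = c(10, 20, 40))

# Grouping by x sums over all rows that share the same x
grouped <- dup %>% group_by(x) %>% mutate(z = sum(x + y))
grouped$z
## [1] 32 32 42

# The ungrouped vectorized form keeps one sum per row
plain <- dup %>% mutate(z = x + y)
plain$z
## [1] 11 21 42
```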
mpalanco
  • Why do you need to group by `x`? – Rich Scriven Jul 16 '15 at 16:42
  • @Richard Scriven If I don't group by x, as in `df %>% mutate(z = sum(x+y))`, the z column will be 210 in every row (the sum of all of df$x+df$y). I could use `df %>% mutate(z = rowSums(.))` or ``df %>% mutate(z = Reduce(`+`, .))`` to avoid grouping. I'll add them to my answer. – mpalanco Jul 16 '15 at 17:00