I have 32 data frames. For each of them I need a new data frame in which some columns are the sum of columns of the original data frame.

Let me write an example with 2 dataframes to be more clear:

df1 <- data.frame(1:5,2:6,3:7, 4:8)
colnames(df1) <- c("one", "two", "three", "four")
df2 <- data.frame(4:8, 5:9, 6:10, 7:11)
colnames(df2) <- c("one", "two", "three", "four")

What I would like to obtain is a data frame df1a in which column 4 of df1 is placed first, followed by the sum of columns 1 and 3 of df1, and then column 2, unchanged.

I know I can write this code:

df1a <- data.frame(df1$four, df1$one+df1$three, df1$two )
colnames(df1a) <- c("four", "1+3", "two")

But this seems very long to write for every data frame, since my real data consists of 32 data frames with 20 columns each.

I put them in a list:

listdf <- list(df1, df2) 

I think I need a loop or one of the apply functions, but I can't figure out how.

An example of what I would like to obtain from df1 to df1a:

df1
  one two three four
1   1   2     3    4
2   2   3     4    5
3   3   4     5    6
4   4   5     6    7
5   5   6     7    8

df1a <- data.frame(df1$four, df1$one+df1$three, df1$two )
colnames(df1a) <- c("four", "1+3", "two")
df1a
  four 1+3 two
1    4   4   2
2    5   6   3
3    6   8   4
4    7  10   5
5    8  12   6
Francesco
    See gregor's answer in [this post](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames) on working with lists of data.frames. My answer there gives a nice shortcut for retrieving a named list of data.frames. – lmo Sep 21 '16 at 11:24
    @RonakShah, added an example – Francesco Sep 21 '16 at 11:41

1 Answer

See comments in the code. In essence, you write a function that performs the operation on a single data.frame, then use lapply or sapply to apply it to each data.frame. Since you put these data.frames into a list, lapply or sapply is very convenient.

df1 <- data.frame(1:5,2:6,3:7, 4:8)
colnames(df1) <- c("one", "two", "three", "four")
df2 <- data.frame(4:8, 5:9, 6:10, 7:11)
colnames(df2) <- c("one", "two", "three", "four")

# Create a function which holds commands to be used on a single data.frame
operationsPerDF <- function(x) {
  data.frame(four = x$four, onepthree = x$one + x$three, two = x$two)
}

# You can manually gather data.frames into a list.
lapply(list(df1, df2), FUN = operationsPerDF)

# Or find data.frames by a name pattern, collect them into a named list...
# (the anchored pattern matches only df1, df2, ..., not listdf or list.dfs)
list.dfs <- sapply(ls(pattern = "^df[0-9]+$"), get, simplify = FALSE)

# ... and perform the above operation, one data.frame at a time
lapply(list.dfs, FUN = operationsPerDF)

$df1
  four onepthree two
1    4         4   2
2    5         6   3
3    6         8   4
4    7        10   5
5    8        12   6

$df2
  four onepthree two
1    7        10   5
2    8        12   6
3    9        14   7
4   10        16   8
5   11        18   9
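
With 20 columns per data.frame, listing every column inside data.frame() can get long. One possible variation, sketched below, changes the data.frame in place and names only the columns that are affected. The function name sumAndReorder is hypothetical, and the sketch assumes every data.frame has columns called one, three and four, as in the example.

```r
df1 <- data.frame(one = 1:5, two = 2:6, three = 3:7, four = 4:8)
df2 <- data.frame(one = 4:8, two = 5:9, three = 6:10, four = 7:11)

# Hypothetical helper: only the columns that change are mentioned by name
sumAndReorder <- function(x) {
  x$one <- x$one + x$three   # replace "one" with one + three
  x$three <- NULL            # drop the column that was added in
  # put "four" first, keep all remaining columns in their original order
  x[c("four", setdiff(names(x), "four"))]
}

# Naming the list elements keeps the names on the result
result <- lapply(list(df1 = df1, df2 = df2), sumAndReorder)
names(result$df1)
```

For the example data this leaves the columns four, one and two, where one now holds the sum. Any further columns in the real data would simply be carried along after four.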
Roman Luštrik