
My data frame looks like this:

Stage  Var1   var2  Var1   var2
A        1      11    9     12
A        2      NA    3     13
A       NA      NA    2     10
B        4      14    1      4
B       NA      NA    4      2
B        6      16    6      8
B        7      17  100      9
C        8      NA    4      6
C        9      19   34     12
C       10      NA    5     18
C        1       0    6      3

I would like to split the data frame by group using ddply and apply mean() to each group. This then has to be repeated for every column, so I am trying something like this:

for(i in names(NewInput)){
NewInput[[i]] <- ddply(NewInput , "Model_Stage", function(x) {
mean.Cycle2 <- mean(x$NewInput[[i]])
})
}

The above code works fine without the for loop (i.e., ddply works fine with a single variable). However, when I run it over the columns in a for loop, I get several warnings:

In loop_apply(n, do.ply) : argument is not numeric or logical: returning NA

Question:

-> How can I run ddply over all the variables using a for loop?

-> Is it possible to use apply()?

Thank you.

-Chris


1 Answer


You can try

library(plyr)
ddply(df1, .(Stage), colwise(mean, na.rm=TRUE))

Other options include

library(dplyr)
df1 %>%
     group_by(Stage) %>%
     summarise_each(funs(mean=mean(., na.rm=TRUE)))
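As far as I know, summarise_each() and funs() are deprecated in recent dplyr releases, and across() (available from dplyr 1.0.0) is the suggested replacement. A minimal sketch of the same grouped mean, assuming a small toy data frame (the names toy, Var1 and Var2 are made up for illustration, and the values are only the first few rows of the question's data):

```r
library(dplyr)

# Hypothetical toy data, not the asker's real data frame
toy <- data.frame(
  Stage = c("A", "A", "A", "B", "B"),
  Var1  = c(1, 2, NA, 4, NA),
  Var2  = c(11, NA, NA, 14, NA)
)

# Mean of every non-grouping column per Stage, ignoring NA values
means_by_stage <- toy %>%
  group_by(Stage) %>%
  summarise(across(everything(), ~ mean(.x, na.rm = TRUE)))

print(means_by_stage)
```

Inside a grouped summarise(), everything() skips the grouping column, so Stage does not need to be excluded by hand.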

Or

library(data.table)
setDT(df1)[, lapply(.SD, mean, na.rm=TRUE), Stage]

Or using base R

aggregate(.~Stage, df1, FUN=mean, na.rm=TRUE, na.action=NULL)
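As for why the original loop warns: inside the function passed to ddply, x is the subset for one group, so x$NewInput is NULL (there is no column of that name) and mean(NULL) returns NA with the "argument is not numeric or logical" warning. The column would have to be picked with x[[i]] instead, and the result should go into a new object rather than back into NewInput[[i]]. If you really want an explicit loop, here is a rough base R sketch. The column names Var1 to Var4 and the object names value_cols and group_means are my assumptions, since the header in the question repeats names:

```r
# Reconstruction of the question's data; Var1..Var4 are assumed names
df1 <- data.frame(
  Stage = c("A", "A", "A", "B", "B", "B", "B", "C", "C", "C", "C"),
  Var1  = c(1, 2, NA, 4, NA, 6, 7, 8, 9, 10, 1),
  Var2  = c(11, NA, NA, 14, NA, 16, 17, NA, 19, NA, 0),
  Var3  = c(9, 3, 2, 1, 4, 6, 100, 4, 34, 5, 6),
  Var4  = c(12, 13, 10, 4, 2, 8, 9, 6, 12, 18, 3)
)

# Every column except the grouping column
value_cols <- setdiff(names(df1), "Stage")

# Collect results in a separate data frame, one row per group
group_means <- data.frame(Stage = sort(unique(df1$Stage)))

for (col in value_cols) {
  # df1[[col]] selects the column by name; tapply computes the mean per
  # Stage in sorted group order, which matches the rows of group_means
  group_means[[col]] <- as.vector(
    tapply(df1[[col]], df1$Stage, mean, na.rm = TRUE)
  )
}

print(group_means)
```

This should give the same group means as the one-liners above; the loop is only there to show where the original attempt went wrong.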
akrun