I have a sparse matrix structured similarly to this one, but much larger:
library(Matrix)
dfmtest<-new("dgCMatrix"
, i = c(0L, 1L, 2L, 4L, 5L, 6L, 8L, 0L, 1L, 2L, 3L, 4L, 6L, 7L, 8L,
0L, 2L, 3L, 6L, 7L, 8L, 1L, 2L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L,
3L, 4L, 5L, 6L, 7L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L,
9L, 0L, 1L, 3L, 4L, 6L, 7L, 8L, 9L, 0L, 2L, 3L, 5L, 6L, 7L, 9L,
0L, 1L, 2L, 3L, 4L, 5L, 6L, 8L, 9L, 0L, 1L, 2L, 3L, 4L, 5L, 6L,
7L, 9L)
, p = c(0L, 7L, 15L, 21L, 29L, 38L, 48L, 56L, 63L, 72L, 81L)
, Dim = c(10L, 10L)
, Dimnames = list(NULL, NULL)
, x = c(4, 3, 1, 2, 3, 1, 2, 1, 3, 3, 2, 3, 3, 3, 4, 2, 1, 2, 3, 2,
1, 4, 1, 2, 2, 3, 2, 3, 4, 1, 4, 1, 3, 4, 3, 2, 2, 2, 4, 1, 2,
2, 1, 2, 3, 1, 1, 1, 4, 1, 1, 2, 1, 1, 1, 4, 3, 3, 2, 1, 2, 2,
1, 1, 3, 3, 4, 1, 2, 4, 2, 4, 1, 2, 2, 3, 4, 2, 1, 2, 4)
, factors = list()
)
I would like to find the mean of each column (and eventually each row), excluding the 0 values. When I attempt to do this manually, I run into memory issues because of the size of my sparse matrix.
# Mean of the non-zero entries of a vector
nzmean <- function(x) {
  mean(x[x != 0])
}
dfmmeans <- apply(dfmtest, 2, nzmean)
#        1        2        3        4        5        6        7        8
# 2.285714 2.750000 1.833333 2.625000 2.444444 1.800000 1.875000 2.000000
#        9       10
# 2.666667 2.333333
When I run the above on my actual matrix I get the following error:
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
I have also looked into the colMeans function, but it appears to include all of the 0 values in the calculation:
dfmmeans <- colMeans(dfmtest)
#[1] 1.6 2.2 1.1 2.1 2.2 1.8 1.5 1.4 2.4 2.1
Is there a good way to do this on a large sparse matrix?
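One possible direction, offered as a minimal sketch rather than a solution verified on the full-size data: the non-zero mean of a column is the column sum divided by the number of non-zero entries in that column, and both can be computed without converting the matrix to dense form. The matrix `m` below is a small hypothetical stand-in, and the names `nz_col_means` and `nz_row_means` are only illustrative. It assumes that `m != 0` stays sparse, which should hold for a dgCMatrix because the zero entries compare to FALSE.

```r
library(Matrix)

# Small hypothetical stand-in for the real (much larger) sparse matrix.
m <- sparseMatrix(i = c(1, 2, 1, 3),
                  j = c(1, 1, 2, 3),
                  x = c(4, 2, 3, 5),
                  dims = c(3, 3))

# Column sums divided by the number of non-zero entries per column.
# (m != 0) is a sparse logical matrix, so no dense copy is created.
nz_col_means <- colSums(m) / colSums(m != 0)

# The same idea applied to rows.
nz_row_means <- rowSums(m) / rowSums(m != 0)
```

A column or row with no non-zero entries gives 0/0, which is NaN, the same result that mean() returns for an empty vector.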