I have a matrix and want to keep only those rows in which at least one member is 5 or more. In other words, rows whose members are all less than 5 should be filtered out.

For example:

2 4 6 2 1
1 2 3 1 2
5 4 7 2 1

In this matrix, the second row should be filtered out because all of its members are less than 5.

Here is what I wrote:

for(i in 1:length(matrix[,1]){
for(j in 2:17){
if(any(matrix[i,j]>=5)){matrix=matrix} else {matrix=matrix[-i,]}
}}

But it doesn't work.

What do you think I can do?

Pang
Fate

1 Answer

Adapting some of the suggestions in this...

1) Identify which rows should be eliminated:

# read.table() returns a data frame; as.matrix() gives the matrix printed below
a <- as.matrix(read.table(text = "2 4 6 2 1
                       1 2 3 1 2
                       5 4 7 2 1"))

a
     V1 V2 V3 V4 V5
[1,]  2  4  6  2  1
[2,]  1  2  3  1  2
[3,]  5  4  7  2  1

# TRUE for rows in which every value is below 5; works for any number of rows
bye <- sapply(seq_len(nrow(a)), function(x) all(a[x, ] < 5))
bye
[1] FALSE  TRUE FALSE

2) Use that to subset the matrix:

a2 <- a[!bye, , drop = FALSE]  # drop = FALSE keeps a matrix even if one row remains
a2
     V1 V2 V3 V4 V5
[1,]  2  4  6  2  1
[2,]  5  4  7  2  1
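
For larger matrices, the row-by-row `sapply` call can probably be replaced with a single vectorised comparison. A minimal sketch, assuming `a` is a purely numeric matrix with no missing values and the cutoff is 5 (the name `keep` is only illustrative):

```r
# Same example data as above, stored as a numeric matrix
a <- matrix(c(2, 4, 6, 2, 1,
              1, 2, 3, 1, 2,
              5, 4, 7, 2, 1),
            nrow = 3, byrow = TRUE)

# a >= 5 is a logical matrix; rowSums() counts the TRUE values in each row
keep <- rowSums(a >= 5) > 0

# drop = FALSE keeps the result a matrix even if only one row survives
a2 <- a[keep, , drop = FALSE]
a2
```

Because the comparison and `rowSums()` both operate on the whole matrix at once, there is no explicit loop over rows and no need to hardcode the number of rows.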
Community
paqmo
  • That sounds ok but when you have thousands of rows, this approach can be a bit difficult... What do you recommend in this case? – Fate Oct 14 '16 at 09:39
  • I tried it with a 5000x5000 matrix and it worked fine. A bit slow. Not sure what you mean by 'a bit difficult.' Maybe someone else has an idea for a more efficient solution. You could encapsulate the above in a function `filter – paqmo Oct 14 '16 at 13:25
  • One more thought! You could use the `filter()` function from the `dplyr` package. The command looks like this: `a %>% filter(rowSums(. >= n) > 0)`, where `n` is the threshold you want to filter on. Just make sure that `a` is a data frame. – paqmo Oct 15 '16 at 21:00
  • Thanks a lot... Your code works well... The problem is that my first column is character and the rest are numbers... So when I apply your code to my matrix, it can't handle the first column and the results are not accurate... I have to keep the first number because it's the name of each subject, which is important in further analysis... Could you please tell me what I can do? – Fate Oct 16 '16 at 13:04
  • I have to keep the first column* – Fate Oct 16 '16 at 13:04
  • A few things you can do. (1) Turn the first column into row names using `rownames(a) – paqmo Oct 16 '16 at 16:31
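
One possible way to handle the character column raised in the comments is sketched below. It assumes the data is held in a data frame (a matrix with a character column stores every value as character, so numeric comparisons on it are unreliable) and that the first column holds the subject names. The data, the column name `id`, and the object names `df`, `keep` and `df2` are all hypothetical, and this is not necessarily what the truncated comment above goes on to suggest.

```r
# Hypothetical data: first column holds subject names, the rest are numeric
df <- data.frame(id = c("s1", "s2", "s3"),
                 V1 = c(2, 1, 5),
                 V2 = c(4, 2, 4),
                 V3 = c(6, 3, 7),
                 stringsAsFactors = FALSE)

# Compare only the numeric columns (everything except the first)
keep <- rowSums(df[, -1] >= 5) > 0

# Subset the whole data frame so the name column is retained
df2 <- df[keep, ]
df2
```

The threshold test only ever sees the numeric columns, while the subsetting step is applied to the full data frame, so each surviving row keeps its subject name.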