How to filter rows out of a data.table where any column is NA without specifying columns individually

Question

Given a data.table

library(data.table)
DT <- data.table(a = c(1, 2, NA, 4, 5), b = c(2, 3, 4, NA, 5), c = c(1, 2, 3, 4, 5), d = c(2, 3, 4, 5, 6))

how can I do the equivalent of

DT[!is.na(a) & !is.na(b) & !is.na(c) & !is.na(d)]

in a general form, without knowing any of the column names or typing out !is.na() for each individual column?

I could also do

DT[apply(DT, 1, function(x) !any(is.na(x)))]

but I'm wondering whether there is a better way.

Take a look here as well: http://stackoverflow.com/questions/4862178/remove-rows-with-nas-in-data-frame – David Arenburg Nov 17 '15 at 21:17

score 8 · Answer 1 · answered Nov 17 '15 at 20:42

I think you are looking for complete.cases:

> DT[complete.cases(DT),]
   a b c d
1: 1 2 1 2
2: 2 3 2 3
3: 5 5 5 6
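
As a minimal sketch of how this works, assuming the `DT` from the question: `complete.cases()` returns one logical value per row, and that vector is used as the row filter. The names `keep`, `clean` and `clean_ac` are illustrative only and do not come from the thread.

```r
library(data.table)

DT <- data.table(a = c(1, 2, NA, 4, 5), b = c(2, 3, 4, NA, 5),
                 c = c(1, 2, 3, 4, 5), d = c(2, 3, 4, 5, 6))

# complete.cases() gives one logical per row: TRUE when no column is NA
keep <- complete.cases(DT)

# Use that logical vector as the row filter
clean <- DT[keep]

# To check only some columns (here a and c), subset before testing
clean_ac <- DT[complete.cases(DT[, .(a, c)])]
```

Because the test is computed on whole columns rather than row by row, it avoids the per-row function calls of the `apply()` approach in the question.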

Spacedman

Or perhaps `na.omit(DT)` – talat Nov 17 '15 at 20:45
@docendodiscimus not sure if you suggested `na.omit` with this in mind but after further review it looks like there's a `na.omit` function within `data.table` as opposed to just the base function. In any event, it didn't even occur to me to use that function even though it is completely obvious in retrospect. – Dean MacGregor Nov 17 '15 at 21:17
@DeanMacGregor, when I posted the comment I had a feeling that `na.omit` is optimized for `data.table` but wasn't really sure (and didn't bother looking it up). – talat Nov 17 '15 at 21:20
Yes, `na.omit` has a `data.table` method. See `methods(na.omit)`. It also has a `cols` argument that lets you specify which columns to check for `NA`s. I'd guess that would be the perfect choice here. – David Arenburg Nov 17 '15 at 21:22

score 1 · Accepted Answer · answered Nov 24 '15 at 15:51

From the comments of @docendodiscimus:

data.table has an na.omit method that is optimized for data.tables, so na.omit(DT) removes every row containing an NA in any column.
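
A short sketch of that method, assuming the `DT` from the question. The `cols` and `invert` arguments belong to data.table's `na.omit` method; the result names (`no_na`, `no_na_a`, `only_na`) are hypothetical and used only for illustration.

```r
library(data.table)

DT <- data.table(a = c(1, 2, NA, 4, 5), b = c(2, 3, 4, NA, 5),
                 c = c(1, 2, 3, 4, 5), d = c(2, 3, 4, 5, 6))

# Dispatches to data.table's method; drops rows with an NA in any column
no_na <- na.omit(DT)

# cols restricts the NA check to the named columns
no_na_a <- na.omit(DT, cols = "a")

# invert = TRUE returns the rows that would otherwise be dropped
only_na <- na.omit(DT, invert = TRUE)
```

With no extra arguments this needs no column names at all, which is what the question asks for.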

Dean MacGregor

score -1 · Answer 3 · answered Nov 17 '15 at 20:46

You could do this:

DT[!is.na(rowSums(DT)),]

wolfste4

This only works if all columns are numeric/integer/logical, wouldn't work if any column was `character` or `factor`. – talat Nov 17 '15 at 21:13
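
One way around that limitation, offered as a variant that is not proposed in the thread, is to count missing values per row with `is.na()` first, so that `rowSums()` only ever sees a logical matrix. The table `DT2` and the names `n_missing` and `clean2` are made up for this illustration.

```r
library(data.table)

# Hypothetical table with a character column
DT2 <- data.table(id = c("x", NA, "z"), value = c(1, 2, NA))

# rowSums(DT2) would raise an error because id is not numeric.
# is.na() on the whole table yields a logical matrix, which rowSums() accepts.
n_missing <- rowSums(is.na(DT2))

# Keep only the rows with zero missing values
clean2 <- DT2[n_missing == 0]
```

This works for any column type, though `na.omit(DT2)` or `complete.cases(DT2)` remain the more direct choices.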
