1

Given a data.table

DT<-data.table(a=c(1,2,NA,4,5), b=c(2,3,4,NA,5),c=c(1,2,3,4,5),d=c(2,3,4,5,6))

how can I do the equivalent of

DT[!is.na(a) & !is.na(b) & !is.na(c) & !is.na(d)]

in a general form, without knowing any of the column names or typing out !is.na() for each individual column?

I could also do

DT[apply(DT, 1, function(x) !any(is.na(x)))]

but I'm wondering if there's a better way still.

Frank
Dean MacGregor

3 Answers

8

I think you are looking for complete.cases:

> DT[complete.cases(DT),]
   a b c d
1: 1 2 1 2
2: 2 3 2 3
3: 5 5 5 6
Spacedman
  • Or perhaps `na.omit(DT)` – talat Nov 17 '15 at 20:45
  • @docendodiscimus not sure if you suggested `na.omit` with this in mind, but after further review it looks like there's a `na.omit` method within `data.table`, as opposed to just the base function. In any event, it didn't even occur to me to use that function, even though it is completely obvious in retrospect. – Dean MacGregor Nov 17 '15 at 21:17
  • @DeanMacGregor, when I posted the comment I had a feeling that `na.omit` is optimized for `data.table` but wasn't really sure (and didn't bother looking it up). – talat Nov 17 '15 at 21:20
  • Yes, `na.omit` has a `data.table` method. See `methods(na.omit)`. It also has a `cols` argument where you can specify which columns to check when omitting `NA`s. I'd guess that would be the perfect choice here. – David Arenburg Nov 17 '15 at 21:22
1

From @docendodiscimus's comments:

data.table has its own na.omit method, which is optimized for data.tables, so na.omit(DT) drops every row that contains an NA.
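As a minimal sketch of that suggestion, using the DT from the question: calling na.omit on a data.table dispatches to the data.table method, and the cols argument mentioned in the comments restricts the NA check to chosen columns. The names res_all and res_a are just illustrative.

```r
library(data.table)

# The table from the question
DT <- data.table(a = c(1, 2, NA, 4, 5), b = c(2, 3, 4, NA, 5),
                 c = c(1, 2, 3, 4, 5), d = c(2, 3, 4, 5, 6))

# Drop every row that has an NA in any column
res_all <- na.omit(DT)

# Only look at column "a" when deciding which rows to drop;
# the row where b is NA is kept because a is present there
res_a <- na.omit(DT, cols = "a")
```

Here res_all keeps the three complete rows, while res_a keeps four rows, including the one where only b is missing.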

Dean MacGregor
-1

You could do this:

DT[!is.na(rowSums(DT)),]
wolfste4
  • This only works if all columns are numeric/integer/logical; it wouldn't work if any column was `character` or `factor`. – talat Nov 17 '15 at 21:13
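To illustrate that caveat, here is a small sketch with a hypothetical table DT2 (not from the question) that has a character column. The rowSums approach raises an error on it, whereas complete.cases from the accepted answer does not depend on column types.

```r
library(data.table)

# Hypothetical table with a character column, for illustration only
DT2 <- data.table(a = c(1, NA, 3), s = c("x", "y", NA))

# rowSums() needs numeric input, so this attempt raises an error,
# which is caught here and replaced by the string "error"
rs <- tryCatch(DT2[!is.na(rowSums(DT2))], error = function(e) "error")

# complete.cases() works regardless of column type
cc <- DT2[complete.cases(DT2)]
```

Only the first row of DT2 has no missing values, so cc has a single row.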