In R, how do I delete rows that have missing values for all variables? I want to keep the remaining rows, including those that have only some values missing. I have tried the code posted here previously, and it is not working.

IanM

3 Answers

If mydf is a data.table:

mydf[complete.cases(mydf)]

If mydf is a data frame:

subset(mydf, complete.cases(mydf))
Max

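If df is your data table (a matrix or data frame), keep the rows whose count of missing values differs from the number of columns:

df[rowSums(is.na(df)) != ncol(df), ]

Test:

> df <- matrix(c(1, NA, 2, 3, 4, NA, NA, 6, 7, NA, NA, NA), nrow = 4, ncol = 3)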
> df
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]   NA   NA   NA
[3,]    2   NA   NA
[4,]    3    6   NA
> df[rowSums(is.na(df)) != ncol(df), ]
     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2   NA   NA
[3,]    3    6   NA
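
One caveat with matrices: if only a single row survives the filter, plain `[` indexing returns a vector rather than a one-row matrix. A minimal sketch of guarding against that with `drop = FALSE` follows; the names `m`, `keep`, `kept`, `one`, and `one_kept` are illustrative and not part of the answer above.

```r
# Same data as the test above, with explicit dimensions.
m <- matrix(c(1, NA, 2, 3, 4, NA, NA, 6, 7, NA, NA, NA), nrow = 4, ncol = 3)

# TRUE for rows that have at least one non-missing value.
keep <- rowSums(is.na(m)) != ncol(m)

# drop = FALSE keeps the result a matrix even if only one row survives.
kept <- m[keep, , drop = FALSE]

# Edge case: a 2 x 2 matrix in which only the first row has any data.
one <- matrix(c(1, NA, NA, NA), nrow = 2, ncol = 2)
one_kept <- one[rowSums(is.na(one)) != ncol(one), , drop = FALSE]

# one_kept is still a matrix with one row and two columns.
print(dim(one_kept))
```

For data frames this is not an issue as long as there is more than one column, since row subsetting then always returns a data frame.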
Song Zhengyi

If I understand the question correctly, you do not want to remove every row that has any missing data, which is what complete.cases does, but only the rows in which all values are missing.

library(tibble)
df <- tribble(
    ~x, ~y, ~z,
     1, NA,  2,
    NA, NA, NA, # remove this row
    NA,  3,  4,
     5, NA, NA
)

Here we want to remove the second row and only the second.
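
To make the difference from complete.cases concrete, here is a small check. It rebuilds the same example data as a base data.frame (an assumption on my part, so the snippet does not depend on tibble); `complete`, `n_complete`, and `all_missing` are illustrative names.

```r
# Base data.frame copy of the example data (assumption: avoids needing tibble).
df <- data.frame(
    x = c(1, NA, NA, 5),
    y = c(NA, NA, 3, NA),
    z = c(2, NA, 4, NA)
)

# complete.cases() is TRUE only for rows with no missing values at all.
complete <- complete.cases(df)

# Every row here has at least one NA, so filtering on it would keep nothing.
n_complete <- sum(complete)

# Rows in which all values are missing: only the second one.
all_missing <- which(rowSums(is.na(df)) == ncol(df))

print(n_complete)
print(all_missing)
```

With this data, filtering on complete.cases would leave an empty table, while only one row is entirely missing.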

You can apply a function over the rows of the table to get a logical vector that marks the rows in which all values are missing:

to_remove <- apply(
    df, 1, # map over rows
    function(row) all(is.na(row))
)

and then you can keep those where to_remove is FALSE.

df[!to_remove,]
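
Put together, the steps above could be wrapped in a small helper. This is only a sketch: `drop_all_na_rows` is a hypothetical name, and the data is rebuilt as a base data.frame so the snippet runs without tibble.

```r
# Hypothetical helper: drop the rows of a data frame in which every value is NA.
drop_all_na_rows <- function(data) {
    # One logical per row: TRUE when all values in that row are missing.
    to_remove <- apply(data, 1, function(row) all(is.na(row)))
    # Keep the other rows; drop = FALSE preserves the data frame structure.
    data[!to_remove, , drop = FALSE]
}

# Same example data as above, as a base data.frame.
df <- data.frame(
    x = c(1, NA, NA, 5),
    y = c(NA, NA, 3, NA),
    z = c(2, NA, 4, NA)
)

result <- drop_all_na_rows(df)
print(result)
```

With a base data.frame the surviving rows keep their original row names (1, 3, and 4 here); `rownames(result) <- NULL` renumbers them if that matters.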
Thomas Mailund