Subsetting rows in R generates mysterious NA row [Version 2.0]

Question

There are a few similar questions, such as

Subsetting R data frame results in mysterious NA rows

However, they don't answer my question, because I do not understand the claim "If your code is analogous to this example (of the form d[d$v == x, ]), your problem is indeed almost certainly NAs in your column." The example below seems to demonstrate that this is not the case.

For example:

iris
gsub(1.8, NA, iris$Petal.Width)
iris[iris$Petal.Width == 2.0,]

generates

  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
111          6.5         3.2          5.1           2 virginica
114          5.7         2.5          5.0           2 virginica
122          5.6         2.8          4.9           2 virginica
123          7.7         2.8          6.7           2 virginica
132          7.9         3.8          6.4           2 virginica
148          6.5         3.0          5.2           2 virginica

Clearly, no mysterious NA row appears, despite there being plenty of NAs in the referenced column.

My problem is that I am currently subsetting in the form df[df$var == x, ]. However, doing so gives me back an NA row every time, in the form:

NA   NA  <NA> <NA> <NA>     <NA>            <NA> <NA> <NA>  NA

I would give an actual reproducible example, but the spreadsheet is confidential.

I've deleted the rows with NA in them and this problem continues — PyPer User, Jun 12 '15 at 07:56
Try `iris[with(iris, Petal.Width==2 & !is.na(Petal.Width)),]` — akrun, Jun 12 '15 at 07:59

grrgrrbla · Accepted Answer · 2015-06-12T08:01:56.263

Using your example (which doesn't show any NAs because you forgot to reassign the variable):

iris
iris$Petal.Width <- gsub(1.8, NA, iris$Petal.Width)
iris[!is.na(iris$Petal.Width) & iris$Petal.Width == 2.0,]

This also works:

iris[complete.cases(iris$Petal.Width) & iris$Petal.Width == 2, ]

Both give the following output:

    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
111          6.5         3.2          5.1           2 virginica
114          5.7         2.5          5.0           2 virginica
122          5.6         2.8          4.9           2 virginica
123          7.7         2.8          6.7           2 virginica
132          7.9         3.8          6.4           2 virginica
148          6.5         3.0          5.2           2 virginica
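
As a minimal sketch of why the NA row shows up in the first place (using a small made-up data frame `d`, not your confidential data): comparing NA with `==` yields NA rather than FALSE, and indexing rows with an NA logical value returns a row filled with NAs. Wrapping the condition in `which()`, or using `subset()`, should drop those positions as well.

```r
# Hypothetical toy data; the names d, v and w are made up for illustration.
d <- data.frame(v = c(1, 2, NA, 2), w = c("a", "b", "c", "d"))

# NA == 2 evaluates to NA, so the logical index contains an NA ...
idx <- d$v == 2
print(idx)

# ... and indexing rows with that NA produces a row of NAs.
with_na <- d[idx, ]
print(with_na)

# which() keeps only the positions where the condition is TRUE.
clean_which <- d[which(d$v == 2), ]

# subset() likewise treats an NA in the condition as FALSE.
clean_subset <- subset(d, v == 2)
print(clean_which)
```

Note that the NA in the index does not have to come from a row you consider "empty": a single missing value in the column being compared is enough.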

Read these links for an introduction to NAs in R: http://www.statmethods.net/input/missingdata.html http://www.ats.ucla.edu/stat/r/faq/missing.htm
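
As a small illustration of the special treatment NAs need (a sketch on a made-up vector `x`, not taken from the links above): a missing value cannot be found with `==`, whether you compare against `NA` or against the string "NA"; `is.na()` is the function meant for that. The sketch also assigns NA by logical indexing, which keeps the vector numeric, whereas `gsub()` coerces it to character.

```r
# Made-up values for illustration.
x <- c(1.8, 2.0, 0.2)

# Assign NA by logical indexing; the vector stays numeric.
x[x == 1.8] <- NA

# Comparing with NA, or with the string "NA", never returns TRUE
# for a missing value.
eq_na  <- x == NA
eq_str <- x == "NA"

# is.na() is the way to test for missing values.
missing <- is.na(x)
print(missing)
```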

My bad. Completely forgot about complete.cases() and simply searched using =="NA". Thanks for the help — PyPer User, Jun 12 '15 at 08:12
No problem, but read about NAs; they require a treatment that is very different from other values in R — grrgrrbla, Jun 12 '15 at 08:14
