
There are a few questions about something similar, such as

Subsetting R data frame results in mysterious NA rows

However, they don't answer my question, because I do not understand the claim that "if your code is analogous to this example (of the form d[d$v == x, ]), your problem is indeed almost certainly NAs in your column." The example below demonstrates that this is not the case.

For example:

iris
gsub(1.8, NA, iris$Petal.Width)
iris[iris$Petal.Width == 2.0,]

generates

  Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
111          6.5         3.2          5.1           2 virginica
114          5.7         2.5          5.0           2 virginica
122          5.6         2.8          4.9           2 virginica
123          7.7         2.8          6.7           2 virginica
132          7.9         3.8          6.4           2 virginica
148          6.5         3.0          5.2           2 virginica

Clearly no mysterious NA row appears, despite there being plenty of NAs in the referenced column.

My problem is that I am subsetting in the form df[df$var == x, ], and doing so gives me back an NA row every time, in the form:

NA   NA  <NA> <NA> <NA>     <NA>            <NA> <NA> <NA>  NA 

I would give an actual reproducible example, but the spreadsheet is confidential.

PyPer User

1 Answer


Using your example (which doesn't show any NA rows because you forgot to assign the result of gsub() back to the column):

iris
iris$Petal.Width <- gsub(1.8, NA, iris$Petal.Width)
iris[!is.na(iris$Petal.Width) & iris$Petal.Width == 2.0,]

This also works:

iris[complete.cases(iris$Petal.Width) & iris$Petal.Width == 2, ]

Both give the following output:

    Sepal.Length Sepal.Width Petal.Length Petal.Width   Species
111          6.5         3.2          5.1           2 virginica
114          5.7         2.5          5.0           2 virginica
122          5.6         2.8          4.9           2 virginica
123          7.7         2.8          6.7           2 virginica
132          7.9         3.8          6.4           2 virginica
148          6.5         3.0          5.2           2 virginica
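To see why the unguarded form returns NA rows, here is a minimal sketch on a small made-up data frame (the names d, id and v are hypothetical, not taken from the question's spreadsheet). Comparing NA with a value gives NA rather than FALSE, and a logical NA used as a row index produces a row filled with NA.

```r
# Hypothetical toy data, standing in for the confidential spreadsheet
d <- data.frame(id = 1:4, v = c(2, NA, 1.8, 2))

# NA == 2 is NA, not FALSE, so the mask itself contains an NA
mask <- d$v == 2
print(mask)

# Indexing rows with a logical NA yields one all-NA row per NA in the mask
with_na_row <- d[mask, ]
print(with_na_row)

# Guarding with !is.na() turns those NA entries into FALSE
clean <- d[!is.na(d$v) & d$v == 2, ]
print(clean)
```

If the column in the real data contains NA, each one would be expected to show up as one such all-NA row in the unguarded subset.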

Read these links as an introduction to NAs in R: http://www.statmethods.net/input/missingdata.html http://www.ats.ucla.edu/stat/r/faq/missing.htm
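As a further sketch, base R has a few other idioms that skip NA when filtering rows: which() drops NA positions, subset() treats NA in the condition as FALSE, and %in% returns FALSE for an NA that is not in the lookup set. The data frame below is again a made-up example (d, id and v are hypothetical names).

```r
# Hypothetical toy data with one NA in the filtered column
d <- data.frame(id = 1:4, v = c(2, NA, 1.8, 2))

# which() returns only the positions that are TRUE, so NA is dropped
by_which <- d[which(d$v == 2), ]

# subset() treats NA in the condition as FALSE
by_subset <- subset(d, v == 2)

# %in% gives FALSE (not NA) for NA values absent from the right-hand side
by_in <- d[d$v %in% 2, ]

print(by_which)
```

Which of these reads best is largely a matter of taste; the explicit !is.na() guard makes the intent most visible.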

grrgrrbla