
I have a character data frame in R which has NaNs in it. I need to remove any row with a NaN and then convert it to a numeric data frame.

If I just call as.numeric on the data frame, I get the following error:

Error: (list) object cannot be coerced to type 'double'
 1:
 0:
Brian Tompsett - 汤莱恩
ganesh reddy

2 Answers


As @thijs van den bergh points out,

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)

dat <- as.data.frame(sapply(dat, as.numeric)) #<- sapply is here

dat[complete.cases(dat), ]
#  x y
#2 2 3

is one way to do this.

Your error comes from calling as.numeric on the whole data.frame. The sapply call I show instead converts each column vector to numeric.
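To make the difference concrete, here is a minimal sketch using the same example data. The names `whole` and `m` are only illustrative, and the exact wording of the error message may differ between R versions.

```r
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# Coercing the whole data.frame fails, because a data.frame is a list of columns.
# tryCatch captures the error message instead of stopping the script.
whole <- tryCatch(as.numeric(dat), error = function(e) conditionMessage(e))
print(whole)

# Coercing a single column works, because each column is a plain vector.
as.numeric(dat$x)

# sapply repeats that for every column; with this data the result is a matrix,
# which is why the answer wraps it in as.data.frame().
m <- sapply(dat, as.numeric)
is.matrix(m)
```

Note that sapply simplifies its result, so the shape of the output depends on the input; the as.data.frame() wrapper in the answer assumes it comes back as a matrix.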

user1317221_G
  • Error in lapply(X = X, FUN = FUN, ...) : (converted from warning) NAs introduced by coercion 1: as.data.frame(sapply(time, as.numeric)) 2: sapply(time, as.numeric) 3: lapply(X = X, FUN = FUN, ...) 4: .signalSimpleWarning("NAs introduced by coercion", quote(lapply(X = X, FUN = FUN, ...))) 5: withRestarts({ 6: withOneRestart(expr, restarts[[1]]) 7: doWithOneRestart(return(expr), restart) – ganesh reddy Feb 05 '13 at 21:47
  • Not sure how you mean, which sapply options? – ganesh reddy Feb 05 '13 at 21:47
  • Both the solutions don't work, I am getting the same error, maybe I am missing something? – ganesh reddy Feb 05 '13 at 21:52
  • like I said in my comment, if you provide some sample data we can get to the bottom of this. :) Try posting the output of `dput(yourdata)` – user1317221_G Feb 05 '13 at 21:54
  • I will try and make sense of the answers guys, I think I am fumbling somewhere, but should be able to help myself :) – ganesh reddy Feb 05 '13 at 21:58
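
A note on the error quoted in the comments above: "NAs introduced by coercion" is normally only a warning, raised when as.numeric meets a string it cannot parse. The "(converted from warning)" prefix suggests the session was turning warnings into errors (for example via options(warn = 2)), though that is an inference from the message, not something the thread confirms. The sketch below uses a hypothetical column `x`, not the asker's data, to show one way to find the offending values.

```r
# Hypothetical column: "n/a" is not a number, while "NaN" parses without a warning
x <- c("1.5", "NaN", "n/a")

# Capture the warning message instead of letting it propagate
msg <- tryCatch(as.numeric(x), warning = function(w) conditionMessage(w))

# Convert quietly, then flag entries that became NA but are not NaN
converted <- suppressWarnings(as.numeric(x))
bad <- is.na(converted) & !is.nan(converted)
x[bad]
```

Inspecting `x[bad]` shows which strings need cleaning before the conversion can run without the warning.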

Note that data.frames are not themselves numeric or character. A data.frame is a list whose columns can be all numeric, all character, or a mix of these and other types (e.g. Date or logical).

dat <- data.frame(x=c("NaN","2"),y=c("NaN","3"),stringsAsFactors=FALSE)
is.list(dat)
# [1] TRUE

The example data just has two character columns:

> str(dat)
'data.frame':   2 obs. of  2 variables:
 $ x: chr  "NaN" "2"
 $ y: chr  "NaN" "3"

...which you could add a numeric column to like so:

> dat$num.example <- c(6.2,3.8)
> dat
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
> str(dat)
'data.frame':   2 obs. of  3 variables:
 $ x          : chr  "NaN" "2"
 $ y          : chr  "NaN" "3"
 $ num.example: num  6.2 3.8

So, when you call as.numeric on the data.frame, R throws an error because it cannot coerce a list, which may hold columns of several types, into a single numeric vector. user1317221_G's answer uses sapply (see ?sapply), which applies a function to each element of an object, here each column. You could alternatively use lapply (see ?lapply), a very similar function that always returns a list. (Read more on the *apply functions in "R Grouping functions: sapply vs. lapply vs. apply. vs. tapply vs. by vs. aggregate".)

That is, in this case you can apply the as.numeric function to each column of your data.frame, like so:

data.frame(lapply(dat,as.numeric))

The lapply call is wrapped in a data.frame to make sure the output is a data.frame and not a list. That is, running:

lapply(dat,as.numeric)

will give you:

> lapply(dat,as.numeric)
$x
[1] NaN   2

$y
[1] NaN   3

$num.example
[1] 6.2 3.8

While:

data.frame(lapply(dat,as.numeric))

will give you:

>  data.frame(lapply(dat,as.numeric))
    x   y num.example
1 NaN NaN         6.2
2   2   3         3.8
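
To cover both steps from the question (convert to numeric, then drop rows containing NaN), the pieces above can be combined as in the following sketch. It swaps in the `dat[] <- lapply(...)` idiom, which keeps the existing data.frame and replaces its columns, as an alternative to wrapping the result in data.frame(); the name `clean` is only illustrative.

```r
dat <- data.frame(x = c("NaN", "2"), y = c("NaN", "3"), stringsAsFactors = FALSE)

# Replace every column with its numeric version, keeping the data.frame structure
dat[] <- lapply(dat, as.numeric)

# complete.cases() treats NaN as missing, so this keeps only rows without NaN or NA
clean <- dat[complete.cases(dat), ]
str(clean)
```

This assumes every column holds strings that as.numeric can parse; any other string would become NA with a warning and its row would be dropped as well.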
Community
thelatemail
  • great answer. But R needs to fix this. It's things like this that turn people away from our community. – MadmanLee Apr 30 '19 at 16:37