I want to set the NA values to zero for a specific column. I followed the instructions from this question, and the solution works at the R prompt.

However, it does not work when I place the commands in a function.

Here is an example:

d <- data.frame(colA=c("Joe", "Jane", "Jack"), colB=c(25, NA, 35), colC=c(100, 200, NA))
d
#   colA colB colC
# 1  Joe   25  100
# 2 Jane   NA  200
# 3 Jack   35   NA

I want to replace the NA in colB with 0, so I followed another StackOverflow post to produce this working command:

d$colB[is.na(d$colB)] <- 0

Now I want to wrap this in a function that takes the column as an argument, so that I don't have to type the column name twice.

setNAToValue <- function(column, value) {
  column[is.na(column)] <- value
}

However, when I apply it, nothing happens:

setNAToValue(d$colB, 0)
d
#   colA colB colC
# 1  Joe   25  100
# 2 Jane   NA  200
# 3 Jack   35   NA

Now, when I change the <- to <<- (following the instructions in this post), I get an error:

setNAToValue(d$colB, 0)
# Error in column[is.na(column)] <<- value : object 'column' not found

How can I fix the problem?

stackoverflowuser2010
  • You probably want to not do that in a function. Functions are generally self-contained, with inputs, outputs and no side effects (like modifying things outside). If you really want a function, install data.table and do `setColNAToValue = function(colname, value, data=d) set(data, i = which(is.na(data[[colname]])), j=colname, value=value)` – Frank Mar 17 '16 at 01:45
  • `d$colB` will be passed in to the function as an object, not as a reference to a column in `df`. You need to pass in both the `df` and `"b"` and then do the replacement and then return all of `df` – thelatemail Mar 17 '16 at 01:46
  • you seem to be rewriting the `replace` function `replace(d, col(d) == 2 & is.na(d), 0)` – rawr Mar 17 '16 at 01:49
  • @thelatemail: How do I concatenate `df` and `"b"` to form a reference to the column `df$b`? – stackoverflowuser2010 Mar 17 '16 at 02:33
  • @stackoverflowuser2010 - something like `f – thelatemail Mar 17 '16 at 03:05
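
The approach described in the comments (pass the whole data frame together with a column name, then return the modified data frame) could look roughly like the sketch below. The function name `set_col_na` and its argument names are made up for illustration; they do not come from the comments. The key point is that R functions work on copies of their arguments, so the result has to be assigned back to `d`.

```r
# Hypothetical helper (name and arguments are illustrative assumptions):
# takes the whole data frame plus a column name and returns a modified copy.
set_col_na <- function(data, colname, value) {
  col <- data[[colname]]       # extract the column by name
  col[is.na(col)] <- value     # replace the NA entries in the local copy
  data[[colname]] <- col       # write the column back into the local data frame
  data                         # return the modified copy
}

d <- data.frame(colA = c("Joe", "Jane", "Jack"),
                colB = c(25, NA, 35),
                colC = c(100, 200, NA))

# The function cannot change d itself, so assign the result back.
d <- set_col_na(d, "colB", 0)
d$colB
# [1] 25  0 35
```

Only `colB` is touched here; the NA in `colC` stays as it was.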

1 Answer

Try writing an R replacement function:

"setNA<-" <- function(x, value) ifelse(is.na(x), value, x)

# test using d from question
setNA(d$colB) <- 0

Now we have:

> d
  colA colB colC
1  Joe   25  100
2 Jane    0  200
3 Jack   35   NA
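
As a rough illustration of why this works: when R sees a call on the left-hand side of an assignment, it looks for a function named with a trailing `<-`, calls it with the right-hand side passed as the argument named `value`, and assigns the result back to the target. Written out by hand, the call above is approximately equivalent to the following sketch (R uses a temporary variable internally, which is omitted here).

```r
# Same definition as in the answer; the last argument must be named `value`.
"setNA<-" <- function(x, value) ifelse(is.na(x), value, x)

d <- data.frame(colA = c("Joe", "Jane", "Jack"),
                colB = c(25, NA, 35),
                colC = c(100, 200, NA))

# Approximately what `setNA(d$colB) <- 0` expands to:
d$colB <- `setNA<-`(d$colB, value = 0)
d$colB
# [1] 25  0 35
```

The function itself still has no side effects; it is the assignment that R wraps around it that updates `d`.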
G. Grothendieck
  • Neat, but wouldn't `"setNA – thelatemail Mar 17 '16 at 03:09
  • It might be a bit faster -- you would have to test it. In that case you could use `replace(x, is.na(x), value)` to reduce the body to a single statement; however, `ifelse` is pretty clear and that might be the overriding consideration rather than speed. – G. Grothendieck Mar 17 '16 at 03:15
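
For reference, the `replace` variant mentioned in the last comment might be written as in the sketch below. Apart from speed, one practical difference is that `ifelse` builds its result from the logical test vector and so drops attributes such as a `Date` class, whereas `replace` assigns into `x` and keeps them. The `dates` vector and the fill date are illustrative assumptions, not part of the original question.

```r
# Replacement function using replace() instead of ifelse().
"setNA<-" <- function(x, value) replace(x, is.na(x), value)

d <- data.frame(colA = c("Joe", "Jane", "Jack"),
                colB = c(25, NA, 35),
                colC = c(100, 200, NA))

setNA(d$colB) <- 0
d$colB
# [1] 25  0 35

# Illustrative data (an assumption): a Date vector with a missing value.
dates <- as.Date(c("2016-03-17", NA))
fill  <- as.Date("2016-01-01")

class(ifelse(is.na(dates), fill, dates))    # "numeric": the Date class is lost
class(replace(dates, is.na(dates), fill))   # "Date": the class is preserved
```

For plain numeric columns such as `colB`, both versions give the same result.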