Replace Zero by value of another variable

Question

This post is similar to the post "Replace NA in column with value in adjacent column", but now, if X6 = 0, the zero must be replaced by the value of X5. If I try

mydat$X6[0(mydat$X6)] <- mydat$X5[0(mydat$X6)]

I get, of course, this error: Error: attempt to apply non-function

mydat <- structure(list(ItemRelation = c(158200L, 158204L),
                        DocumentNum = c(1715L, 1715L),
                        CalendarYear = c(2018L, 2018L),
                        X1 = c(0L, 0L), X2 = c(0L, 0L), X3 = c(0L, 0L),
                        X4 = c(NA, NA), X5 = c(107L, 105L), X6 = c(0, 0)),
                   .Names = c("ItemRelation", "DocumentNum", "CalendarYear",
                              "X1", "X2", "X3", "X4", "X5", "X6"),
                   class = "data.frame", row.names = c(NA, -2L))

How can I replace the zeros in X6 with the values of X5 to get the desired output?

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5 X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105

score 3 · Answer 1 · answered Sep 03 '18 at 07:47

Create a logical vector and use it to subset both the replacement column and the column being replaced, so that the lengths are equal in the assignment:

i1 <- mydat$X6 == 0
mydat$X6[i1] <- mydat$X5[i1]

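The 0(mydat$X6) syntax is not clear; it may have been meant as a pseudo function. R reads it as an attempt to call the number 0 as a function, which is why it raises "attempt to apply non-function".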
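To make the logical-index approach concrete, here is a minimal sketch on made-up vectors (x5 and x6 are illustrative values, not the question's data). It also covers one case the question does not mention, as an assumption about messier data: if X6 may contain NA, the comparison yields NA at that position, and wrapping it in which() keeps the NA out of the index.

```r
# Illustrative vectors (hypothetical values, not from the question)
x5 <- c(107, 105, 99, 42)
x6 <- c(0, 0, 5, NA)

# x6 == 0 yields a logical vector; the NA in x6 propagates into it
i1 <- x6 == 0
print(i1)    # TRUE TRUE FALSE NA

# which() keeps only the positions that are TRUE, dropping FALSE and NA
idx <- which(i1)

# Replace the zeros in x6 with the values at the same positions in x5
x6[idx] <- x5[idx]
print(x6)    # 107 105 5 NA
```

If the column is known to be free of NA, the plain logical index from the answer is enough.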

score 3 · Answer 2 · answered Sep 03 '18 at 07:52

You can also use replace, i.e.

mydat$X6 <- with(mydat, replace(X6, X6 == 0, X5[X6 == 0]))

#  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
#1       158200        1715         2018  0  0  0 NA 107 107
#2       158204        1715         2018  0  0  0 NA 105 105
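As a small sketch of what replace does here (again on made-up vectors, not the question's data): it returns a modified copy and leaves its input untouched, which is why the result has to be assigned back to mydat$X6.

```r
# Illustrative vectors (hypothetical values, not from the question)
x5 <- c(107, 105, 99)
x6 <- c(0, 0, 5)

# replace(x, list, values) returns a modified copy of x
res <- replace(x6, x6 == 0, x5[x6 == 0])
print(res)   # 107 105 5
print(x6)    # 0 0 5 (unchanged until the result is assigned back)

# The same result via explicit indexed assignment on a copy
tmp <- x6
tmp[x6 == 0] <- x5[x6 == 0]
stopifnot(identical(res, tmp))
```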

Andre Elrico · Answer 3 · 2018-09-03T08:04:36.050

You can use ?ifelse

mydat$X6 <- ifelse(mydat$X6 == 0, mydat$X5, mydat$X6)

#  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
#1       158200        1715         2018  0  0  0 NA 107 107
#2       158204        1715         2018  0  0  0 NA 105 105
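A short sketch of how ifelse behaves when the tested column contains NA, which the question's data does not, so treat this as an assumption about other inputs. The vectors are made up for illustration.

```r
# Illustrative vectors (hypothetical values, not from the question)
x5 <- c(107, 105, 99, 42)
x6 <- c(0, 0, 5, NA)

# Where the test is NA, ifelse returns NA, so a missing x6 stays missing
res <- ifelse(x6 == 0, x5, x6)
print(res)   # 107 105 5 NA
```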

Looking at benchmarks for a larger dataset, ifelse seems to be slower than the other two.

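# Note: X6 = 1:999999 contains no zeros, so no value is actually replaced here;
# the timings measure the comparison and indexing overhead only.
mydat <- data.frame(X6 = 1:999999, X5 = sample(0:1, 999999, replace = TRUE))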

akrun <- function(mydat) {
    i1 <- mydat$X6 == 0
    mydat$X6[i1] <- mydat$X5[i1]
}

sotos <- function(mydat) {
    mydat$X6 <- with(mydat, replace(X6, X6 == 0, X5[X6 == 0]))
}

elrico <- function(mydat) {
    mydat$X6 <- ifelse(mydat$X6 == 0, mydat$X5, mydat$X6)
}

microbenchmark::microbenchmark(elrico(mydat),akrun(mydat),sotos(mydat), times = 100)

#Unit: milliseconds
#          expr       min        lq      mean    median        uq      max neval cld
# elrico(mydat) 42.809477 47.591964 56.814627 49.750948 51.972969 148.7152   100   c
#  akrun(mydat)  5.068961  5.206103  8.277144  5.399385  9.516853 106.4254   100 a  
#  sotos(mydat)  7.966428  8.199167 16.903062 11.996958 13.774511 110.4206   100  b

So if you need speed and are working with larger datasets, take akrun's or Sotos's solution. Otherwise, you can take mine, which is IMO syntactically the most "beautiful".
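The benchmark functions above assign inside the function and do not return the modified data frame, so they cannot be compared directly. A minimal sketch of a correctness check, under a few assumptions: the names by_index, by_replace and by_ifelse are hypothetical wrappers around the three answers, and dat is made-up test data in which X6 does contain zeros.

```r
set.seed(1)                       # arbitrary seed, for reproducibility only
n <- 1000L
# Made-up test data: X6 is a random mix of 0 and 1, X5 is never 0
dat <- data.frame(X5 = seq_len(n), X6 = sample(0:1, n, replace = TRUE))

# Hypothetical wrappers that return the modified data frame
by_index <- function(d) {
    i1 <- d$X6 == 0
    d$X6[i1] <- d$X5[i1]
    d
}

by_replace <- function(d) {
    d$X6 <- with(d, replace(X6, X6 == 0, X5[X6 == 0]))
    d
}

by_ifelse <- function(d) {
    d$X6 <- ifelse(d$X6 == 0, d$X5, d$X6)
    d
}

r1 <- by_index(dat)
r2 <- by_replace(dat)
r3 <- by_ifelse(dat)

# All three approaches should agree, and no zero should remain in X6
stopifnot(isTRUE(all.equal(r1, r2)), isTRUE(all.equal(r1, r3)))
stopifnot(!any(r1$X6 == 0))
```

The same wrappers could be passed to microbenchmark::microbenchmark in place of the originals if timings on data with actual zeros are wanted.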
