Say this is my data.

mydat=structure(list(ItemRelation = c(158200L, 158204L), DocumentNum = c(1715L, 
1715L), CalendarYear = c(2018L, 2018L), X1 = c(0L, 0L), X2 = c(0L, 
0L), X3 = c(0L, 0L), X4 = c(NA, NA), X5 = c(107L, 105L), X6 = c(NA, 
NA)), .Names = c("ItemRelation", "DocumentNum", "CalendarYear", 
"X1", "X2", "X3", "X4", "X5", "X6"), class = "data.frame", row.names = c(NA, 
-2L))

How can I replace each NA in X6 with the value of X5 from the same row?

In this example, the desired output would be:

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105
Ronak Shah
D.Joe

1 Answer

You can use sapply in base R:

mydat[,c("X5","X6")] <- with(mydat, sapply(mydat[8:9],function(x) ifelse(is.na(X6),X5,X6)))

This gives the desired output:

  ItemRelation DocumentNum CalendarYear X1 X2 X3 X4  X5  X6
1       158200        1715         2018  0  0  0 NA 107 107
2       158204        1715         2018  0  0  0 NA 105 105

Explanation:

ifelse examines whether the X6 value for a given row is NA, and if so, selects the value of X5 from that row. If X6 is not NA, then just X6 is used.

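sapply applies this function to each of the two selected columns (mydat[8:9], i.e. X5 and X6), not to each row; ifelse is already vectorized over rows. Because the function ignores its argument x, both columns are assigned the same result, so X5 would also be overwritten wherever X6 is not NA.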

with evaluates the expression using mydat as the environment, so that you can refer to its columns by name without using $ or [].
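
For comparison, here is a minimal sketch of the same replacement without sapply, assuming only X6 should change. The data frame df and its values are hypothetical, chosen so that X6 contains one non-NA entry. Since ifelse works on whole vectors, its result can be assigned to the column directly.

```r
# Hypothetical data: X6 is NA in rows 1 and 3, but not in row 2
df <- data.frame(X5 = c(107L, 105L, 50L), X6 = c(NA, 99L, NA))

# Take X5 where X6 is NA, otherwise keep X6; X5 itself is not modified
df$X6 <- ifelse(is.na(df$X6), df$X5, df$X6)

print(df)
```

In this sketch row 2 keeps its original X6 value, and X5 is left as it was in every row.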

theforestecologist
  • How can this be a correct answer if the contents of `X5` is copied over all other columns? Even `ItemRelation` and `DocumentNum` have been overwritten. – Uwe Sep 10 '18 at 09:18
  • Your edit makes it worse. Column `X6` is now replaced by **two** columns `X6.X5` and `X6.X6`. This is caused by the unnecessary call to `sapply()`. IMHO, the correct base R solution would be `mydat$X6 – Uwe Sep 11 '18 at 08:27
  • @Uwe, I noticed that I totally misread things the first time, which is why I initially incorporated the `sapply`. I stuck with it since that's the approach that was "accepted" by the OP, but then I realized I also made a typo in my edits. It's now fixed. I agree that Ronak's answer is better. If he does not add it as an answer, I can incorporate it into mine. – theforestecologist Sep 11 '18 at 19:50