I'm new to R programming. I read a CSV file and want to replace the NA values in one column with the value from another column in the same row. I wrote the 'if' statement shown below, but instead of replacing only the NA values, every value in that column gets replaced by the value from the other column. What is going wrong here? Any help is welcome. The data looks like this:

Group   Skill
 A1      ABC
 A1      ABC
 A1      ABC
 A1      ABC
 A1       
 A1      
 A1       
 A1

The desired result is

 Group   Skill
 A1      ABC
 A1      ABC
 A1      ABC
 A1      ABC
 A1      A1
 A1      A1
 A1      A1
 A1      A1

The result I'm getting now

Group   Skill
 A1      A1
 A1      A1
 A1      A1
 A1      A1
 A1      A1
 A1      A1
 A1      A1
 A1      A1

The if statement I wrote is

df<- read.csv("Data.csv",header=T,na.strings=c(""))
if (is.na(df$Skill)) {
    df$Skill <- df$Group
      }

2 Answers

One option would be to use the coalesce function from the dplyr package:

library(dplyr)
# Keep Skill where it is present; otherwise fall back to Group
df$Skill <- coalesce(df$Skill, df$Group)

For rows where Skill has a non-NA value, the value remains as is. Otherwise, the NA is replaced with whatever is in the Group column of the same row.
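As a small self-contained sketch of the above: the inline CSV text below is a hypothetical stand-in for Data.csv, and the example assumes both columns are read as character vectors (hence stringsAsFactors = FALSE). In R versions before 4.0, read.csv converts strings to factors by default, and replacing values across factor columns with different levels can trigger warnings, so reading the columns as plain strings is a cautious choice here.

```r
library(dplyr)

# Hypothetical stand-in for Data.csv: empty Skill fields become NA
csv_text <- "Group,Skill\nA1,ABC\nA1,ABC\nA1,\nA1,"
df <- read.csv(text = csv_text, na.strings = "", stringsAsFactors = FALSE)

# coalesce() works element by element: first non-NA value per row
df$Skill <- coalesce(df$Skill, df$Group)
print(df)
```

Only the rows where Skill was NA pick up the Group value; the existing "ABC" entries are left alone.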

Tim Biegeleisen
Hi @Tim Biegeleisen, thanks for your reply. But after I ran the two lines you mentioned above, I got a warning message in the console: "Warning message: In `[ – Kailash Sharma Oct 30 '18 at 10:01

This would be a solution with base R subsetting:

    df$Skill[is.na(df$Skill)] <- df$Group[is.na(df$Skill)]

Or with dplyr:

    library(dplyr)
    df <- df %>% mutate(Skill = ifelse(is.na(Skill), Group, Skill))

It takes the Group value if Skill is NA and the Skill value if a Skill value exists.
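On why the original attempt misbehaves: if () makes one decision for the whole statement, not one per row, so it cannot replace some rows and skip others. It expects a single TRUE/FALSE; as of R 4.2.0 a condition of length greater than one is an error. The sketch below uses a hypothetical data frame built to match the question's sample (character columns assumed) and applies the base R subsetting approach with a per-row logical vector.

```r
# Hypothetical stand-in for the question's data, with character columns
df <- data.frame(
  Group = rep("A1", 8),
  Skill = c(rep("ABC", 4), rep(NA, 4)),
  stringsAsFactors = FALSE
)

# is.na() returns one TRUE/FALSE per row; use it as an index
missing <- is.na(df$Skill)
df$Skill[missing] <- df$Group[missing]

print(df)
```

Computing the index once and reusing it on both sides keeps the replaced rows and the source rows aligned.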

FloSchmo