Manually replace missing value in a column based on another column

Question

I've got a dataset that looks like this:

geo     mark     value
texas   nissan   2 
texas   nissan   78
ny      NA       65
ny      NA       15
ca      audi     22

I want to manually replace the NA in the column mark based on the geo column. So, in the example above, for every row of geo called 'ny' I want to insert 'toyota', like this:

geo     mark     value
texas   nissan   2 
texas   nissan   78
ny      toyota   65
ny      toyota   15
ca      audi     22

How can I do this?

score 2 · Accepted Answer · answered Jan 07 '21 at 16:55

Does this work? I wasn't sure whether mark is NA for every row where geo == 'ny' in your data, so I added & is.na(mark) to leave any existing values untouched.

library(dplyr)
df %>% mutate(mark = case_when(geo == 'ny' & is.na(mark) ~ 'toyota', TRUE ~ mark))
# A tibble: 5 x 3
  geo   mark   value
  <chr> <chr>  <dbl>
1 texas nissan     2
2 texas nissan    78
3 ny    toyota    65
4 ny    toyota    15
5 ca    audi      22
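
For anyone who wants to run this end to end, here is a minimal reproducible sketch. The construction of `df` is an assumption based on the table in the question (the column types are guessed), and the names `filled` and `filled_base` are only illustrative. It also shows a base R alternative using logical indexing, which is not part of the answer above.

```r
library(dplyr)

# Rebuild the example data from the question (column types are assumed).
df <- tibble(
  geo   = c("texas", "texas", "ny", "ny", "ca"),
  mark  = c("nissan", "nissan", NA, NA, "audi"),
  value = c(2, 78, 65, 15, 22)
)

# Fill mark only where geo is "ny" AND mark is missing; keep all other values.
filled <- df %>%
  mutate(mark = case_when(
    geo == "ny" & is.na(mark) ~ "toyota",
    TRUE ~ mark
  ))

# Base R alternative: assign into the rows selected by a logical index.
filled_base <- df
filled_base$mark[filled_base$geo == "ny" & is.na(filled_base$mark)] <- "toyota"
```

Both approaches should leave rows that already have a mark unchanged; the base R version modifies a copy, so `df` itself stays as it was.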

Karthik S


You could also replace the last condition (`TRUE ~ mark`) with `TRUE ~ replace_na(mark, "some_default")` (`replace_na` comes from tidyr) if you want no leftover NA values – Charlie Gallagher Jan 07 '21 at 17:02
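
A short sketch of that suggestion, assuming tidyr is installed. The data frame `df2` and the name `result` are hypothetical, chosen so that one NA falls outside the 'ny' rows and actually reaches the fallback branch.

```r
library(dplyr)
library(tidyr)  # replace_na() is provided by tidyr

# Hypothetical data: an NA in a 'ny' row and an NA in a non-'ny' row.
df2 <- tibble(
  geo  = c("ny", "ny", "ca", "texas"),
  mark = c(NA, "honda", NA, "nissan")
)

result <- df2 %>%
  mutate(mark = case_when(
    geo == "ny" & is.na(mark) ~ "toyota",     # targeted fill for 'ny'
    TRUE ~ replace_na(mark, "some_default")   # any other NA gets the default
  ))
```

With this fallback, the 'ca' row ends up as "some_default" instead of NA, while the existing "honda" and "nissan" values are kept.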
