Counting missing values in R

Question

I need to get the count of missing values across rows. I was able to do that using the apply function as follows:

x1=c(1:5,NA,8)
x2=c(1:4,NA,NA,8)
data_cmb=data.frame(x1,x2)
data_cmb$sum_na=apply(data_cmb,1,function(x)
  sum(is.na(x)))

data_cmb
  x1 x2 sum_na
1  1  1      0
2  2  2      0
3  3  3      0
4  4  4      0
5  5 NA      1
6 NA NA      2
7  8  8      0

I am learning dplyr these days, so I was wondering whether I can do the same thing using the dplyr package in R. Is that possible?

I appreciate any comment.

Thank you

score 1 · Accepted Answer · answered Feb 02 '21 at 03:43

In dplyr you can use rowwise to count NA values by row.

library(dplyr)

data_cmb %>%
  rowwise() %>%
  mutate(sum_na = sum(is.na(c_across(everything()))))

#     x1    x2 sum_na
#  <dbl> <dbl>  <int>
#1     1     1      0
#2     2     2      0
#3     3     3      0
#4     4     4      0
#5     5    NA      1
#6    NA    NA      2
#7     8     8      0
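If data_cmb already carries a sum_na column (as it does after running the code in the question), selecting every column would include that column in the count. A minimal sketch, assuming only x1 and x2 should be checked, names the columns explicitly and drops the rowwise grouping afterwards with ungroup(); the name `result` is just for illustration:

```r
library(dplyr)

# Rebuild the example data from the question
data_cmb <- data.frame(x1 = c(1:5, NA, 8), x2 = c(1:4, NA, NA, 8))

result <- data_cmb %>%
  rowwise() %>%
  # Count NAs only in the named columns
  mutate(sum_na = sum(is.na(c_across(c(x1, x2))))) %>%
  # Remove the rowwise grouping so later verbs work on the whole table
  ungroup()

result$sum_na
```

Without ungroup(), later mutate() or summarise() calls would keep operating one row at a time.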

Another option is pmap_dbl :

data_cmb %>% mutate(sum_na = purrr::pmap_dbl(., ~sum(is.na(c(...)))))

A more efficient approach in base R is rowSums with is.na, which is vectorized and avoids calling a function once per row:

data_cmb$sum_na <- rowSums(is.na(data_cmb))

which can be written with dplyr pipes as :

data_cmb %>% mutate(sum_na = rowSums(is.na(.)))
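To see why the rowSums approach works, it can help to look at the intermediate step: is.na() on a data frame returns a logical matrix of the same shape, and rowSums() counts the TRUE values in each row. The sketch below uses the question's data; the names `na_flags`, `sum_na` and `sum_na_x1` are only for illustration:

```r
# Rebuild the example data from the question
data_cmb <- data.frame(x1 = c(1:5, NA, 8), x2 = c(1:4, NA, NA, 8))

# Logical matrix: TRUE wherever a value is missing
na_flags <- is.na(data_cmb)
na_flags[5:6, ]

# TRUE counts as 1, so the row sums are the NA counts per row
sum_na <- rowSums(na_flags)

# Restrict the count to a subset of columns; drop = FALSE keeps a
# data frame even when a single column is selected
sum_na_x1 <- rowSums(is.na(data_cmb[, "x1", drop = FALSE]))
```

The same idea extends to colSums(is.na(data_cmb)) if the count is needed per column instead of per row.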

What's the advantage of the `pmap_dbl` option? The syntax is pretty hard to follow, especially compared to something as straightforward as `rowSums` — camille, Feb 02 '21 at 06:08

score 0 · Answer 2 · answered Feb 02 '21 at 05:19

We can use apply in base R

apply(data_cmb, 1, function(x) sum(is.na(x)))
#[1] 0 0 0 0 1 2 0

akrun
