Count NAs per row in dataframe

Question

I've got a dataframe that has a batch ID and the results of six tests performed on each batch. The data looks like this:

batch_id  test1  test2  test3  test4  test5  test6
001       0.121     NA  0.340  0.877  0.417  0.662
002       0.229  0.108     NA  0.638     NA  0.574

(there are a few hundred rows in this dataframe, only one row per batch_id)

I'm looking for a way to count how many NAs there are for each batch_id (for each row). I feel like this should be do-able with a few lines of R code at the most, but I'm having trouble actually coding it. Any ideas?

@BenBolker Generally, I have the impression that answers to recent posts are often more appropriate, modern, or efficient than those in the alleged duplicates, especially if the duplicate post is several years old (not the case here). In this specific case, however, I'm not even sure that we're dealing with a duplicate, since the linked question specifically asked for a `dplyr` solution, unlike the OP of this post. – RHertel, Jun 14 '16 at 05:10
OK, although this particular question isn't that old (Feb of this year) and the *answers* (esp. @windrunn3r.1990's answer) overlap a lot. Should I/we vote to reopen? – Ben Bolker, Jun 14 '16 at 12:51
@BenBolker I did not see the question you linked to when I searched for a solution. The answer to that question by Justin is what I was looking for. Should I delete my question? – Shark7, Jun 14 '16 at 23:28
@BenBolker OK. Should I select one of the answers to the question I posted? Tim Biegeleisen posted a solution that works well, so I feel that he should get some credit. – Shark7, Jun 14 '16 at 23:33

score 82 · Answer 1 · answered Jun 14 '16 at 01:30


You can count the NAs in each row with this command:

rowSums(is.na(dat))

where dat is the name of your data frame.
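As a minimal sketch of how this behaves, the two sample rows from the question can be rebuilt by hand. Storing batch_id as a character column (to keep the leading zeros) and the variable name na_per_row are assumptions for illustration, not part of the original post. is.na() turns the data frame into a logical matrix, and rowSums() adds up the TRUE values in each row:

```r
# Rebuild the two sample rows from the question (batch_id as character is an assumption)
dat <- data.frame(
  batch_id = c("001", "002"),
  test1 = c(0.121, 0.229),
  test2 = c(NA, 0.108),
  test3 = c(0.340, NA),
  test4 = c(0.877, 0.638),
  test5 = c(0.417, NA),
  test6 = c(0.662, 0.574)
)

# is.na(dat) is a logical matrix; rowSums() counts the TRUE entries per row
na_per_row <- rowSums(is.na(dat))
print(na_per_row)  # batch 001 has 1 NA, batch 002 has 2
```

Because the whole operation is vectorized, it should scale comfortably to the few hundred rows mentioned in the question.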


Sven Hohenstein


This solution is excellent and vectorized. Thank you. – ADF Nov 16 '18 at 13:53

score 38 · Accepted Answer · answered Jun 14 '16 at 00:58


You could add a new column to your data frame containing the number of NA values per batch_id:

df$na_count <- apply(df, 1, function(x) sum(is.na(x)))
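One possible variation, sketched here under the assumption that only the columns whose names start with "test" should be checked: restricting the count to those columns keeps batch_id (and the new na_count column itself, if the line is re-run) out of the tally. The helper variable test_cols is hypothetical and the data are the two sample rows from the question:

```r
# Two sample rows from the question (batch_id as character is an assumption)
df <- data.frame(
  batch_id = c("001", "002"),
  test1 = c(0.121, 0.229),
  test2 = c(NA, 0.108),
  test3 = c(0.340, NA),
  test4 = c(0.877, 0.638),
  test5 = c(0.417, NA),
  test6 = c(0.662, 0.574)
)

# Hypothetical helper: names of the columns that hold test results
test_cols <- grep("^test", names(df), value = TRUE)

# Count NAs row by row, looking only at the test columns
df$na_count <- apply(df[test_cols], 1, function(x) sum(is.na(x)))
print(df[, c("batch_id", "na_count")])  # na_count is 1 for batch 001 and 2 for batch 002
```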


Tim Biegeleisen


Thanks. That works. I ended up using this, which is a bit simpler:
`df$na_count – Shark7 Jun 14 '16 at 23:45
