Count number of occurrences in R

Question

For a sample dataframe:

df <- structure(list(area = c("a", "b", "c", "d", "e", "f", "g", "h", "i", "j", "k"), 
                      count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)), 
                 .Names = c("area", "count"), class = c("tbl_df", "tbl", "data.frame"), 
                 row.names = c(NA, -11L), spec = structure(list(cols = structure(list(area = structure(list(), 
                 class = c("collector_character", "collector")), count = structure(list(), class = c("collector_integer",
                 "collector"))), .Names = c("area", "count")), default = structure(list(), class = c("collector_guess", 
                "collector"))), .Names = c("cols", "default"), class = "col_spec"))

... which lists the number of occurrences of something per area, I wish to produce another summary table showing how many areas have one occurrence, two occurrences, three occurrences, etc. For example, there are three areas with "one occurrence per area", three areas with "two occurrences per area", one area with "three occurrences per area", etc.

What is the best package/code to produce my desired result? I have tried with aggregate and plyr, but so far have had no success.

score 2 · Accepted Answer · answered Mar 27 '18 at 13:54

I like the data.table syntax:

library(data.table)
setDT(df) # convert the data.frame to a data.table by reference (no copy)

# .N calculates the number of observations, by instance of the count variable
df[, .(n_areas = .N), by = count]

   count n_areas
1:     1       3
2:     3       1
3:     4       2
4:     2       3
5:     5       1
6:     6       1
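The groups above appear in order of first appearance (1, 3, 4, 2, ...). If a sorted summary is preferred, one option is `keyby` instead of `by`. This is a minimal sketch that rebuilds the sample data as a plain data.table; the names `dt` and `res` are made up for illustration.

```r
library(data.table)

# Same values as the question's df, rebuilt as a plain data.table (illustrative)
dt <- data.table(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# keyby groups like `by`, then sorts the result by the grouping column
res <- dt[, .(n_areas = .N), keyby = count]
print(res)
```

The result is also keyed on count, which can help if it is joined to other tables later.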

For a comparison of the two packages most commonly used for this kind of operation, dplyr and data.table, see the question "data.table vs dplyr: can one do something well the other can't or does poorly?"

score 2 · Answer 2 · Onyambu · 2018-03-27T14:57:54.550

You can use base R functions, following @Jimbou's solution:

table(df$count)
1 2 3 4 5 6 
3 3 1 2 1 1
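If a data frame is wanted rather than a table object, the result can be converted. The sketch below also shows an aggregate() call, since the question mentions trying it. The column name n_areas and the objects `res` and `res2` are assumptions chosen for illustration.

```r
# Same values as the question's df, as a plain data.frame (illustrative)
df <- data.frame(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# Option 1: turn the table into a two-column data frame
res <- as.data.frame(table(df$count), stringsAsFactors = FALSE)
names(res) <- c("count", "n_areas")
res$count <- as.integer(res$count)  # table names come back as character
print(res)

# Option 2: aggregate() with length() counts the areas per count value;
# the counts land in a column that keeps the name `area`
res2 <- aggregate(area ~ count, data = df, FUN = length)
print(res2)
```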

edited Mar 27 '18 at 14:57

answered Mar 27 '18 at 13:56

Onyambu


score 1 · Answer 3 · answered Mar 27 '18 at 14:01

This is quite intuitive using the wonderful dplyr library.

First, we group the data by the unique values of count; then we count the number of rows (areas) in each group using n().

library(dplyr)
df %>%
    group_by(count) %>%
    summarise(number = n())

# A tibble: 6 x 2
  count number
  <int>  <int>
1     1      3
2     2      3
3     3      1
4     4      2
5     5      1
6     6      1

David


simply try `df %>% count(count)`? – Roman Mar 27 '18 at 14:53
Even better ;) What was I thinking. Your comment should be the accepted answer. – David Mar 28 '18 at 09:43
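The shortcut from the comment can be sketched as follows. count() groups and tallies in one step; by default the new column is called n, and the `name` argument (available in recent dplyr versions) renames it. The data is rebuilt here as a plain tibble, and the name `res` is illustrative.

```r
library(dplyr)

# Same values as the question's df, rebuilt as a tibble (illustrative)
df <- tibble(
  area  = letters[1:11],
  count = c(1L, 1L, 1L, 3L, 4L, 2L, 2L, 4L, 2L, 5L, 6L)
)

# Equivalent to group_by(count) %>% summarise(number = n())
res <- df %>% count(count, name = "number")
print(res)
```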
