A data set to work with:

df <- tibble::tribble(~person, ~age, ~height,  
                      "John", 1, 20,  
                      "Mike", 3, 50,  
                      "Maria", 3, 52,  
                      "Elena", 6, 90,  
                      "Biden", 9, 120)

I am trying to get a data frame that would have the following structure:

age | height(cm) | number of people  
0-5 | 0-50       |  2  
0-5 | 50-100     |  1  
0-5 | 100-200    |  0  
5-10 | 0-50       |  0  
5-10 | 50-100     |  1  
5-10 | 100-200    |  1

Basically, I have a data set with a lot of information about a number of people. I want to group them first by age, then by height within each age group, and count how many people fall into each combination.

Any tips?

You can use cut() to bin the continuous variables, then count the new categories; .drop = FALSE keeps the combinations that have no people in them.


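library(dplyr)

df %>%
  mutate(
    # right = TRUE makes intervals right-closed, so age 5 falls in "0-5"
    age_c = cut(
      age,
      breaks = c(-Inf, 5, 10),
      labels = c("0-5", "5-10"),
      right = TRUE
    ),
    height_c = cut(
      height,
      breaks = c(-Inf, 50, 100, 200),
      labels = c("0-50", "50-100", "100-200"),
      right = TRUE
    )
  ) %>%
  # .drop = FALSE keeps factor combinations with zero rows
  count(age_c, height_c, .drop = FALSE)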

# A tibble: 6 x 3
  age_c height_c     n
  <fct> <fct>    <int>
1 0-5   0-50         2
2 0-5   50-100       1
3 0-5   100-200      0
4 5-10  0-50         0
5 5-10  50-100       1
6 5-10  100-200      1
  • Very nice, I wasn't familiar with `cut`, but it seems quite useful. – CroatiaHR Nov 10 '20 at 17:05
  • A follow-up question: do you know if there is an easy adjustment to make it count the relative frequency instead of the absolute one? – CroatiaHR Nov 12 '20 at 00:31
  • This [post](https://stackoverflow.com/questions/24576515/relative-frequencies-proportions-with-dplyr) has relevant info for that. – EJJ Nov 13 '20 at 15:59
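
For the follow-up about relative frequencies, one possible adjustment is to divide each count by the total after `count()`. This is only a minimal sketch and is not taken from the linked post; the names `rel_freq` and `prop` are arbitrary choices, and the data frame is rebuilt here so the block runs on its own.

```r
library(dplyr)

df <- tibble::tribble(
  ~person, ~age, ~height,
  "John",  1, 20,
  "Mike",  3, 50,
  "Maria", 3, 52,
  "Elena", 6, 90,
  "Biden", 9, 120
)

rel_freq <- df %>%
  mutate(
    age_c = cut(age, breaks = c(-Inf, 5, 10), labels = c("0-5", "5-10")),
    height_c = cut(height, breaks = c(-Inf, 50, 100, 200),
                   labels = c("0-50", "50-100", "100-200"))
  ) %>%
  count(age_c, height_c, .drop = FALSE) %>%
  # share of all people that fall into each age/height combination
  mutate(prop = n / sum(n))

rel_freq
```

If the shares should instead add up to 1 within each age group, adding `group_by(age_c)` before the final `mutate()` should do it.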

In base R you could do:

data.frame(with(df, table(age=cut(age, c(0,5,10)), height=cut(height, c(0,50,100,200)))))

     age    height Freq
1  (0,5]    (0,50]    2
2 (5,10]    (0,50]    0
3  (0,5]  (50,100]    1
4 (5,10]  (50,100]    1
5  (0,5] (100,200]    0
6 (5,10] (100,200]    1
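
If the interval labels should match the ones in the question, `cut()` also takes a `labels` argument in the base R version. The sketch below assumes the same right-closed bins (so a height of exactly 50 lands in "0-50") and rebuilds the data with `data.frame()` so that it runs without extra packages; `tab` and `res` are just illustrative names.

```r
df <- data.frame(
  person = c("John", "Mike", "Maria", "Elena", "Biden"),
  age    = c(1, 3, 3, 6, 9),
  height = c(20, 50, 52, 90, 120)
)

# cross-tabulate the two binned variables
tab <- with(df, table(
  age    = cut(age, c(0, 5, 10), labels = c("0-5", "5-10")),
  height = cut(height, c(0, 50, 100, 200),
               labels = c("0-50", "50-100", "100-200"))
))

# one row per combination, including the empty ones; counts go in column "n"
res <- as.data.frame(tab, responseName = "n")
res
```

Note that with a finite lowest break, a value of exactly 0 would become NA unless `include.lowest = TRUE` is set.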
