I have a data set in R that looks something link this:
ClientID | Category | Date |
---|---|---|
Person1 | CategoryA | 2020-09-01 |
Person1 | CategoryA | 2020-09-30 |
Person2 | CategoryA | 2020-07-25 |
Person2 | CategoryA | 2020-08-31 |
Person1 | CategoryB | 2020-03-15 |
Person1 | CategoryB | 2020-09-14 |
Person2 | CategoryB | 2020-06-17 |
Person2 | CategoryB | 2020-10-10 |
What I would like to do is filter it to only return the row of the most recent date in each category for every client, so it would look like this:
ClientID | Category | Date |
---|---|---|
Person1 | CategoryA | 2020-09-30 |
Person1 | CategoryB | 2020-09-14 |
Person2 | CategoryA | 2020-08-31 |
Person2 | CategoryB | 2020-10-10 |
I have tried
library(plyr)
dataSet %>% filter(Category == "CategoryA",Date == max(Date))
I knew this wouldn't work the moment I typed it, but I am lost on where to go. I have considered subsetting the data on the different categories (there are only 4), but then I am still lost with filtering by max date per client (because at least then, I could rbind()
the results of each subset into a final data table). But alas, I am stuck.
Thanks in advance for the help.