I have a large dataset which has two fields that record dates (firstVisit
and secondVisit
).
I want to find out how many people visited a specific site
on any given date, regardless of whether it is their first or second visit.
For example, let's say that on the 1st February 2021 at a particular site, 100 people attended their firstVisit
and 25 people attended their secondVisit
; I want to be able to say that on 1st February 2021, 125 people visited.
At the moment I have this (using dplyr
):
Site firstVisit secondVisit
site1 100 25
site2 113 100
site3 84 90
I'm sure that I could just sum the columns up like this:
Site firstVisit secondVisit totalVisits
site1 100 25 125
site2 113 100 213
site3 84 90 174
Ideally I want it to look like this:
Date Site TotalVisits
01/01/2020 site1 125
02/01/2020 site1 213
02/02/2020 site2 174
...
27/02/2020 site1 326
27/02/2020 site3 301
28/02/2020 site4 217
Is there a better approach?
How can I do this?
Sample data:
df <- data.frame(
site = sprintf("site%s",seq(1:5)),
firstVisit = sample(seq(as.Date('2021/01/01'), as.Date('2021/02/28'), by = "day"), 100, replace = TRUE),
secondVisit = sample(seq(as.Date('2021/02/01'), as.Date('2021/03/31'), by = "day"), 100, replace = TRUE)
)