I need to parse dates and have cases like "31/02/2018":

library(lubridate)
> dmy("31/02/2018", quiet = T)
[1] NA

This makes sense, as the 31st of February does not exist. Is there a way to parse the string "31/02/2018" to, e.g., 2018-02-28, so that I get an actual date instead of an NA?

Thanks.

c0bra

1 Answer

We can write a function, assuming the only problem is a day that exceeds the number of days in its month and that the dates always have the same format.

library(lubridate)

get_correct_date <- function(example_date) {
  #Split the string on "/" and get 3 components (day, month, year)
  vecs <- as.numeric(strsplit(example_date, "/", fixed = TRUE)[[1]])

  #Check number of days in that month of that year; passing a real date
  #(rather than the bare month number) lets leap years count 29 days in February
  last_day_of_month <- days_in_month(make_date(vecs[3], vecs[2], 1))

  #If the input day is higher than the actual number of days in that month
  #replace it with the last day of that month
  if (vecs[1] > last_day_of_month)
    vecs[1] <- last_day_of_month

  #Paste the date components together to get the new modified date
  dmy(paste0(vecs, collapse = "/"))
}


get_correct_date("31/02/2018")
#[1] "2018-02-28"

get_correct_date("31/04/2018")
#[1] "2018-04-30"

get_correct_date("31/05/2018")
#[1] "2018-05-31"

With a small modification, the function can also handle dates in a different format, or days that are smaller than the first day of the month.
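One such modification might be a vectorized variant that also returns NA for values that cannot be repaired, such as a faulty month. The sketch below is not part of the original answer: `clamp_dmy` is a hypothetical name, and it assumes every input has exactly three "/"-separated numeric parts in day/month/year order. Instead of comparing against `days_in_month()`, it relies on lubridate's `%m+%` operator, which rolls back to the last day of the month when the target day does not exist.

```r
library(lubridate)

# Hypothetical helper: clamp the day of each "dd/mm/yyyy" string to the
# last valid day of its month. Assumes three numeric parts per string.
clamp_dmy <- function(x) {
  # Split every string into day, month and year columns
  parts <- do.call(rbind, strsplit(x, "/", fixed = TRUE))
  day   <- as.integer(parts[, 1])
  month <- as.integer(parts[, 2])
  year  <- as.integer(parts[, 3])

  # Only days 1-31 and months 1-12 can be repaired; everything else stays NA
  ok <- day %in% 1:31 & month %in% 1:12 & !is.na(year)

  out <- rep(as.Date(NA), length(x))
  if (any(ok)) {
    # Every day from 1 to 31 exists in January, so start there and let
    # %m+% roll back to the last day when the target month is shorter
    out[ok] <- make_date(year[ok], 1L, day[ok]) %m+% months(month[ok] - 1L)
  }
  out
}

clamp_dmy(c("31/02/2018", "29/02/2020", "31/04/2018", "15/13/2018"))
#[1] "2018-02-28" "2020-02-29" "2018-04-30" NA
```

Starting from January avoids a separate lookup of the month length, and leap years are handled by the date arithmetic itself.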

Ronak Shah
  • Thanks for the solution. I also checked last_day_of_month for NA as some dates even had a faulty month and could not be recovered. – c0bra Nov 12 '18 at 11:21
  • @c0bra I think the month would need separate treatment. This works only for the day part. You can post a new question for that. – Ronak Shah Nov 12 '18 at 11:45