4

I need to convert a string of dates that is in multiple formats to valid dates.

e.g.

dates <- c("01-01-2017","02-01-2017","12-01-2016","20160901","20161001", "20161101")

> as.Date(dates, format=c("%m-%d-%Y","%Y%m%d"))
[1] "2017-01-01" NA           "2016-12-01" "2016-09-01" NA           "2016-11-01"

two dates show as NA

user3357059
  • 976
  • 11
  • 25

2 Answers2

10

This is pretty much I wrote the anytime package for:

R> dates <- c("01-01-2017","02-01-2017","12-01-2016","20160901","20161001", 
+             "20161101")
R> library(anytime)
R> anydate(dates)
[1] "2017-01-01" "2017-02-01" "2016-12-01" "2016-09-01" 
[5] "2016-10-01" "2016-11-01"
R> 

Parse any sane input reliably and without explicit format or origin or other line noise.

That being said, not starting ISO style with the year is asking for potential trouble, so 02-03-2017 could be February 3 or March 2. I am following the North American convention I too consider somewhat broken -- but is so darn prevalent. Do yourself a favour and try to limit inputs to ISO dates, at least ISO order YYYYMMDD.

Dirk Eddelbuettel
  • 331,520
  • 51
  • 596
  • 675
  • To clarify, `anydate()` gives priority to mdy over dmy? – thelatemail Apr 13 '17 at 04:35
  • Yes. [See the formats](https://github.com/eddelbuettel/anytime/blob/master/src/anytime.cpp#L43-L106) which you can also list from within. – Dirk Eddelbuettel Apr 13 '17 at 11:14
  • I found this to be a little faster `as.Date(ifelse(grepl("-", dates), as.Date(dates, format = c("%m-%d-%Y")), as.Date(dates, format = c("%Y%m%d"))), origin = "1970-01-01")` – user3357059 Apr 19 '17 at 16:01
  • Is there any way to simply change the origin? I find this function really useful but have a need to change the origin when dealing with integer to "1899-12-30" because of Excel. – Croote May 06 '20 at 06:28
  • Hi @Croote I _guess_ one could. We use it as a compile-time constant and we could, I suppose, change it. With a whole new layer of testing so there is 'cost'. It may be simpler to write a local `fromExcel()` helper. It you want to follow up open an issue ticket with some sample data/ – Dirk Eddelbuettel May 06 '20 at 11:53
  • Hi , Thanks, I think I might just do that. In my company we are stuck pulling client data from excel all the time. – Croote May 07 '20 at 02:58
0

I have tried library(anytime), however for big data did not work. Then, I found useful this sequence:

df$Date2 <- format(as.Date(df$Date, format="%m/%d/%Y"), "%d/%m/%y")

df$Date2 <- as.Date(df$Date2,"%d/%m/%y")

It worked for me to "8/10/2005" as well as "08/13/05" in the same column.

Paul Roub
  • 35,100
  • 27
  • 72
  • 83