05:00:00 - 28:59:59 time format

Question

I have dataset where time.start vary from 5:00:00 to 28:59:59 (i.e. 01.01.2013 28:00:00 is actually 02.01.2013 04:00:00). Dates are in %d.%m.%Y format.

      Date Time.start   
01.01.2013   22:13:07
01.01.2013   22:52:23
01.01.2013   23:34:06
01.01.2013   23:44:25
01.01.2013   27:18:48
01.01.2013   28:41:04

I want to convert it to normal date format.

dates$date <- paste(dates$Date,dates$Time.start, sep = " ")
dates$date <- as.POSIXct(strptime(dates$date, "%m.%d.%Y %H:%M:%S"))

But obviously I have NA for time > 23:59:59

How should I modify my code?

Reminds me of the time ranges for Japanese TV show scheduling, which get silly when they use non-24 hour time (e.g. "late night 3:00" to mean 3am the next day). — JAB, Jun 19 '14 at 15:44
Indeed, you see bars advertising that they are open until "25:00" for 1AM etc. — , Jun 19 '14 at 15:51
Normalize by offsetting -5 hours in the database and add +5 when you actually use the data. :) `[With a -5, you be in 00:00:00 - 23:59:59 range.]` — CodeAngry, Jun 19 '14 at 17:06
@Panzercrisis: If a business were closed daily from 1:00am-7:00am, and had a work week that went from 00:00:00.01 Monday through 23:59:59.99 Sunday, a work shift that started at 5:00pm Sunday and ended at 1:00am on Monday would have seven hours attributable to one paycheck and one attributed to the next. Recording the shift as running from 17:00 through 25:00 on Sunday avoids that problem. — supercat, Jun 19 '14 at 18:43

score 9 · Accepted Answer · answered Jun 19 '14 at 12:04

9

E.g. add the time as seconds to the date:

df <- read.table(header=T, text="      Date Time.start   
01.01.2013   22:13:07
01.01.2013   22:52:23
01.01.2013   23:34:06
01.01.2013   23:44:25
01.01.2013   27:18:48
01.01.2013   28:41:04", stringsAsFactors=FALSE)

as.POSIXct(df$Date, format="%d.%m.%Y") +
  sapply(strsplit(df$Time.start, ":"), function(t) {
    t <- as.integer(t)
    t[3] + t[2] * 60 + t[1] * 60 * 60
  })

# [1] "2013-01-01 22:13:07 CET" "2013-01-01 22:52:23 CET" "2013-01-01 23:34:06 CET"
# [4] "2013-01-01 23:44:25 CET" "2013-01-02 03:18:48 CET" "2013-01-02 04:41:04 CET"

answered Jun 19 '14 at 12:04

lukeA

48,497
5
73
84

How about using `scan` to directly read in the ints from dates. No strsplit and no coercion. I really like this idea of yours. – asb Jun 19 '14 at 12:20
Your solution is faster than @akrun. I have data frame with 180 000 rows and it takes 2.5 secs to create your vector vs 31 secs using `with` command – BiXiC Jun 19 '14 at 13:25
Now I have problem with function for this script. =( http://stackoverflow.com/questions/24308718/050000-285959-time-format-function – BiXiC Jun 19 '14 at 14:12

score 8 · Answer 2 · answered Jun 19 '14 at 12:24

Just a modification of lukeAs solution:

with(df, as.POSIXct(Date, format="%d.%m.%Y")+
 colSums(t(read.table(text=Time.start, sep=":",header=F))*c(3600,60,1)))
[1] "2013-01-01 22:13:07 EST" "2013-01-01 22:52:23 EST" 
[3] "2013-01-01 23:34:06 EST" "2013-01-01 23:44:25 EST"
[5] "2013-01-02 03:18:48 EST" "2013-01-02 04:41:04 EST"

score 2 · Answer 3 · answered Jun 19 '14 at 15:24

2

Using lubridate:

with(dates, mdy(Date) + hms(Time.start))

Generates:

[1] "2013-01-01 22:13:07 UTC" "2013-01-01 22:52:23 UTC"
[3] "2013-01-01 23:34:06 UTC" "2013-01-01 23:44:25 UTC"
[5] "2013-01-02 03:18:48 UTC" "2013-01-02 04:41:04 UTC"

answered Jun 19 '14 at 15:24

rrs

8,457
4
24
34

05:00:00 - 28:59:59 time format

3 Answers3

Linked