0

I have an excel dataset in which there are dates and time points as follows:

record_id   date_E1        time_E1    date_E2     time_E2     ...
   1       2019/8/24      09:00:00    2019/8/25   18:00:00 

I would like to construct a variable which contains the number of hours past the first time and date, (09:00 a.m 2019/8/24). When I read the excel file with read_excel("C:/visit.xlsx") the time_E1 .. appears as 0.3750000 0.7736111 0.4131944 0.4131944, and the date appears as 43640 43640 43641 43642, in R. I use visit_dates<-as.Date(as.numeric(visit_date_L$Day), origin = "1899-12-30") to convert dates to 2019-8-24 and .. but do not know how to convert time of the day and convert to the hours past the first time point. What I expect is a vector like: 0, 42, ... hours past first time point.

I have used the following code:

as.POSIXct(visit_times, format = " %H-%M", origin = "09:00:00"), 

but it returns a NULL vector. After that I could use the following code to transpose and combine date and time data:

visit_time <- subset(MY_visit, select = c(record_id,  time_E1, ...)
visit_date <- subset(MY_visit, select = c(record_id,  date_E1,...)

visit_time_L <- melt(visit_time, id.vars=c("record_id"))
visit_date_L <- melt(visit_date, id.vars=c("record_id"))

names(visit_time_L)[names(visit_time_L)=="value"] <- "time"
names(visit_date_L)[names(visit_date_L)=="value"] <- "Day"



visit_all <- cbind(visit_time_L, visit_date_L)

Any ideas how can I solve this problem?

Shaheryar.Akram
  • 714
  • 11
  • 18
  • Related: https://stackoverflow.com/questions/43230470/how-to-convert-excel-date-format-to-proper-date-with-lubridate – MrFlick Aug 23 '19 at 15:10
  • 1
    Possible duplicate: https://stackoverflow.com/questions/54474404/how-to-detect-time-when-reading-from-an-excel-sheet-using-r – MrFlick Aug 23 '19 at 15:11
  • As I need to clean my data after reading in R these links do not seem to solve my problem, the date and time are not in separate columns to be handled. – Ecatrina Smith Aug 23 '19 at 15:55

1 Answers1

0

Here is an approach that you can try. I have dates/times stored in an Excel file. Read it in and keep the columns as characters. Convert the dates to their proper format, as you did. Convert the fractions of the time of day to numeric and multiply by 24. Paste the dates/times together and convert to date format, then find the difference between the two in hours (the result will be in days, so multiply by 24).

library(dplyr);library(readxl); library(lubridate)

df <- read_excel('Book1.xlsx',col_types = c('text'))

# A tibble: 1 x 4
  date1 time1 date2 time2              
  <chr> <chr> <chr> <chr>              
1 43466 0.375 43467 0.41666666666666669

df %>% mutate_at(c('date1','date2'), ~ as.Date(as.numeric(.),origin='1899-12-30')) %>% 
  mutate_at(c('time1','time2'), ~ as.numeric(.)*24) %>% 
  mutate(t1=ymd_h(paste(date1,time1)),
         t2=ymd_h(paste(date2,time2)),
         diff=as.numeric(t2-t1)*24)

# A tibble: 1 x 7
  date1      time1 date2      time2 t1                  t2                   diff
  <date>     <dbl> <date>     <dbl> <dttm>              <dttm>              <dbl>
1 2019-01-01     9 2019-01-02    10 2019-01-01 09:00:00 2019-01-02 10:00:00    25
kstew
  • 1,048
  • 5
  • 21