1

I have 2-time series data with different time stamps and a different number of data points.

First data frame is:

      Time         Power_kW
10/9/2017 1:14:12   0.185
10/9/2017 1:14:53   0.182
10/9/2017 1:15:13   0.184
10/9/2017 1:15:53   0.175
. . 
. . 
10/9/2017 1:44:37   0.175
. . 
. . 
10/9/2017 2:13:38   0.181
. . 
. . 
10/9/2017 2:24:40   0.179

Second dataframe is:

    Local Time    Value
10/9/2017 1:13:01   0
10/9/2017 1:42:10   1
10/9/2017 2:11:58   0
10/9/2017 2:23:30   1

The second data frame values indicate the value in between 2 dates mentioned. It means the value of '0' should be propagated from 10/9/2017 1:13:01 0 to 10/9/2017 1:42:10. and the value 1 from the date of 10/9/2017 1:42:10 up to 10/9/2017 2:23:30 and so on.

I wish to have the same number of data points as the first data frame by merging these two data frames columns values. I have many similar data frames as the Second dataframe which all will help me making correlations with the respect to the time.

I am trying to get the following result after merging:

   Time            Power_kW Value
10/9/2017 1:14:12   0.185   0
10/9/2017 1:14:53   0.182   0
10/9/2017 1:15:13   0.184   0
10/9/2017 1:15:53   0.175   0
. .     
. .     
10/9/2017 1:44:37   0.175   1
10/9/2017 1:45:47   0.176   1
. .     
10/9/2017 2:13:38   0.181   0
. .     
. .     
10/9/2017 2:24:40   0.179   1

(I have added some points in the result to represent how the whole data should look like.)

I did refer to these threads and their purpose are not propagating through.

R: merge two irregular time series

merge two time series with different time granularities

Can anyone please throw me a light?

Mari
  • 145
  • 2
  • 13
  • Looks like a non-equi join to me. Search for that term and you should get a number of hits. If you need more help, please include a reproducible example in your question (use `dput`). This is especially important with questions involving time variables as it helps us better understand your data structure and makes it easier to copy and paste the data rather than try to build it from scratch. – lmo Oct 10 '17 at 12:17

2 Answers2

3

Try a rolling join using R's data.table package, which is explained nicely in this blog:

library(data.table)
df1 <- fread("Time,Power_kW
10/9/2017 1:14:12,0.185
10/9/2017 1:14:53,0.182
10/9/2017 1:15:13,0.184
10/9/2017 1:15:53,0.175
10/9/2017 1:44:37,0.175
10/9/2017 1:45:47,0.176 
10/9/2017 2:13:38,0.181
10/9/2017 2:24:40,0.179")
df2 <- fread("LocalTime,Value
10/9/2017 1:13:01,0
10/9/2017 1:42:10,1
10/9/2017 2:11:58,0
10/9/2017 2:23:30,1")
df1$Time <- as.POSIXct(df1$Time, format="%m/%d/%Y %T")
df2$LocalTime <- as.POSIXct(df2$LocalTime, format="%m/%d/%Y %T")
setkey(df1, Time)
setkey(df2, LocalTime)
df2[df1, roll=Inf]
# LocalTime Value Power_kW
# 1: 2017-10-09 01:14:12     0    0.185
# 2: 2017-10-09 01:14:53     0    0.182
# 3: 2017-10-09 01:15:13     0    0.184
# 4: 2017-10-09 01:15:53     0    0.175
# 5: 2017-10-09 01:44:37     1    0.175
# 6: 2017-10-09 01:45:47     1    0.176
# 7: 2017-10-09 02:13:38     0    0.181
# 8: 2017-10-09 02:24:40     1    0.179
lukeA
  • 48,497
  • 5
  • 73
  • 84
2

Assume the inputs given in the Note at the end. Convert the data frames to zoo and run na.locf to fill in the NA values. Finally select the elements corresponding to the first table and optionally convert back to data frame:

library(zoo)

# convert to zoo
Power_kW <- read.zoo(DF1, tz = "")
Value <- read.zoo(DF2, tz = "")

z <- na.locf(merge(Power_kW, Value))[time(Power_kW)]
fortify.zoo(z) # optional

giving:

                Index Power_kW Value
1 2017-10-09 01:14:12    0.185     0
2 2017-10-09 01:14:53    0.182     0
3 2017-10-09 01:15:13    0.184     0
4 2017-10-09 01:15:53    0.175     0
5 2017-10-09 01:44:37    0.175     1
6 2017-10-09 01:45:47    0.176     1
7 2017-10-09 02:13:38    0.181     0
8 2017-10-09 02:24:40    0.179     1

Note: We assume the following inputs. Note that we assume POSIXct date times.

Lines1 <- "
Time,Power_kW
10/9/2017 1:14:12,0.185
10/9/2017 1:14:53,0.182
10/9/2017 1:15:13,0.184
10/9/2017 1:15:53,0.175
10/9/2017 1:44:37,0.175
10/9/2017 1:45:47,0.176 
10/9/2017 2:13:38,0.181
10/9/2017 2:24:40,0.179"

Lines2 <- "
LocalTime,Value
10/9/2017 1:13:01,0
10/9/2017 1:42:10,1
10/9/2017 2:11:58,0
10/9/2017 2:23:30,1"

fmt <- "%m/%d/%Y %H:%M:%S"
DF1 <- read.csv(text = Lines1)   
DF2 <- read.csv(text = Lines2)

# convert date/time to POSIXct
fmt <- "%m/%d/%Y %H:%M:%S"
DF1$Time <- as.POSIXct(DF1$Time, format = fmt)
DF2$LocalTime <- as.POSIXct(DF2$LocalTime, format = fmt)
G. Grothendieck
  • 211,268
  • 15
  • 177
  • 297