0

I have a problem that may be basic, but where I´m really struggling to find a solution. I´m making a report that should be updated with new dates and values every month. I need to be able to dynamically choose columns to do calculations on, but can´t seem to get this done with vectors. Can somebody please explain where my thinking is wrong?

I have a dataframe like this:

library(lubridate)
library(dplyr)

Dates <- as.Date(c("2018-10-01", "2018-09-01", "2018-08-01", "2018-07-01"))
Value1 <- c(456, 845, 687, 456)
Value2 <- c(895, 547, 123, 632)
df <- data.frame(Date = as.Date(Dates), Value1, Value2)

I also specify the start date of the two last months like this:

today <- Sys.Date()
fst_in_mth <- floor_date(today, "month")
fst_last_mth <- floor_date(today %m+% months(-1), "month")

I then use these vectors to filter the dates (so far, this works):

df_filtered <- df%>%
  filter(Date %in% c(fst_in_mth, fst_last_mth)) 

That gives me this data frame:

Date        Value1 Value2
2018-10-01   456    895
2018-09-01   845    547

This is where I run into trouble. I want to calculate the difference between these two months, and add that to the table. This is what I want to achieve, where change is stored in the new column:

Date    2018-09-01 2018-10-01 Change
Value1     845        456      -389
Value2     547        895       348

I transpose to get a table with dates as columns (I´m used to doing this by column, but maybe I should do it with rows instead?), but now I can no longer use the vectors to identify the correct columns:

df_ts <- t(df_filtered)
df_result <- df_ts%>%
  mutate(Change = (fst_in_mth - fst_last_mth))

Error in UseMethod("mutate_") : 
no applicable method for 'mutate_' applied to an object of 
class "c('matrix', 'character')" 

I also get an error if I try using base R:

df_ts$Change <- (df_ts$fst_in_mth - df_ts$fst_last_mth)

Error in df_ts$fst_in_mth : $ operator is invalid for atomic vectors

Any idea how I can solve this?

1 Answers1

2

One way to get the result you want is by gathering the data, calculating the difference by group then using spread to reshape it back to wide format:

df_filtered %>%
  gather(key, value, -Date) %>%
  group_by(key) %>%
  mutate(Change = diff(-value)) %>%
  spread(key = Date, value = value)

# A tibble: 2 x 4
# Groups:   key [2]
  key    Change `2018-09-01` `2018-10-01`
  <chr>   <dbl>        <dbl>        <dbl>
1 Value1   -389          845          456
2 Value2    348          547          895
27 ϕ 9
  • 17,064
  • 3
  • 26
  • 36
  • 1
    maybe set `diff(value)` to `diff(-value)` to get exactly the expected output ;) – piptoma Oct 31 '18 at 13:23
  • Thanks, group by is definitely the right way to do it, but how do I do it if I want to get the difference for multiple dates? In the example I only included the difference between current month vs. last month, but I also want to know the difference between this month and first day of the year. So there are two change columns, "change since last month" and "change this year". – Thomas Tallaksen Nov 01 '18 at 11:07
  • Perhaps post a new question with a reproducible example of your data and the expected output. I'm not entirely sure how the data is originally structured from your comment. – 27 ϕ 9 Nov 01 '18 at 13:33