4

I am using dplyr's mutate function to create a POSIX date column of a data frame by taking the lead of another column. When I try to fill in the missing values in the lead function using a single date, I get an error:

> dates
# A tibble: 5 x 1
  orig_date          
  <dttm>             
1 2016-06-21 20:00:00
2 2016-07-09 22:00:00
3 2016-07-10 22:00:00
4 2016-07-20 21:00:00
5 2016-07-21 21:00:00

> fillin_date
[1] "2018-08-29 UTC"

> dates %>% mutate(next_date = lead(orig_date, 1, default = fillin_date))

Error in mutate_impl(.data, dots) : 
  Not compatible with requested type: [type=symbol; target=double].

This does not happen outside of mutate:

> lead(dates$orig_date, 1, default = fillin_date)
[1] "2016-07-09 22:00:00 UTC" "2016-07-10 22:00:00 UTC" "2016-07-20 21:00:00 UTC"
[4] "2016-07-21 21:00:00 UTC" "2018-08-29 00:00:00 UTC"

What is going wrong here?

  • 1
    It looks like the `default=` argument has to be a value and can't be a reference to an object (symbol) when used in a dplyr pipeline. If you do `dates %>% mutate(next_date = lead(orig_date, 1, default = as.POSIXct("2018-08-29")))` it seems to work. This strikes me as a bug, but I'll see if someone can identify a better solution. Even doing `dates %>% mutate(next_date = lead(orig_date, 1, default = (fillin_date)))` makes it work because the call gets evaluated. (or something like that). – thelatemail Oct 03 '18 at 22:21
  • 1
    Submitted an [issue](https://github.com/tidyverse/dplyr/issues/3878) – Calum You Oct 03 '18 at 22:39

1 Answers1

1

I am not sure as to the underlying reason why you can supply the symbol outside of mutate but not inside, but you can get around it by quoting and unquoting the variable. You can also save your date to fill in as character and just convert to date inside the mutate call.

library(tidyverse)
df <- tibble(orig_date = c("2016-06-21 20:00:00", "2016-07-09 22:00:00", "2016-07-10 22:00:00", "2016-07-20 21:00:00", "2016-07-21 21:00:00")) %>%
  mutate(orig_date = as.POSIXct(orig_date))

fillin_date <- as.POSIXct("2018-08-29")
fillin_date2 <- "2018-08-29"
df %>%
  mutate(next_date = lead(orig_date, 1, default = !!quo(fillin_date)))
#> # A tibble: 5 x 2
#>   orig_date           next_date          
#>   <dttm>              <dttm>             
#> 1 2016-06-21 20:00:00 2016-07-09 22:00:00
#> 2 2016-07-09 22:00:00 2016-07-10 22:00:00
#> 3 2016-07-10 22:00:00 2016-07-20 21:00:00
#> 4 2016-07-20 21:00:00 2016-07-21 21:00:00
#> 5 2016-07-21 21:00:00 2018-08-29 00:00:00
df %>%
  mutate(next_date = lead(orig_date, 1, default = as.POSIXct(fillin_date2)))
#> # A tibble: 5 x 2
#>   orig_date           next_date          
#>   <dttm>              <dttm>             
#> 1 2016-06-21 20:00:00 2016-07-09 22:00:00
#> 2 2016-07-09 22:00:00 2016-07-10 22:00:00
#> 3 2016-07-10 22:00:00 2016-07-20 21:00:00
#> 4 2016-07-20 21:00:00 2016-07-21 21:00:00
#> 5 2016-07-21 21:00:00 2018-08-29 00:00:00

Created on 2018-10-03 by the reprex package (v0.2.0).

Calum You
  • 12,622
  • 2
  • 17
  • 35