2

I'm working on a Shiny app, where one of the options is aggregating data by Year-Month-Day of the week.

library(ggplot2)
library(dplyr)

pnd <- data.frame(c(rep('MEXICALI',900),rep('SALTILLO',900) ),sample(200:1600, 1800, T),sample(200:1600, 1800, T),rep(seq.POSIXt(from = as.POSIXct(Sys.Date()-90), length.out = 900, by = "1 hour"),2))

colnames(pnd) <- c('zona_carga', 'PrecioMDA', 'PrecioMTR', 'ID')

pnd <- pnd %>% select(ID, zona_carga,PrecioMDA, PrecioMTR)  %>%
  mutate(ID = format(ID, '%Y-%m %a')  ) %>% group_by( ID, zona_carga) %>% summarise(PrecioMDA = mean(PrecioMDA), PrecioMTR = mean(PrecioMTR)) 

colors <- c('MEXICALI - PrecioMDA' = 'steelblue', 'SALTILLO - PrecioMTR' = 'magenta')

ggplot(pnd, aes(x = ID) ) + 
  geom_line(data = filter(pnd, zona_carga == 'MEXICALI'), aes(y = as.numeric(PrecioMDA),group='PrecioMDA', color = paste('MEXICALI','-','PrecioMDA')) ) + 
  geom_line(data = filter(pnd, zona_carga == 'SALTILLO'), aes(y = as.numeric(PrecioMTR),group='PrecioMTR', color = paste('SALTILLO','-','PrecioMTR') ))  +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + scale_color_manual(values = colors) + 
  scale_x_discrete( )

The problem is that, as the date interval increases, the x-axis labels start to overlap. Is there any way to specify dynamic breaks on my x axis? Something similar to scale_x_date(breaks = '1 day').

Garcher

2 Answers

3

You could wrap the x-axis labels so that they take up less width. In the code below I've also streamlined the use of geom_line and the colour mapping, and I've set the order of the data to follow the order of the dates. I'm not sure if that's what you wanted, but the ordering in your example didn't seem correct.

library(ggplot2)
library(dplyr)

set.seed(958)
# Same simulated data as in the question, with the columns named at creation
pnd <- data.frame(
  zona_carga = c(rep('MEXICALI', 900), rep('SALTILLO', 900)),
  PrecioMDA = sample(200:1600, 1800, TRUE),
  PrecioMTR = sample(200:1600, 1800, TRUE),
  ID = rep(seq.POSIXt(from = as.POSIXct(Sys.Date() - 90), length.out = 900, by = "1 hour"), 2)
)

pnd <- pnd %>% 
  select(ID, zona_carga,PrecioMDA, PrecioMTR)  %>%
  arrange(ID) %>% 
  mutate(ID = format(ID, '%Y-%m %a'),
         ID = factor(ID, levels=unique(ID))) %>% 
  group_by( ID, zona_carga) %>% 
  summarise(PrecioMDA = mean(PrecioMDA), PrecioMTR = mean(PrecioMTR)) 

colors <- c('MEXICALI - PrecioMDA' = 'steelblue', 'SALTILLO - PrecioMDA' = 'magenta')

ggplot(pnd, aes(x = ID, y=PrecioMDA, 
                group=paste0(zona_carga, " - PrecioMDA"),
                colour=paste0(zona_carga, " - PrecioMDA"))) + 
  geom_line() +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + 
  scale_color_manual(values = colors) + 
  scale_x_discrete(labels=function(x) gsub(" ", "\n", x)) +
  theme_classic() +
  theme(legend.position="bottom")


If you want to alternate x-axis labels (as suggested by @RonakShah) another option is to keep all the tick marks but remove the text from every other one:

ggplot(pnd, aes(x = ID, y=PrecioMDA, 
                group=paste0(zona_carga, " - PrecioMDA"),
                colour=paste0(zona_carga, " - PrecioMDA"))) + 
  geom_line() +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + 
  scale_color_manual(values = colors) +  
  scale_x_discrete(labels=function(x) c(rbind(x[seq(1,length(x), 2)], rep(" ", ceiling(length(x)/2))))[1:length(x)]) +
  theme_classic() +
  theme(legend.position="bottom")


But does the "2020-04" etc. really need to be repeated? Another option might be to show "2020-04" only on the first day of that month, and then list just the days of the week after that (and similarly for each new month).
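
As a rough sketch of that last idea (not part of the code above): the helper `month_once` below is a hypothetical name, and it assumes the labels have the form "YYYY-MM Day" produced by format(ID, '%Y-%m %a') and are already in date order. It keeps the month only on the first label of each month.

```r
# Hypothetical helper: show "YYYY-MM" only on the first label of each month,
# and just the weekday on the remaining labels.
# Assumes labels look like "2020-04 Wed" and are sorted by date.
month_once <- function(x) {
  x <- as.character(x)
  month   <- sub(" .*$", "", x)     # text before the first space, e.g. "2020-04"
  weekday <- sub("^[^ ]* ", "", x)  # text after the first space, e.g. "Wed"
  first_of_month <- !duplicated(month)
  ifelse(first_of_month, paste(month, weekday, sep = "\n"), weekday)
}

example_labels <- c("2020-04 Wed", "2020-04 Thu", "2020-05 Fri", "2020-05 Sat")
month_once(example_labels)
```

It could then be passed to the plot as scale_x_discrete(labels = month_once), in place of the labelling function used above.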

eipi10
  • I didn't notice that in my example both variables came from column PrecioMDA. I edited the question, since the values come from both 'PrecioMDA' and 'PrecioMTR'. That's why I use filter inside geom_line and don't declare aes(y = PrecioMDA) in ggplot. Is there any way to use your scale function when x is not declared in ggplot? Thanks – Garcher Jul 29 '20 at 00:52
  • Then I'd suggest reshaping your data to "long" format so that you'll need only one call to geom_line and you can put all the aesthetics in the main ggplot call. I'll try to update my answer in the next day or two, but there are many other examples of this approach on SO. – eipi10 Jul 29 '20 at 02:16
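
The reshaping suggested in the comment above might look roughly like the following. This is only a sketch: `pnd_demo` is a small made-up stand-in for the summarised `pnd`, the column names `precio`, `valor` and `serie` are invented for illustration, and tidyr is an extra dependency that the question does not load.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Made-up stand-in for the summarised `pnd` (same column names)
pnd_demo <- data.frame(
  ID = factor(rep(c("2020-07 Mon", "2020-07 Tue"), each = 2)),
  zona_carga = rep(c("MEXICALI", "SALTILLO"), 2),
  PrecioMDA = c(900, 850, 910, 870),
  PrecioMTR = c(950, 800, 940, 820)
)

# One row per ID / zona_carga / price column, then keep the two series of interest
pnd_long <- pnd_demo %>%
  pivot_longer(c(PrecioMDA, PrecioMTR), names_to = "precio", values_to = "valor") %>%
  mutate(serie = paste(zona_carga, "-", precio)) %>%
  filter(serie %in% c("MEXICALI - PrecioMDA", "SALTILLO - PrecioMTR"))

colors <- c('MEXICALI - PrecioMDA' = 'steelblue', 'SALTILLO - PrecioMTR' = 'magenta')

# All aesthetics now live in the main ggplot() call, with a single geom_line()
p <- ggplot(pnd_long, aes(x = ID, y = valor, group = serie, colour = serie)) +
  geom_line() +
  labs(y = '$MXN/MWh', x = 'Fecha', colour = 'legend') +
  scale_colour_manual(values = colors)
```

With the data in this shape, any scale_x_discrete() labelling function can be added to `p` as before.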
3

Since the labels are repeated, how about showing only alternate labels?

library(ggplot2)
library(dplyr)  # for filter()

ggplot(pnd, aes(x = ID)) + 
  geom_line(data = filter(pnd, zona_carga == 'MEXICALI'), 
            aes(y = as.numeric(PrecioMDA), group = 'PrecioMDA', color = paste('MEXICALI','-','PrecioMDA'))) + 
  geom_line(data = filter(pnd, zona_carga == 'SALTILLO'), 
            aes(y = as.numeric(PrecioMTR), group = 'PrecioMTR', color = paste('SALTILLO','-','PrecioMTR'))) +
  labs(y='$MXN/MWh',x='Fecha',color = 'legend') + scale_color_manual(values = colors) + 
  # set every other break to '' so that only alternate labels are drawn
  scale_x_discrete(breaks = function(x) {x[c(TRUE, FALSE)] <- ''; x})
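
The same idea can be generalised so that the number of labels adapts to the length of the axis. The sketch below is an assumption-laden illustration rather than part of the answer: `thin_breaks` and `max_labels` are hypothetical names, and it relies on scale_x_discrete() accepting a function of the scale limits for its breaks argument.

```r
# Hypothetical helper: returns a breaks function that keeps every k-th level,
# with k chosen so that at most about `max_labels` breaks remain.
thin_breaks <- function(max_labels = 10) {
  function(x) {
    if (length(x) == 0) return(x)
    step <- max(1, ceiling(length(x) / max_labels))
    x[seq(1, length(x), by = step)]
  }
}

levels_demo <- sprintf("lvl%02d", 1:25)
thin_breaks(10)(levels_demo)  # keeps levels 1, 4, 7, ...
```

In the plot this would be used as scale_x_discrete(breaks = thin_breaks(12)), so the spacing between labels grows as the date interval grows.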


If the dates still don't fit in the plot, you can consider rotating the labels, as described in the question "Rotating and spacing axis labels in ggplot2".
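
A minimal sketch of the rotation, using a small made-up data frame (`df_demo` is invented here for illustration) instead of `pnd`:

```r
library(ggplot2)

# Made-up data, only to have something to plot
df_demo <- data.frame(
  ID = factor(c("2020-07 Mon", "2020-07 Tue", "2020-07 Wed")),
  PrecioMDA = c(900, 910, 880)
)

p <- ggplot(df_demo, aes(x = ID, y = PrecioMDA, group = 1)) +
  geom_line() +
  # rotate the x-axis text by 90 degrees and align it with the tick marks
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5, hjust = 1))
```

The same theme() call can be appended to the plot above. Recent ggplot2 versions also offer guide_axis(), which can dodge labels onto several rows, if rotation alone is not enough.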

Ronak Shah