4

I'm plotting data marked up using haven semantics, i.e. variables and values have labels defined via attributes.

Often, these labels are also what I want in my axis titles and ticks.

library(ggplot2)
# Attach value labels and a variable label, haven-style
mtcars$mpg <- haven::labelled(mtcars$mpg, labels = c("low" = 10, "high" = 30))
attributes(mtcars$mpg)$label <- "miles per gallon"
ggplot(mtcars, aes(mpg, cyl)) + geom_point() +
  scale_x_continuous(attributes(mtcars$mpg)$label,
                     breaks = attributes(mtcars$mpg)$labels,
                     labels = names(attributes(mtcars$mpg)$labels))

Could I write a helper that replaces that laborious scale_x_continuous statement with something that can more easily be iterated? E.g. something like scale_x_continuous(label_from_attr, breaks = breaks_from_attr, labels = value_labels_from_attr). Or maybe even + add_labels_from_attributes() to replace the whole thing?

I'm aware that I can write/use helpers like Hmisc::label to slightly shorten the attribute-code above, but that's not what I want here.

Ruben
  • take a look at the `Hmisc` package which has quite a few "plotting with labels" features for ggplot. I'm not sure about the haven compatibility, but it uses the same labels as attribute specification, I think. – Matt L. Jan 25 '18 at 15:53

2 Answers

3

I don't have a good scale, but you can use a function like this:

label_x <- function(p) {
  b <- ggplot_build(p)
  x <- b$plot$data[[b$plot$labels$x]]
  
  p + scale_x_continuous(
    attributes(x)$label, 
    breaks = attributes(x)$labels, 
    labels = names(attributes(x)$labels)
  )
}

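Then use it as follows (adding it with `+` won't work, because a plain function on the right-hand side has no access to the plot built so far):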

p <- ggplot(mtcars, aes(mpg, cyl)) + geom_point()
label_x(p)

Alternatively, use a pipe (`%>%` comes from magrittr, so load that or dplyr first):

mtcars %>% { ggplot(., aes(mpg, cyl)) + geom_point() } %>% label_x()
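
If you do want `+` syntax, one possible route is ggplot2's `ggplot_add()` generic (exported since ggplot2 3.0.0), which `+` calls with the plot built so far. The following is only a sketch under assumptions: `add_labels_from_attributes()` and the class name `labels_from_attr` are made-up names, the aesthetic must map to a plain column set in the top-level `ggplot()` call, and plain attributes stand in for `haven::labelled()` (which stores the same `label` and `labels` attributes). I have not checked it against every ggplot2 version.

```r
library(ggplot2)

# Hypothetical constructor: returns a small marker object that only
# remembers which axis to label. The work happens in the method below.
add_labels_from_attributes <- function(axis = c("x", "y")) {
  structure(list(axis = match.arg(axis)), class = "labels_from_attr")
}

# S3 method for ggplot2's ggplot_add() generic. `+` passes in the plot
# built so far, so the mapped column and its attributes are reachable.
ggplot_add.labels_from_attr <- function(object, plot, object_name, ...) {
  # Name of the column mapped to this axis (assumes a bare column name)
  var <- rlang::as_label(plot$mapping[[object$axis]])
  v <- plot$data[[var]]
  # exact = TRUE avoids partial matching of "label" against "labels"
  title <- attr(v, "label", exact = TRUE)
  value_labels <- attr(v, "labels", exact = TRUE)
  scale_fun <- if (object$axis == "x") scale_x_continuous else scale_y_continuous
  # Fall back to ggplot2's defaults when an attribute is missing
  plot + scale_fun(
    name   = if (is.null(title)) waiver() else title,
    breaks = if (is.null(value_labels)) waiver() else unname(value_labels),
    labels = if (is.null(value_labels)) waiver() else names(value_labels)
  )
}

# Usage: plain attributes used here instead of haven::labelled()
df <- mtcars
df$mpg <- structure(df$mpg, label = "miles per gallon",
                    labels = c(low = 10, high = 30))
p <- ggplot(df, aes(mpg, cyl)) + geom_point() +
  add_labels_from_attributes("x")
```

Returning a marker object and deferring the lookup to `ggplot_add()` is what gets around the limitation above: the function itself still never sees the plot, but the method does.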


Old solution

use_labelled <- function(l, axis = "x") {
    # Pick the scale constructor. Two separate `if` blocks would make the
    # function return NULL for axis = "x", because the second `if` is the
    # last expression evaluated.
    scale_fun <- switch(axis,
                        x = scale_x_continuous,
                        y = scale_y_continuous,
                        stop("axis must be \"x\" or \"y\""))
    scale_fun(attributes(l)$label,
              breaks = attributes(l)$labels,
              labels = names(attributes(l)$labels))
}

Then add it to the plot, passing the labelled variable:

ggplot(mtcars, aes(mpg, cyl)) + geom_point() + use_labelled(mtcars$mpg)

Or for the y-axis:

ggplot(mtcars, aes(cyl, mpg)) + geom_point() + use_labelled(mtcars$mpg, "y")
Axeman
  • Thanks. Repeating the data frame and variable name is exactly what I want to avoid. But thanks for sleuthing out that `ggplot_build` kills the attributes we'd need. I also found no info on how to write a new scale. – Ruben Jan 25 '18 at 15:48
  • Updated with a different solution. – Axeman Jan 25 '18 at 16:16
  • This is actually pretty cool! So `ggplot_build` doesn't kill the attributes after all, but we cannot get them through `+`? – Ruben Jan 25 '18 at 16:39
  • It strips them in `data`, but not in `plot$data` it seems. The `+` issue is because `+` doesn't actually make the plot so far available to the function on the right hand side. – Axeman Jan 25 '18 at 16:42
0

Another approach is to write a wrapper for ggplot() that has its own class. Then attributes have full visibility when the corresponding print method is called. See ?ag.print from package 'yamlet' (0.2.1).

library(ggplot2)
library(yamlet)
library(magrittr)

mtcars$disp %<>% structure(label = 'displacement', unit = 'cu. in.')
mtcars$mpg %<>% structure(label = 'mileage', unit = 'miles/gallon')
mtcars$am %<>% factor(levels = c(0,1), labels = c('automatic','manual'))
mtcars$am %<>% structure(label = 'transmission')

agplot(mtcars, aes(disp, mpg, color = am)) + geom_point()

[Figure: agplot of disp versus mpg]

pgcudahy