I'm looking for a way to apply a function either to specified labels or to all labels that appear in a plot. The goal is to get neat, human-readable labels derived from the default labels, without having to specify each one by hand.

To show the input variable names and the output I am after, here is an example based on the starwars data set. It uses the versatile snakecase::to_sentence_case() function, but the same idea applies to any function, including ones that expand short variable names in predetermined ways:

library(tidyverse)
library(snakecase)

starwars %>%
  filter(mass < 1000) %>%
  mutate(species = species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color=species, size=birth_year)) +
  geom_point() +
  labs( 
    x = to_sentence_case("height"),
    y = to_sentence_case("mass"),
    color = to_sentence_case("species"),
    size  = to_sentence_case("birth_year")
  )

Which produces the following graph:

[Image: Star Wars plot]

The graph is the desired output, but requires that each of the labels be specified by hand, increasing the possibility of error if the variables are later changed. Note that if I had not specified the labels, all the labels would have been applied automatically, but with the variable names instead of the prettier versions.

This seems somewhat related to what the labeller() function is intended for, but labeller() appears to apply only to faceting. Another related issue is raised in this question. However, both of these seem to apply only to values contained within the data, not to the names of the variables used in the plot, which is what I am looking for.

Z.Lin
Magnus

3 Answers

The very helpful answer by @z-lin showed me a simple way to do this: modify the plot object before printing it.

The intended result can be achieved with the help of gg_apply_labs(), a short function that will apply an arbitrary string processing function to the $labels of a plot object. The resulting code should be a self-contained illustration of this approach:

# Packages    
library(tidyverse)
library(snakecase)

# This applies fun to each label present in the plot object
#
# fun should accept and return character vectors. It can be a simple
# prettifying function, or it can perform a more complex lookup to replace
# variable names with variable labels
gg_apply_labs <- function(p, fun) {
  p$labels <- lapply(p$labels, fun)
  p
}

# This gives the intended result
# Note: The plot is assigned to a named variable before piping to gg_apply_labs()
p <- starwars %>%
  filter(mass < 1000) %>%
  mutate(species = species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color=species, size=birth_year)) +
  geom_point()
p %>% gg_apply_labs(to_sentence_case)

# This also gives the intended result, in a single pipeline
# Note: It is important to put in the extra parentheses!
(starwars %>%
  filter(mass < 1000) %>%
  mutate(species = species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color=species, size=birth_year)) +
  geom_point()) %>% 
  gg_apply_labs(to_sentence_case)

# This DOES NOT give the intended result
# Note: The cause is operator precedence. %>% binds more tightly than +, so
# only geom_point() is piped into gg_apply_labs(), not the whole plot
starwars %>%
  filter(mass < 1000) %>%
  mutate(species = species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color=species, size=birth_year)) +
  geom_point() %>% 
  gg_apply_labs(to_sentence_case)
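
As noted in the comments above gg_apply_labs(), fun does not have to be a generic prettifier; it can also look up variable labels. Below is a minimal base-R sketch of such a function. The names var_labels, simple_sentence_case() and label_lookup() are hypothetical and exist only for this illustration, and the single cgdp entry is an assumed example of a short variable name that needs a hand-written label.

```r
# Hypothetical lookup table: names are variable names, values are display labels.
var_labels <- c(cgdp = "GDP per capita")

# Hypothetical base-R fallback: underscores to spaces, first letter upper-case.
simple_sentence_case <- function(x) {
  x <- gsub("_", " ", x, fixed = TRUE)
  paste0(toupper(substring(x, 1, 1)), substring(x, 2))
}

# Returns the looked-up label where one exists, the fallback result otherwise.
label_lookup <- function(x, lookup = var_labels, fallback = simple_sentence_case) {
  x <- as.character(x)
  found <- x %in% names(lookup)
  out <- fallback(x)
  out[found] <- unname(lookup[x[found]])
  out
}

label_lookup(c("cgdp", "birth_year", "mass"))
#> [1] "GDP per capita" "Birth year"     "Mass"
```

Because label_lookup() accepts and returns character vectors, it should be usable in place of to_sentence_case, for example p %>% gg_apply_labs(label_lookup). The fallback could equally be snakecase::to_sentence_case if that package is loaded.
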
Magnus

A simple solution is to pipe through rename_all (or rename_if if you want more control) before plotting:

library(tidyverse)
library(snakecase)

starwars %>%
  filter(mass<1000) %>%
  mutate(species=species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  rename_all(to_sentence_case) %>%
  #rename_if(is.character, to_sentence_case) %>% 
  ggplot(aes(Height, Mass, color=Species, size=`Birth year`)) +
  geom_point()
#> Warning: Removed 23 rows containing missing values (geom_point).

Created on 2019-11-25 by the reprex package (v0.3.0)

Note, though, that the variables given to aes in ggplot in this case must be modified to match the modified sentence case variable names.

MSR
  • Thanks, this is very useful. It does mean that the user must remember (and update) the final labelled version, which can be tricky, especially if the function is using some sort of lookup (for example, replacing cgdp with "GDP per capita"). So having the replacement occur at the end of the plotting "pipeline", rather than before, would be preferable. But it is a clear improvement. – Magnus Nov 25 '19 at 17:27

You can modify a ggplot object's appearance at the point of printing / plotting it, without affecting the original plot object, using trace:

trace(what = ggplot2:::ggplot_build.ggplot,
      tracer = quote(plot$labels <- lapply(plot$labels,
                                           <whatever string function you desire>)))

This will change the appearance of all existing / new ggplot objects you wish to plot / save, until you turn off the trace via either untrace(...) or tracingState(on = FALSE).
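
To see the mechanism in isolation, here is a minimal base-R sketch that does not touch ggplot2 at all. The function greet() is a hypothetical stand-in for the build method; the point is only that a tracer expression is evaluated inside the traced function's frame on entry, so it can overwrite an argument before the body runs. The sketch assumes it is run at top level, so that trace() can find greet() in the global environment.

```r
# Hypothetical toy function standing in for the ggplot build method
greet <- function(name) paste("hello", name)

# The tracer runs inside the function's frame on entry,
# so it can overwrite the argument before the body uses it
trace(greet, tracer = quote(name <- toupper(name)), print = FALSE)
traced <- greet("world")    # tracer applied

tracingState(on = FALSE)    # tracer is skipped while tracing is off
paused <- greet("world")
tracingState(on = TRUE)

untrace(greet)              # restores the original definition
restored <- greet("world")
```

Here traced is built from the upper-cased argument, while paused and restored come from the unmodified one, which mirrors the toggle and untrace steps shown in the illustration.
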

Illustration

  1. Create a normal plot with default labels in lower case:
library(tidyverse)

p <- starwars %>%
  filter(mass < 1000) %>%
  mutate(species=species %>% fct_infreq %>%  fct_lump(5) %>% fct_explicit_na) %>%
  ggplot(aes(height, mass, color=species, size=birth_year)) +
  geom_point() +
  theme_bw()

p # if we print the plot now, all labels will be lower-case

  2. Apply a function to modify the appearance of all labels:

trace(what = ggplot2:::ggplot_build.ggplot,
      tracer = quote(plot$labels <- lapply(plot$labels,
                                           snakecase::to_sentence_case)))
p # all labels will be in sentence case

trace(what = ggplot2:::ggplot_build.ggplot,
      tracer = quote(plot$labels <- lapply(plot$labels, 
                                           snakecase::to_screaming_snake_case)))
p # all labels will be in upper case

trace(what = ggplot2:::ggplot_build.ggplot,
      tracer = quote(plot$labels <- lapply(plot$labels, 
                                           snakecase::to_random_case)))
p # all letters in all labels may be in upper / lower case randomly
  # (exact order can change every time we print the plot again, unless we set the same
  # random seed for reproducibility)

trace(what = ggplot2:::ggplot_build.ggplot,
      tracer = quote(plot$labels <- lapply(plot$labels, 
                                           function(x) paste("!!!", x, "$$$"))))
p # all labels now have "!!!" in front & "$$$" behind (this is a demonstration for
  # an arbitrary user-defined function, not a demonstration of good taste in labels)

  3. Toggle between applying and not applying the function:

tracingState(on = FALSE)
p # back to sanity, temporarily

tracingState(on = TRUE)
p # plot labels are affected by the function again

untrace(ggplot2:::ggplot_build.ggplot)
p # back to sanity, permanently
Z.Lin
  • Thank you for this answer, it's really helpful! Reading it, I learned quite a bit about both tracing and about ggplot objects. Ultimately, I decided to put together my own answer, using your approach of modifying the plot object after creating it, except that rather than having it be done completely automatically, I call a function explicitly to do the processing. – Magnus Nov 26 '19 at 11:42