I have 30 plant species for which I have displayed the distributions of midday leaf water potential (lwp_md) using boxplots and the package ggplot2. But how do I group these species along the x-axis according to their leaf habits (e.g. Deciduous, Evergreen) as well as display a reference line indicating the mean lwp_md value for each leaf habit level?

I tried the forcats package but have no idea how to proceed, and an extensive online search turned up nothing. The best I can do is order species by a summary function, e.g. the median.

Below is an example of my code so far. Note I have used the packages ggplot2 and ggthemes:

library(ggplot2)
library(ggthemes)  # theme_few()
library(forcats)   # fct_reorder()

# Order species by median lwp_md (descending); note the argument is .fun, not fun
ggplot(zz, aes(x=fct_reorder(species, lwp_md, .fun=median, .desc=TRUE), y=lwp_md)) +
  geom_boxplot(aes(fill=leaf_habit)) +
  theme_few(base_size=14) +
  theme(legend.position="top", 
        axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name="Leaf habit",
                      breaks=c("DEC", "EG"),
                      labels=c("Deciduous", "Evergreen"))

Here's a subset of my data including 4 of my species (2 deciduous, 2 evergreen):

> dput(zz)
structure(list(id = 1:20, species = structure(c(1L, 1L, 1L, 1L, 
1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L
), .Label = c("AMYELE", "BURSIM", "CASXYL", "COLARB"), class = "factor"), 
    leaf_habit = structure(c(2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 
    1L, 1L, 2L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 1L, 1L), .Label = c("DEC", 
    "EG"), class = "factor"), lwp_md = c(-2.1, -2.5, -2.35, -2.6, 
    -2.45, -1.7, -1.55, -1.4, -1.55, -0.6, -2.6, -3.6, -2.9, 
    -3.1, -3.3, -2, -1.8, -2, -4.9, -5.35)), class = "data.frame", row.names = c(NA, 
-20L))

An example of how I'm looking to display my data, cut and edited - I would like species on x-axis, lwp_md on y-axis: image

Z.Lin
taller
  • Does the answer to [this question](https://stackoverflow.com/q/35701663/8449629) suit your needs? (replacing median with mean) – Z.Lin Apr 01 '19 at 06:22
  • Yes I was specifically looking to display the mean. Probably should have edited my code to indicate that. – taller Apr 01 '19 at 15:28

1 Answer

ggplot2 orders a discrete axis by factor level, and factor levels default to alphabetical order. To override this, supply the factor with its levels in the order you want: arrange the data.frame by leaf_habit, then redeclare species as a factor whose levels follow the order of appearance. To get the mean, use group_by and mutate to add a per-leaf-habit mean column to the data frame, which can then be plotted.
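
As a minimal sketch of those two steps in base R, here is a hypothetical four-row data frame `toy` (one made-up observation per species, not the asker's `zz`). Sorting by leaf habit and rebuilding the factor from `unique()` puts the levels in grouped order, and `ave()` stands in for group_by plus mutate:

```r
# Hypothetical toy data: one observation per species (illustration only)
toy <- data.frame(
  species    = factor(c("AMYELE", "BURSIM", "CASXYL", "COLARB")),
  leaf_habit = factor(c("EG", "DEC", "EG", "DEC")),
  lwp_md     = c(-2.1, -1.7, -2.6, -2.0)
)

# Default level order is alphabetical
levels(toy$species)

# Sort rows by leaf habit, then redeclare the factor in order of appearance
toy <- toy[order(toy$leaf_habit), ]
toy$species <- factor(toy$species, levels = unique(toy$species))
levels(toy$species)

# Per-habit mean repeated on every row (ave() applies mean() within each group)
toy$habit_mean <- ave(toy$lwp_md, toy$leaf_habit)
```

After the reordering, the deciduous species come first on the axis, followed by the evergreen ones, because ggplot2 follows the level order rather than the row order.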

Here is the complete code:

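library(ggplot2)
library(ggthemes)
library(dplyr)

# Sort rows by leaf habit and add the per-habit mean of lwp_md to every row
zz2 <- zz %>%
  arrange(leaf_habit) %>%
  group_by(leaf_habit) %>%
  mutate(mean = mean(lwp_md)) %>%
  ungroup()
# Redeclare species so its levels follow the sorted (grouped) order
zz2$species <- factor(zz2$species, levels = unique(zz2$species))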

ggplot(zz2, aes(x=species, y=lwp_md)) +
  geom_boxplot(aes(fill=leaf_habit)) +
  theme_few(base_size=14) +
  theme(legend.position="top", 
        axis.text.x=element_text(size=8, angle=45, vjust=1, hjust =1)) +
  xlab("Species") +
  ylab("Maximum leaf water potential (MPa)") +
  scale_y_reverse() +
  scale_fill_discrete(name="Leaf habit",
                      breaks=c("DEC", "EG"),
                      labels=c("Deciduous", "Evergreen")) +
  # Zero-height error bars (ymin == ymax) draw a dashed mean line across each species slot
  geom_errorbar(aes(species, ymax = mean, ymin = mean),
                size=0.5, linetype = "longdash", inherit.aes = FALSE, width = 1)
Julian_Hn
  • Brilliant, it worked great! Your explanation was really clear and easy to learn from too. Thank you very much for your time Julian_Hn. – taller Apr 01 '19 at 14:00
  • If this helped you, please consider accepting the answer, so that others can learn from it too. – Julian_Hn Apr 01 '19 at 14:11