
I need to plot the results of a Tukey test that includes 8 groups. I am following the R Graph Gallery example, which works for four groups. I tried modifying it in two ways:

1. Changing Tukey.levels <- TUKEY[[variable]][,4] to Tukey.levels <- TUKEY[[variable]][,7] because I have 8 groups. This gave me the error "Error in TUKEY[[variable]][, 7] : subscript out of bounds", so I went back to 4.
2. Adding more colors.

But I cannot get a box plot that reflects all 28 comparisons. Here are the data and my code:

treatment <- rep(c("A", "B", "C", "D", "E","F","G","H"), each=20)
value=c( sample(2:5, 20 , replace=T) , sample(6:10, 20 , replace=T), sample(1:7, 20 , replace=T), sample(3:10, 20 , replace=T) , sample(10:20, 20 , replace=T),sample(11:15, 20 , replace=T),sample(6:12, 20 , replace=T),sample(3:3, 20 , replace=T) )
data=data.frame(treatment,value)
model=lm( data$value ~ data$treatment )
ANOVA=aov(model)
TUKEY <- TukeyHSD(x=ANOVA, 'data$treatment', conf.level=0.95)
plot(TUKEY , las=1 , col="brown")

generate_label_df <- function(TUKEY, variable){


  Tukey.levels <- TUKEY[[variable]][,4]
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])


  Tukey.labels$treatment=rownames(Tukey.labels)
  Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
  return(Tukey.labels)
}
LABELS <- generate_label_df(TUKEY , "data$treatment")
my_colors <- c( 
  rgb(143,199,74,maxColorValue = 255),
  rgb(242,104,34,maxColorValue = 255), 
  rgb(111,145,202,maxColorValue = 255),
  rgb(255,0,0,maxColorValue = 255),
  rgb(0,255,255,maxColorValue = 255),
  rgb(128,0,0,maxColorValue = 255),
  rgb(128,0,128,maxColorValue = 255)
)
a <- boxplot(data$value ~ data$treatment , ylim=c(min(data$value) , 1.1*max(data$value)) , col=my_colors[as.numeric(LABELS[,1])] , ylab="value" , main="")
text( c(1:nlevels(data$treatment)) , a$stats[nrow(a$stats),]+over , LABELS[,1]  , col=my_colors[as.numeric(LABELS[,1])] )

Groups that do not differ from each other should be shown with the same color and the same letter.

I also want to know how generate_label_df works. Do I always need to use the value 4, even though I have 8 groups?

Bandana

1 Answer


If you are using R 4.0.0 or later, one issue is that data.frame() changed its default behavior: stringsAsFactors is now FALSE, so treatment stays a character vector and nlevels(data$treatment) returns 0. Try setting stringsAsFactors = TRUE in each data.frame() call, including the one inside your function. Also make sure the multcompView library is loaded, which is not shown in your code:

treatment <- rep(c("A", "B", "C", "D", "E","F","G","H"), each=20)
set.seed(1) # for reproducibility
value=c( sample(2:5, 20 , replace=T) , sample(6:10, 20 , replace=T), sample(1:7, 20 , replace=T), sample(3:10, 20 , replace=T) , sample(10:20, 20 , replace=T),sample(11:15, 20 , replace=T),sample(6:12, 20 , replace=T),sample(3:3, 20 , replace=T) )
data=data.frame(treatment,value, stringsAsFactors = T)
model=lm( data$value ~ data$treatment )
ANOVA=aov(model)
TUKEY <- TukeyHSD(x=ANOVA, 'data$treatment', conf.level=0.95)
plot(TUKEY , las=1 , col="brown")

library(multcompView) # load this library

generate_label_df <- function(TUKEY, variable){


  Tukey.levels <- TUKEY[[variable]][,4] # leave this value as 4. Column 4 of the TUKEY matrix is "p adj" (adjusted p-values); the matrix has 4 columns regardless of the number of groups.
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'], stringsAsFactors = TRUE)


  Tukey.labels$treatment=rownames(Tukey.labels)
  Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
  return(Tukey.labels)
}
LABELS <- generate_label_df(TUKEY , "data$treatment")
my_colors <- c( 
  rgb(143,199,74,maxColorValue = 255),
  rgb(242,104,34,maxColorValue = 255), 
  rgb(111,145,202,maxColorValue = 255),
  rgb(255,0,0,maxColorValue = 255),
  rgb(0,255,255,maxColorValue = 255),
  rgb(128,0,0,maxColorValue = 255),
  rgb(128,0,128,maxColorValue = 255)
)
a <- boxplot(data$value ~ data$treatment , ylim=c(min(data$value) , 1.1*max(data$value)) , col=my_colors[as.numeric(LABELS[,1])] , ylab="value" , main="")

over <- 0.1*max( a$stats[nrow(a$stats),] )

text( c(1:nlevels(data$treatment)) , a$stats[nrow(a$stats),]+over , LABELS[,1]  , col=my_colors[as.numeric(LABELS[,1])] )
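
To see why the index stays at 4 with 8 groups, it may help to inspect the shape of the TukeyHSD result. The sketch below uses simulated data and hypothetical names (d, fit, tk) that are not part of the code above. The number of groups changes the number of rows (one per pairwise comparison), not the number of columns.

```r
# Illustrative data only: 8 groups of 20 values (an assumption, not the asker's data)
set.seed(1)
d <- data.frame(
  treatment = factor(rep(LETTERS[1:8], each = 20)),
  value     = rnorm(160, mean = rep(1:8, each = 20))
)

fit <- aov(value ~ treatment, data = d)
tk  <- TukeyHSD(fit, "treatment", conf.level = 0.95)

dim(tk$treatment)       # rows = choose(8, 2) = 28 comparisons, columns = 4
colnames(tk$treatment)  # "diff" "lwr" "upr" "p adj"

# Selecting by name is equivalent to [, 4] and harder to get wrong
p_adj <- tk$treatment[, "p adj"]
head(names(p_adj))      # pair names such as "B-A", which multcompLetters uses
```

One thing to watch: my_colors above holds seven colors, so if multcompLetters produces more than seven distinct letter labels, indexing my_colors returns NA for the extra ones and the palette would need to be extended.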

xilliam
  • I am sorry that I did not mention earlier that I had loaded multcompView and defined the over object. I am following the example in the R Graph Gallery. I do get a box plot, but the colors and letters are not the way I need them. I am not using R version 4 or later. I feel I am missing something, but I don't know what. – Bandana Jun 05 '20 at 15:59
  • I edited my answer. What happens when you run my code? And see the comments I've added in the code block, which address some of your question. – xilliam Jun 05 '20 at 23:20
  • It worked!! Thank you so much. Also, I wanted to ask whether it is possible to add jitter to this Tukey's test box plot. The only way I know is using a – Bandana Jun 06 '20 at 18:45
  • Try this for [jitter](https://stackoverflow.com/questions/23675735/how-to-add-boxplots-to-scatterplot-with-jitter#23677410) in a boxplot. – xilliam Jun 06 '20 at 23:41
  • Thank you again. – Bandana Jun 07 '20 at 19:19
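
For the jitter follow-up in the comments, one possible base-R approach is to overlay the raw points with stripchart(). This is a minimal sketch on simulated data with a hypothetical data frame d, not the method from the linked answer; the letters from the Tukey step could still be added with text() afterwards.

```r
# Illustrative data only: 8 groups of 20 values (an assumption)
set.seed(1)
d <- data.frame(
  treatment = factor(rep(LETTERS[1:8], each = 20)),
  value     = rnorm(160, mean = rep(1:8, each = 20))
)

# Draw the boxes without outlier symbols so points are not plotted twice;
# set ylim from the raw data so no jittered point is clipped
boxplot(value ~ treatment, data = d, outline = FALSE,
        ylim = range(d$value), ylab = "value")

# Overlay the individual observations with horizontal jitter
stripchart(value ~ treatment, data = d, vertical = TRUE,
           method = "jitter", jitter = 0.15,
           pch = 16, col = rgb(0, 0, 0, 0.4), add = TRUE)
```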