The data frame has many continuous numeric columns (e.g. gr) and a sample identifier, wellseq. There are many rows of data for each wellseq: the data frame has 94 distinct levels of wellseq in 10227 rows. The first rows of the data frame are:
       gr wellseq
1 27.7049       1
2 31.1149       1
3 34.5249       1
4 39.7249       1
5 44.9249       1
6 50.1299       1
A summary of column gr is shown below:
summary(GR)
gr
Min. :-6.94
1st Qu.:10.71
Median :13.76
Mean :18.99
3rd Qu.:20.70
Max. :98.42
NA's :55
A basic histogram of the entire data for gr is created without problems. For further analysis, I need to identify each wellseq contributing to the histogram. The ggplot() script used is:
p2 <- ggplot() + theme_bw() +
  geom_histogram(data = GR, na.rm = TRUE,
                 mapping = aes(x = gr, fill = factor(GR$wellseq)),
                 bins = 10) +
  scale_color_brewer(palette = "Dark2") +
  scale_x_continuous(limits = c(-10, 100)) +
  labs(title = paste("Gamma Ray", "Histogram", sep = " ")) +
  theme(legend.position = "none")
The resulting plot uses a sequential color scheme and NOT the qualitative palette "Dark2". I tried the answer to "How to generate a number of most distinctive colors in R?" on stackoverflow.com and created the required colors:
Dcolor <- grDevices::colors()[grep('gr(a|e)y', grDevices::colors(), invert = TRUE)]
DcolorR <- sample(Dcolor, 433, replace = FALSE)
Using scale_colour_manual(values = DcolorR) gives the same histogram. Using ..count.. for y, the histogram does show the boundaries between the different wellseq values, but the bars are not filled as needed.
p3 <- ggplot() + theme_bw() +
  geom_histogram(data = GR, na.rm = TRUE,
                 mapping = aes(x = gr, y = ..count.., col = factor(GR$wellseq), bins = 10)) +
  scale_colour_manual(values = DcolorR) +
  scale_x_continuous(limits = c(-10, 100)) +
  labs(title = paste("Gamma Ray", " Frequency Histogram", sep = " ")) +
  theme(legend.position = "none")
Setting fill = 1 leads to a blue-colored stacked histogram.
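One variant not tried above pairs the fill aesthetic with a fill scale. In ggplot2 the fill aesthetic is controlled by the scale_fill_* functions, while scale_colour_* only affects the outline colour. The block below is a minimal sketch of that idea, not a confirmed fix for the real data. The GR data frame in it is a made-up stand-in (random gr values, 94 wellseq levels), and the colours are sampled as in the snippet above.

```r
library(ggplot2)

# Hypothetical stand-in for the real GR data frame (assumption: 94 wells, 10 rows each)
set.seed(1)
GR <- data.frame(
  gr = runif(940, min = -5, max = 95),
  wellseq = rep(1:94, each = 10)
)

# One non-grey colour per wellseq level, sampled without replacement
Dcolor <- grDevices::colors()[grep("gr(a|e)y", grDevices::colors(), invert = TRUE)]
DcolorR <- sample(Dcolor, length(unique(GR$wellseq)))

# fill is mapped in aes(), so a fill scale (not a colour scale) sets the bar colours
p <- ggplot(GR, aes(x = gr, fill = factor(wellseq))) +
  theme_bw() +
  geom_histogram(bins = 10, na.rm = TRUE) +
  scale_fill_manual(values = DcolorR) +
  scale_x_continuous(limits = c(-10, 100)) +
  labs(title = "Gamma Ray Histogram") +
  theme(legend.position = "none")
```

Printing p draws a stacked histogram with one fill colour per wellseq. Note that "Dark2" itself provides only 8 colours, so a manual palette is assumed here to cover all 94 levels.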
I am trying to get a plot like the one attached. Please guide. Thanks in advance.