
I'm sure this is stupidly simple but it's been some time since I've used R, and I have never made a barchart with ggplot.

I have the following averages from a larger dataset:

> mean_gc
PVC1      PVC2      PVC3      PVC4      PVC5      PVC6      PVC7      PVC8      PVC9     PVC10     PVC11 
0.4019026 0.4479259 0.4494118 0.4729437 0.4800556 0.4492290 0.4905295 0.4457566 0.4271259 0.4850341 0.4369965 
PVC12     PVC13     PVC14     PVC15     PVC16 
0.4064052 0.3743776 0.3603853 0.3965469 0.3654610

My end goal is to plot a bar chart (since each "PVC#" is discrete), and fit a step-function across it in R to try and find subtle 'breakpoints' - but that's a problem for later...

The only way I've been able to get a bar plot from this is with `barplot`, which produces the graph below.

Which is fine, but it's ugly compared to ggplot.

I've tried storing the above data as a data frame, both with the PVC labels as a column and as row names, but I just can't get the syntax right and I'm at my wits' end!

What am I missing?

Demo barplot

EDIT FOR CLARITY ON DATAFRAMES

The above was just the printed output in R (not the best way to show it, my apologies). I have the data in the following column-based format:

    mean_gc
PVC1  0.4019026
PVC2  0.4479259
PVC3  0.4494118
PVC4  0.4729437
PVC5  0.4800556
PVC6  0.4492290
PVC7  0.4905295
PVC8  0.4457566
PVC9  0.4271259
PVC10 0.4850341
PVC11 0.4369965
PVC12 0.4064052
PVC13 0.3743776
PVC14 0.3603853
PVC15 0.3965469
PVC16 0.3654610

Where PVC# are the row.names. I also have the same dataset where the row.names are present as the first column, in case that is required (but I suspect not).

Joe Healey
  • This needs a reproducible example, but it's off-topic here. Please study advice in the Help Center on software-specific questions. – Nick Cox Apr 12 '16 at 15:17
  • ggplot works with long data, not wide data. Use `reshape2::melt` or `tidyr::gather` (or something similar) to make your data long. – Gregor Thomas Apr 12 '16 at 18:20
  • Ah, I'd read about steps such as that. Is that strictly for barplots? When I've made `ggplot` scatters in the past, I've achieved that with just a couple of `cbind`-ed vectors, and as far as I can tell (ignoring the fact that I've pasted them horizontally above, because that's just R's printout), my data in 2-column format are indistinguishable from the data I've used in the past, except that the x values here are discrete, whereas in my previous data sets they've been continuous. – Joe Healey Apr 12 '16 at 18:35
  • There's doubtless dupes out there somewhere, [I have this old one](http://stackoverflow.com/q/7910594/903061) but it doesn't have such a wide variety of answers. Any other suggestions? Maybe [this one](http://stackoverflow.com/q/21236229/903061) – Gregor Thomas Apr 12 '16 at 18:37
  • Nothing is special about barplots. `ggplot` likes for you to map a column to a dimension. Here you have PVC category mapped to the x-dimension, so you want that in a column. – Gregor Thomas Apr 12 '16 at 18:38
  • Though maybe I'm misunderstanding your paste. It looks like a data frame with 1 row and many columns, which would require reshaping. It's much better if you share data reproducibly with `dput(mean_gc)`; then there is no ambiguity about structure and class, and it can be copy/pasted into R. – Gregor Thomas Apr 12 '16 at 18:40
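To illustrate the wide-to-long reshaping discussed in the comments above, here is a minimal sketch. It assumes the data really are a one-row, many-column data frame and that tidyr 1.0 or later is installed, so that `pivot_longer` (the newer replacement for `gather`) is available. The names `wide`, `long` and `PVC` are illustrative, and only the first four values from the question are used to keep it short.

```r
library(tidyr)
library(ggplot2)

# Assumed shape: one row, one column per PVC (first four values from the question).
wide <- data.frame(PVC1 = 0.4019026, PVC2 = 0.4479259,
                   PVC3 = 0.4494118, PVC4 = 0.4729437)

# Wide -> long: one row per PVC, with the label in its own column.
long <- pivot_longer(wide, cols = everything(),
                     names_to = "PVC", values_to = "mean_gc")

# Keep the original column order on the x-axis instead of alphabetical order.
long$PVC <- factor(long$PVC, levels = names(wide))

# geom_col() is shorthand for geom_bar(stat = "identity") (ggplot2 >= 2.2.0).
p <- ggplot(long, aes(x = PVC, y = mean_gc)) +
  geom_col()
print(p)
```

Once the PVC label is a column, it can be mapped to the x aesthetic like any other variable, which is the point made in the comments.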

1 Answer


You have to melt() your data from wide to long format before passing it to ggplot2, because ggplot2 expects a tidy structure: one column holding the PVC labels and one holding the values.

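library(reshape2)
library(ggplot2)

# melt() turns the one-row wide data frame (see Data below) into two columns:
# `variable` (the PVC labels) and `value` (the means).
ggplot(melt(df), aes(variable, value)) + 
        geom_bar(stat = "identity")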


Data

df <- structure(list(PVC1 = 0.4019026, PVC2 = 0.4479259, PVC3 = 0.4494118, 
    PVC4 = 0.4729437, PVC5 = 0.4800556, PVC6 = 0.449229, PVC7 = 0.4905295, 
    PVC8 = 0.4457566, PVC9 = 0.4271259, PVC10 = 0.4850341, PVC11 = 0.4369965, 
    PVC12 = 0.4064052, PVC13 = 0.3743776, PVC14 = 0.3603853, 
    PVC15 = 0.3965469, PVC16 = 0.365461), .Names = c("PVC1", 
"PVC2", "PVC3", "PVC4", "PVC5", "PVC6", "PVC7", "PVC8", "PVC9", 
"PVC10", "PVC11", "PVC12", "PVC13", "PVC14", "PVC15", "PVC16"
), class = "data.frame", row.names = c(NA, -1L))
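
If the data are instead in the column-based shape shown in the question's edit (a single `mean_gc` column with the PVC labels as row names), melting should not be needed; the labels only have to be moved into a column. A sketch under that assumption follows. `gc_df` and the `PVC` column are illustrative names, and only the PVC9 to PVC12 values are reproduced.

```r
library(ggplot2)

# Assumed shape: one value column, PVC labels stored as row names.
gc_df <- data.frame(
  mean_gc = c(0.4271259, 0.4850341, 0.4369965, 0.4064052),
  row.names = c("PVC9", "PVC10", "PVC11", "PVC12")
)

# Move the row names into a column. A factor with explicit levels keeps
# PVC9 ahead of PVC10; plain character labels would be sorted alphabetically.
gc_df$PVC <- factor(rownames(gc_df), levels = rownames(gc_df))

# geom_col() is shorthand for geom_bar(stat = "identity") (ggplot2 >= 2.2.0).
p <- ggplot(gc_df, aes(x = PVC, y = mean_gc)) +
  geom_col()
print(p)
```

The explicit factor levels matter here because ggplot2 orders a character x variable alphabetically, which would place PVC10 through PVC16 before PVC2.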
mtoto