5

For the following data set,

Genre   Amount
Comedy  10
Drama   30
Comedy  20
Action  20
Comedy  20
Drama   20

I want to construct a ggplot2 line graph, where the x-axis is Genre and the y-axis is the sum of all amounts (conditional on the Genre).

I have tried the following:

p = ggplot(test, aes(factor(Genre), Amount)) + geom_point()
p = ggplot(test, aes(factor(Genre), Amount)) + geom_line()
p = ggplot(test, aes(factor(Genre), sum(Amount))) + geom_line()

but to no avail.

csgillespie
Julio Diaz

2 Answers

8

If you don't want to compute a new data frame before plotting, you can use stat_summary in ggplot2. For example, if your data set looks like this:

R> df <- data.frame(Genre=c("Comedy","Drama","Action","Comedy","Drama"),
R+                  Amount=c(10,30,40,10,20))
R> df
   Genre Amount
1 Comedy     10
2  Drama     30
3 Action     40
4 Comedy     10
5  Drama     20

You can use either qplot with a stat="summary" argument:

R> qplot(Genre, Amount, data=df, stat="summary", fun.y="sum")

Or add a stat_summary to a base ggplot graphic:

R> ggplot(df, aes(x=Genre, y=Amount)) + stat_summary(fun.y="sum", geom="point")
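
The question asked for a line graph rather than points. Here is a minimal sketch of one way to get that, assuming ggplot2 3.3.0 or later, where the argument is called `fun` (older releases call it `fun.y`). Mapping `group = 1` puts all genres in a single series so the summed values can be joined by a line. The data frame retypes the question's data, and the `requireNamespace` guard is only there so the script still runs where ggplot2 is not installed.

```r
# Data from the question (columns Genre and Amount).
df <- data.frame(
  Genre  = c("Comedy", "Drama", "Comedy", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 20, 20, 20, 20)
)

# Per-genre totals, computed separately only to check what the plot should show.
totals <- tapply(df$Amount, df$Genre, sum)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # group = 1 treats all genres as one series, so the line geom can connect them.
  # `fun` is the argument name in ggplot2 >= 3.3.0; older versions use `fun.y`.
  p <- ggplot(df, aes(x = Genre, y = Amount, group = 1)) +
    stat_summary(fun = sum, geom = "line") +
    stat_summary(fun = sum, geom = "point")
  print(p)
}
```

Without `group = 1`, each genre on a discrete x-axis is its own group with a single summarised point, so there is nothing for the line to connect.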
Gavin Simpson
juba
  • Neat one-liner... though you can easily omit `factor`, since `stringsAsFactors` is the default behaviour. – aL3xa Mar 07 '11 at 09:25
  • I think I'll leave the factor() call in because it is used in the question, but you're right, it is not useful here. Thanks for pointing it out. – juba Mar 07 '11 at 09:35
  • Thanks so much, the reason I was using factor was because I was trying to get the sum from lower to higher, but it does not do that. – Julio Diaz Mar 07 '11 at 09:50
  • Ok, so I finally removed it :) – juba Mar 07 '11 at 10:21
  • @juba, is there any way to order the bars according to the y value, which in this case is the sum? – Julio Diaz Mar 07 '11 at 10:25
  • What does "from lower to higher" mean? Maybe you were referring to ordered factors? – aL3xa Mar 07 '11 at 10:41
  • @Julio Diaz on the bar ordering, Yes, see this SO question: http://stackoverflow.com/q/5208679/429846 – Gavin Simpson Mar 07 '11 at 10:44
  • It seems you can use a `reorder` call in your `aes` definition, something like `aes(x=reorder(Genre, Amount, sum), y=Amount)`. But there may be a better and cleaner way to do it. – juba Mar 07 '11 at 10:55
  • Where is the full documentation for the stat_ methods? The ggplot book hardly touches this, yet they are clearly powerful and useful. – Alex Brown Mar 12 '11 at 06:32
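
To illustrate the `reorder` suggestion from the comments, here is a rough sketch using the example data from this answer. It assumes ggplot2 3.3.0 or later for the `fun` argument (older releases use `fun.y`). The variable name `genre_by_sum` is made up for the illustration.

```r
# Example data from the answer above.
df <- data.frame(
  Genre  = c("Comedy", "Drama", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 40, 10, 20)
)

# reorder() sorts the factor levels by the per-genre sum of Amount, ascending.
# Passing -df$Amount instead would give a descending order.
genre_by_sum <- reorder(df$Genre, df$Amount, FUN = sum)
print(levels(genre_by_sum))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # The same reorder() call inside aes() orders the bars by their summed height.
  p <- ggplot(df, aes(x = reorder(Genre, Amount, sum), y = Amount)) +
    stat_summary(fun = sum, geom = "bar") +
    xlab("Genre")
  print(p)
}
```

The sums here are Comedy 20, Action 40 and Drama 50, so the bars appear in that order from left to right.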
1

Try something like this:

dtf <- structure(list(Genre = structure(c(2L, 3L, 2L, 1L, 2L, 3L), .Label = c("Action", 
"Comedy", "Drama"), class = "factor"), Amount = c(10, 30, 20, 
20, 20, 20)), .Names = c("Genre", "Amount"), row.names = c(NA, 
-6L), class = "data.frame")

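library(reshape)
library(ggplot2)
# Melt to long format (Genre is the id variable), then sum the values per Genre
mdtf <- melt(dtf)
cdtf <- cast(mdtf, Genre ~ . , sum)
# The sums are already computed, so draw the bar heights as-is with stat = "identity"
ggplot(cdtf, aes(Genre, `(all)`)) + geom_bar(stat = "identity")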
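
If the reshape package is not at hand, the same precomputed sums can probably be obtained with base R's `aggregate`. This is only a sketch of that alternative, not what the answer above uses. The name `sums` is made up for the illustration, and `geom_col` assumes ggplot2 2.2.0 or later.

```r
# Data from the question.
dtf <- data.frame(
  Genre  = c("Comedy", "Drama", "Comedy", "Action", "Comedy", "Drama"),
  Amount = c(10, 30, 20, 20, 20, 20)
)

# One row per Genre, with Amount replaced by its per-genre sum.
sums <- aggregate(Amount ~ Genre, data = dtf, FUN = sum)
print(sums)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # geom_col() draws bars whose heights are the y values as given.
  p <- ggplot(sums, aes(x = Genre, y = Amount)) + geom_col()
  print(p)
}
```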
aL3xa
  • Did you automatically generate your structure() instruction from the example provided in the question? If yes, I'd be very happy to know how :-) – juba Mar 07 '11 at 09:33
  • No, I entered it by hand, then applied `dput` to it. – aL3xa Mar 07 '11 at 09:41
  • But you can use `read.clipboard` function from `psych` package. It works like a charm: `dtf – aL3xa Mar 07 '11 at 09:56
  • Ah yes, great. For me selecting the data and `read.table("clipboard",header=TRUE)` does the trick. Thanks! – juba Mar 07 '11 at 10:18
  • You can also use `?textConnection` in conjunction with `read.table`. There have been some examples of that here on SO, e.g. http://stackoverflow.com/questions/4881149/r-list-row-name – Roman Luštrik Mar 07 '11 at 10:53
  • See `text_to_table`, here: http://stackoverflow.com/questions/3936285/is-there-a-way-to-use-read-csv-to-read-from-a-string-value-rather-than-a-file-in/3941145#3941145 – Richie Cotton Mar 07 '11 at 15:56
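
As a small sketch of the `textConnection` idea from these comments, the question's data can be read straight from a string with base R only. The names `txt`, `test` and `test2` are just for illustration, and the `text =` argument of `read.table` assumes a reasonably recent R.

```r
# The question's data pasted as a string.
txt <- "Genre   Amount
Comedy  10
Drama   30
Comedy  20
Action  20
Comedy  20
Drama   20"

# Option 1: read.table() accepts the string directly through its text argument.
test <- read.table(text = txt, header = TRUE, stringsAsFactors = FALSE)

# Option 2: wrap the string in a textConnection, as suggested in the comments.
con <- textConnection(txt)
test2 <- read.table(con, header = TRUE, stringsAsFactors = FALSE)
close(con)

str(test)
```

Both calls should give the same six-row data frame with columns `Genre` and `Amount`.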