I'm trying to chart the frequency of the items in a dataset, but I get this error when plotting:

Don't know how to automatically pick scale for object of type data.frame. Defaulting to continuous.

Error: Aesthetics must be either length 1 or the same as the data (39123): x

This is a sample of the data I'm using:

  product_id  order_id                  product_name
1   41899       330425    Oreo  Ice Cream Sandwiches
2  122580      1707573                     Mint Chip
3  146891       622568              Coffee Ice Cream
4  134292      1284843 Belgian Milk Chocolate Gelato
5  146530      2693694   Variety Pack Ice Cream Bars

The str() output for the data frame (first 5 rows) is:

'data.frame':   5 obs. of  3 variables:
 $ product_id  : int  41899 122580 146891 134292 146530
 $ order_id    : int  330425 1707573 622568 1284843 2693694
 $ product_name: Factor w/ 49688 levels "'Swingtop' Premium Lager",..: 28573 25871 10030 4236 47274

I've tried several changes to the plot code, but I got different errors.

This is the code I'm using.

orders_group <- group_by(orders_products,order_id)
orders_summ <- as.data.frame(summarise(orders_group, n_items = count(product_name)))

ggplot(orders_summ,aes(x=n_items))+
  geom_histogram(stat="count")+#geom_histogram(fill="indianred", bins = 100000) + 
  geom_rug()+
  coord_cartesian(xlim=c(0,80))+
  scale_fill_manual(values = getPalette(colourCount))
Rednaxel
  • The error message seems to be complaining about the type, or structure, of the data objects you're dealing with. In that circumstance, it would also make sense to share with us the output of `str()` on the various objects you're using so that we can actually _see_ what the issue might be. Even better would be a completely reproducible example that we can run ourselves. – joran Apr 08 '18 at 22:03
  • I added the example of the data I'm using. – Rednaxel Apr 08 '18 at 22:18
  • Thank you, but that actually doesn't help any at all. I mentioned `str()` specifically because merely printing objects in R to the console provides only very limited information about their structure. And again, a completely reproducible example would be ideal. – joran Apr 08 '18 at 22:21
  • I added the top 5 rows I'm using and the output of `str()`. I get the same error with these 5 rows. I even tried replacing product_name with product_id and got the same error. – Rednaxel Apr 08 '18 at 22:30

1 Answer

I believe this stems from using count() incorrectly. count() returns a whole table (a tibble or data frame) rather than a single number per group, so it behaves strangely inside the data frame produced by your call to summarise.

I have to run, but at first glance it looks like you created a column of dataframes (or something like that), which would explain your ggplot error. I believe what you are looking for is this:

orders_summ <- orders_products %>%
      group_by(order_id) %>% # normally this step would have produced your orders_group
      summarise(n_items = n())

Then try and run your ggplot code.
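To check the fix without the full dataset, here is a minimal sketch on made-up data. The toy orders_products below is hypothetical, and the fixed "indianred" fill (borrowed from the commented-out line in the question) stands in for getPalette(colourCount), which isn't defined in the post. It assumes dplyr and ggplot2 are installed and that plyr is not loaded.

```r
library(dplyr)
library(ggplot2)

# Hypothetical toy data standing in for the real orders_products
orders_products <- data.frame(
  order_id     = c(1L, 1L, 1L, 2L, 2L, 3L),
  product_name = factor(c("Mint Chip", "Coffee Ice Cream", "Gelato",
                          "Mint Chip", "Gelato", "Mint Chip"))
)

# One row per order; n() counts the rows (items) in each group
orders_summ <- orders_products %>%
  group_by(order_id) %>%
  summarise(n_items = n())

# n_items should now be a plain integer vector, not a nested data frame
str(orders_summ$n_items)

# Bar chart of how many orders contain each number of items
p <- ggplot(orders_summ, aes(x = n_items)) +
  geom_bar(fill = "indianred") +
  geom_rug()
print(p)
```

If str() shows an ordinary integer vector here, ggplot should no longer complain about an object of type data.frame.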

Marcus Campbell
  • Thanks, I had a little problem with the `n()` function but solved it with https://stackoverflow.com/questions/22801153/dplyr-error-in-n-function-should-not-be-called-directly – Rednaxel Apr 09 '18 at 09:22
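
For anyone hitting the same `n()` error: the linked question traces it to plyr being loaded after dplyr, so that plyr's summarise masks dplyr's. Assuming that is the cause here as well, one way around it is to call the dplyr verbs by their full names. A rough sketch on hypothetical toy data:

```r
library(dplyr)

# Hypothetical toy data; only order_id matters for the count
orders_products <- data.frame(order_id = c(1L, 1L, 2L))

# Qualifying the verbs with dplyr:: ensures dplyr's versions are used
# even if plyr was attached later and masks summarise()
orders_summ <- dplyr::summarise(
  dplyr::group_by(orders_products, order_id),
  n_items = n()
)
```

Loading plyr before dplyr, or not loading plyr at all, should have the same effect.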