
I get an error when applying count_ to all my data frames. I can apply it manually to a single data frame, but when I tried lapply, it showed this error:

Error in UseMethod("groups") :
  no applicable method for 'groups' applied to an object of class "character"

I want to find the unique pairs of longitude and latitude in my dataset. For a single data frame I used dplyr::count_(d, vars = c('longitude', 'latitude')), which returns a table of value pairs and their counts. I want to find the unique pairs in each data frame and store them in separate files. So far I have tried putting all my data frames in a list and using a for loop.

For a single data frame, I used

dplyr::count_(CA, vars = c('locationlongitude','locationlatitude'))
###it returns output like this

   locationlongitude locationlatitude     n
                <dbl>            <dbl> <int>
 1             -72.0             42.6    47
 2             -72.0             42.6    69
 3             -71.8             42.6    59
 4             -71.7             42.5    93
 5             -71.7             42.5    65

Then I want to apply this to all my data frames:

for (i in files) {
    nam <- paste("B_", i)
    assign(nam, dplyr::count_(i, vars = c('locationlongitude', 'locationlatitude')))
}  

files is a list of the names of all my data frames, and I expected the loop to create data frames named B_ plus the data frame name, each storing the unique locations from one data frame. Instead I get

Error  in UseMethod("groups") : 
  no applicable method for 'groups' applied to an object of class "character". 

I also tried making files a list whose elements are the data frames themselves, but that gave a different error:

Error in assign(nam, dplyr::count_(i, vars = c("locationlongitude", "locationlatitude"))) :
  variable names are limited to 10000 bytes
In addition: Warning message: In assign(nam, dplyr::count_(i, vars = c("locationlongitude", "locationlatitude"))) :
  only the first element is used as variable name

I believe there should be an efficient way to apply a function to multiple data frames and return another data frame for each, but I'm stuck. I'd appreciate any comments!

  • Use [a list of data frames](https://stackoverflow.com/a/24376207/4497050) or a nested list-column of a data frame, or just a single bigger data frame with another grouping column. But don't use `for` to iterate `assign`. And the functions suffixed with `_` have been deprecated in favor of tidy eval, but that's additional complication you can ignore for now. – alistaire Jan 05 '19 at 19:48
  • Thanks for your comment. I changed the code to use a list of data frames, list_df. – Hui Jan 05 '19 at 20:41
  • I used a list of data frames with lapply, and it works. Thank you! But could you explain a little why the loop with assign is not recommended? – Hui Jan 05 '19 at 20:46
  • The key is not to litter your global environment with data frames, but to store them all in a list from the start, e.g. with `list_of_data_frames – alistaire Jan 05 '19 at 20:56
  • As for why _not_ to use `for` and `assign`, there are simpler ways to write code that are less likely to have unintended consequences. `for` is almost never necessary in R, and requires extra attention to use well, with initialization and preallocation. `lapply` takes care of this for you, though it's not inherently faster. When possible, the best option is to vectorize everything. `assign` is a function that is only appropriately used in the context of operating on the language itself, which is advanced R programming. Outside of that context, it makes code difficult to read. – alistaire Jan 05 '19 at 21:08
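
For readers landing here, a minimal sketch of the list-plus-`lapply` approach described in the comments might look like the following. It is not the asker's final code, which was never posted. It assumes the column names from the question, uses `dplyr::count()` in place of the deprecated `count_()`, and invents two tiny data frames for illustration (`MA` is hypothetical, and the `CA` values here are made up). The output file names and the use of `tempdir()` are also assumptions.

```r
library(dplyr)

# Hypothetical stand-ins for the real data frames; values are invented
CA <- data.frame(
  locationlongitude = c(-72.0, -72.0, -71.8),
  locationlatitude  = c(42.6, 42.6, 42.6)
)
MA <- data.frame(
  locationlongitude = c(-71.7, -71.7, -71.7),
  locationlatitude  = c(42.5, 42.5, 42.4)
)

# Keep the data frames in one named list instead of as loose globals
list_df <- list(CA = CA, MA = MA)

# Count each unique longitude/latitude pair within every data frame;
# the result is a list with the same names as list_df
counts <- lapply(list_df, function(df) {
  count(df, locationlongitude, locationlatitude)
})

# Optional: a single combined table, with a column recording the source
all_counts <- bind_rows(counts, .id = "source")

# Optional: write each result to its own CSV file (paths are an assumption)
invisible(lapply(names(counts), function(nm) {
  write.csv(counts[[nm]],
            file.path(tempdir(), paste0("B_", nm, ".csv")),
            row.names = FALSE)
}))
```

Each result stays reachable as `counts$CA` or `counts[["MA"]]`, so there is no need for `assign` to create `B_`-prefixed variables. As for the original errors: looping over a vector of names hands `count_()` a character string rather than a data frame, and looping over the data frames themselves makes `paste("B_", i)` produce a whole vector of strings, which `assign` cannot use as a single variable name.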

0 Answers