Removing rows of all data frames in global environment

Question

I have a few hundred .csv's (here is a sample of 3 of these files) I would like to:

Import into R
Trim to a common row length
Extract a specific column
Combine into one data frame with object name as column name.

In my case I can get the files in fine using:

temp = list.files(pattern="*.csv")
list2env(
lapply(setNames(temp, make.names(gsub("*.csv$", "", temp))), 
read.csv), envir = .GlobalEnv)}

but after trying a few different suggestions I can't see how to loop and trim the data frames to 25,000 observations each, and extract the 'data.Activity' column for each.

Ideally I would end up with an object 25,000 observations, and each of the 'data.Activity' variables from the supplied datasets named as their 'Clinstag_XX' ID.

Any help would be much appreciated.

This question is too vague to answer. However, here's a hint: you should keep them in a list rather than use `list2env`. This way, you can again use `lapply` on all of the dataframes to "trim" them to a common row length, etc. See gregor's answer [here](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames) for more motivating examples. — lmo, Nov 03 '17 at 12:23
@Imo Thank you, I scrapped the list2env and used lapply which worked great. — Jonathon P, Nov 04 '17 at 00:35

score 0 · Answer 1 · answered Nov 03 '17 at 12:24

Here's an example:

foo <- data.frame(a = 1:3, data.Activity = runif(3))
bar <- data.frame(data.Activity = runif(5), b = letters[1:5])
rows <- 2:3
do.call(cbind, lapply(ls(pattern="foo|bar"), function(dfname) {
  setNames(get(dfname)[rows, "data.Activity", drop=F], dfname)
}))
# bar        foo
# 2 0.4387158 0.08075924
# 3 0.3078052 0.92174396

score 0 · Accepted Answer · answered Nov 03 '17 at 12:38

You can always work with data tables and do each operation in a for loop.

Something like that.

temp = list.files(path = 'data', pattern = "*.csv")


datatable <- read.csv(file = paste0("data/",temp[1])) %>% as.data.table()
datatable <- datatable[1:2500, names(datatable)[names(datatable)%like%"Activity|Date"], with = F]


for(file in temp[2:length(temp)]){
temp1 <- read.csv(file = paste0("data/",file)) %>% as.data.table()
temp11 <- temp1[1:2500, names(temp1)[names(temp1)%like%"Activity|Date"], with = F]
datatable <- rbind(datatable, temp11, fill = T)
}

You can adjust the row number of the variables that you want to extract. Note: I have put the csv files in a folder called data.

Thanks! after some minor tweaking (like cbind instead or rbind) I have worked your suggestion. Learnt a lot from your example. Thanks again for your effort. — Jonathon P, Nov 04 '17 at 00:33

Removing rows of all data frames in global environment

2 Answers2