
I am trying to use loops to streamline some code so that I can analyze many sets of data within a folder without having to write new code for each set.

The goal of this code is to load in each .csv file from a folder into its own dataframe, named accordingly.

#Define the path to the folder containing the data sets
folder <- "Volumes/DataHD/Folder/"

#Make a list of the files within that folder
files <- list.files(path = folder)

#Define the desired names of each dataframe
names <- c("A1", "A2", "A3")

#Set working directory to that folder
setwd(folder)

#Use a for loop to load each .csv file into its own dataframe
i = 1

for(theFile in files){
    names[i] <- read.csv(theFile)
    i = i + 1 
}

However, rather than creating three dataframes with the names "A1", "A2", and "A3", this code just changes the contents of the "names" list so that each object in the list is one of my desired dataframes.

I realize now that these attempted workarounds were foolish, but I have also tried:

i = 1

for(theFile in files){
    toString(names[i]) <- read.csv(theFile)
    i = i + 1 
}

which gives the error 'could not find function "toString<-"'. And:

i = 1

for(theFile in files){
    c <- toString(names[i]) 
    c <- read.csv(theFile)
    i = i + 1 
}

which just changes c into a dataframe. Historically, I would just do something like:

"A1" <- read.csv("Volumes/DataHD/Folder/LongNameA1.csv")
"A2" <- read.csv("Volumes/DataHD/Folder/LongNameA2.csv")
"A3" <- read.csv("Volumes/DataHD/Folder/LongNameA3.csv")

But the actual scenario involves many sets of data, and having to constantly retype or copy and paste is exactly what I am trying to avoid. Is there any way to accomplish what I'm trying to do? Or should I take a totally different approach and try to tackle it with arrays of some kind?

Edit: Each desired dataframe has a different number of rows, just in case that affects your advice.

Sarah
  • I don't know r but it sounds like you need a collection of dataframes. An array or list would probably do. – Felix Mar 05 '20 at 00:13
  • you need `assign`, `assign(name[i], read.csv(theFile))`. – Ronak Shah Mar 05 '20 at 00:14
  • @RonakShah thanks so much! To get this to work I had to do assign(toString(name[i]), read.csv(theFile)), but the assign function is exactly what I needed! – Sarah Mar 05 '20 at 00:25
  • To simplify your code, [use a list of data frames](https://stackoverflow.com/a/24376207/903061). – Gregor Thomas Mar 06 '20 at 19:35

2 Answers


You may want to use the assign() function from base R, which creates a variable whose name is given as a string.

for(i in seq_along(files)){
  # names[i] is the variable name to create; read.csv(files[i]) is its value
  assign(names[i], read.csv(files[i]))
}
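
To try the assign() approach without touching real data, here is a minimal self-contained sketch. The temporary folder, the tiny CSV files, and the variable name df_names are assumptions made up for illustration; only the file name pattern "LongNameA1.csv" comes from the question.

```r
# Build a throwaway folder with three small CSV files (hypothetical data).
folder <- file.path(tempdir(), "assign_demo")
dir.create(folder, showWarnings = FALSE)
for (n in 1:3) {
  write.csv(data.frame(x = seq_len(n)),
            file.path(folder, paste0("LongNameA", n, ".csv")),
            row.names = FALSE)
}

# full.names = TRUE returns complete paths, so setwd() is not needed;
# pattern keeps only files whose names end in .csv.
files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)
df_names <- c("A1", "A2", "A3")

# Create one variable per file in the global environment.
for (i in seq_along(files)) {
  assign(df_names[i], read.csv(files[i]), envir = globalenv())
}

nrow(A3)
```

Using full paths and a pattern filter is a design choice here, not a requirement: it avoids changing the working directory and skips any non-CSV files that happen to sit in the folder.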

Alternatively, you can name the vector of files with setNames(), read every file with lapply(), and then use list2env() to copy the resulting named list into your environment. This eliminates the need for the for() loop and your i counter.

list2env(lapply(setNames(files, names), 
         read.csv), envir = .GlobalEnv)

If you find it more readable, you can also write this with the %>% pipe from the magrittr package.

library(magrittr)  # provides the %>% pipe
setNames(files, names) %>% lapply(read.csv) %>% list2env(envir = .GlobalEnv)
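
If typing the names by hand is itself a chore, they can probably be derived from the file names. The following is a rough sketch, assuming every file starts with the prefix "LongName" as in the question's example; the temporary folder, the sample data, and the names short_names and data_env are hypothetical. It also writes into a fresh environment rather than .GlobalEnv, which is only one possible choice.

```r
# Throwaway folder with example files (hypothetical data).
folder <- file.path(tempdir(), "list2env_demo")
dir.create(folder, showWarnings = FALSE)
for (n in 1:3) {
  write.csv(data.frame(x = seq_len(n)),
            file.path(folder, paste0("LongNameA", n, ".csv")),
            row.names = FALSE)
}

files <- list.files(folder, pattern = "\\.csv$", full.names = TRUE)

# Drop the directory and the extension, then the assumed "LongName" prefix.
short_names <- sub("^LongName", "", tools::file_path_sans_ext(basename(files)))

# Read into a named list, then copy each element into a new environment.
data_env <- list2env(lapply(setNames(files, short_names), read.csv),
                     envir = new.env())

ls(data_env)
```

Replacing new.env() with .GlobalEnv would create the data frames as ordinary top-level variables, as in the answer above.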
SEAnalyst
  • Thanks so much! To get this to work I had to do assign(toString(name[i]), read.csv(theFile)), but the assign function is exactly what I needed! – Sarah Mar 05 '20 at 00:27
  • I wonder what made toString() necessary in your case? The example 'names' that you provided worked fine on my end. – SEAnalyst Mar 05 '20 at 00:45

I think a cleaner way to accomplish this would be to use lapply() with read.csv() and keep the results in a single named list.

#Define the path to the folder containing the data sets
folder <- "Volumes/DataHD/Folder/"

#Make a list of the files within that folder
files <- list.files(path = folder)

#Define the desired names of each dataframe
names <- c("A1", "A2", "A3")

#Set working directory to that folder
setwd(folder)

# read all files
df_list <- lapply(files, read.csv)

# set the names of your list
names(df_list) <- names

You then have one global variable, df_list, that holds all of your data frames. This makes further operations on the data frames easier to implement, because the same function can be applied to every element of the list.
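
As a small illustration of what a named list makes easy, here is a sketch that uses made-up in-memory data frames in place of the files, with a different number of rows in each. The column name x and the helper names row_counts and combined are assumptions for the example only.

```r
# Stand-in for the list produced by lapply(files, read.csv); hypothetical
# data with a different number of rows in each data frame.
df_list <- list(
  A1 = data.frame(x = 1:2),
  A2 = data.frame(x = 1:5),
  A3 = data.frame(x = 1:3)
)

# Access a single data frame by name.
head(df_list[["A2"]])

# Apply the same summary to every data frame at once.
row_counts <- sapply(df_list, nrow)

# Stack everything into one data frame, recording which set each row came from.
combined <- do.call(rbind, Map(function(d, n) data.frame(set = n, d),
                               df_list, names(df_list)))
```

Differing row counts are not a problem for this approach, since each list element is an independent data frame.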

Mxblsdl