I have hundreds of *.csv files. I would like to compute some summary statistics for each one, and then record these statistics in a single data frame/csv file, with one row per input csv.
Let's say one of them is the following data frame from base R:
> mtcars
                   mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6 160.0 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6 160.0 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4 108.0  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8 360.0 175 3.15 3.440 17.02  0  0    3    2
....
I might want to record the mean of mpg, i.e. mean(mtcars$mpg), which is 20.09062.
Each row of the resulting data frame would look like this:
       mean_mpg max_mpg ...
mtcars    20.09    33.9 ...
df2      232       92.7 ...
I know how to glob all of the *.csv files in a certain path together and loop over them:
files = Sys.glob("*.csv")
for (file in files) {
    df = read.csv(file)
    mean_mpg = mean(df$mpg)
    ....
}
Now, I'm stuck. How do I write these values into a row for a giant summary csv?
(Sorry for the n00b question, but I'm a bit lost)
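For what it's worth, here is a minimal sketch of one way this could go, not necessarily the best one. The idea is to turn each csv into a one-row data frame and then stack the rows with rbind. The helper names summarise_csv and summarise_dir, the "source" column, the demo directory, and the output name summary.csv are all made up for illustration, and the sketch assumes every file has a numeric mpg column like mtcars does.

```r
# Hypothetical helper: reduce one CSV to a one-row data frame.
# Assumes every file has a numeric `mpg` column, as in the mtcars example.
summarise_csv <- function(file) {
  df <- read.csv(file)
  data.frame(
    source   = tools::file_path_sans_ext(basename(file)),  # file name without ".csv"
    mean_mpg = mean(df$mpg),
    max_mpg  = max(df$mpg)
  )
}

# Hypothetical helper: summarise every CSV in `dir` into one data frame.
summarise_dir <- function(dir = ".") {
  files <- Sys.glob(file.path(dir, "*.csv"))
  # Build one row per file, then stack them in a single rbind call.
  do.call(rbind, lapply(files, summarise_csv))
}

# Demo on a throwaway directory so the sketch runs as-is.
demo_dir <- file.path(tempdir(), "csv_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(mtcars, file.path(demo_dir, "mtcars.csv"))

summary_df <- summarise_dir(demo_dir)
print(summary_df)

# Write the summary outside the scanned directory so a later run
# does not pick it up as an input file.
write.csv(summary_df, file.path(tempdir(), "summary.csv"), row.names = FALSE)
```

Collecting the rows in a list and calling rbind once tends to be cheaper than growing a data frame inside the for loop, which copies the whole frame on every iteration. More statistics would just be more columns in the data.frame() call.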