I'm importing several .csv files that are each two columns wide (they're output from a program). The first column is wavelength and the second is absorbance, and I'm naming the second column after the file so the files can be combined later, as in this old Stack Overflow answer (Combining csv files in R to different columns). The incoming files don't have headers, and I'm aware that the way I'm naming the columns crops the first data point of each file.
I would like the first column to have no decimals and every value padded to four digits. The formatting code I've added works on its own but not inside this block, and I would prefer to do all of the formatting in one go. I get errors saying $ is not the right operator, and when I use [] I get errors about that too. The column I need to format is the first one, named 'Wavelength', and referring to it by name also gives errors, either because Wavelength doesn't exist or because it's non-numeric. Any ideas?
This is what my script currently looks like:
for (file in file_list) {
  f <- sub("(.*)\\.CSV", "\\1", file)
  assign(f, read.csv(file = file))
  assign(f, setNames(get(f), c(names(get(f))[0:0], "Wavelength")))
  assign(f, setNames(get(f), c(names(get(f))[1:1], file)))
  floor(f[Wavelength]) #the issues are here
  sprintf("%04d", f$Wavelength) #and here
}
The data looks like this in the csv before it gets processed:
1 401.7664 0.1379457
2 403.8058 0.1390427
3 405.8452 0.1421666
4 407.8847 0.1463629
5 409.9241 0.1477264
I would like the output to be:
Wavelength (file name)
1 0401 0.1379457
2 0403 0.1390427
3 0405 0.1421666
4 0407 0.1463629
5 0409 0.1477264
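To make the target concrete, here is a minimal sketch of the formatting step on its own, outside the loop. It assumes a standalone data frame built from the first few values above; the names `df` and `file1` are placeholders, not names from my real script.

```r
# Placeholder data frame using the first three rows shown above
df <- data.frame(
  Wavelength = c(401.7664, 403.8058, 405.8452),
  file1      = c(0.1379457, 0.1390427, 0.1421666)
)

# Drop the decimals, then zero-pad to four digits.
# Note that the result is a character column, not a numeric one.
df$Wavelength <- sprintf("%04d", as.integer(floor(df$Wavelength)))

print(df$Wavelength)  # "0401" "0403" "0405"
```

The assignment back into `df$Wavelength` matters here: calling `floor()` or `sprintf()` without storing the result leaves the data frame unchanged.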
And here's the dput that r2evans asked for:
structure(list(X3.997270e.002 = c(401.7664, 403.8058, 405.8452,
407.8847, 409.9241, 411.9635), X1.393858e.001 = c(0.1379457,
0.1390427, 0.1421666, 0.1463629, 0.1477264, 0.1476971)),
row.names = c(NA, 6L), class = "data.frame")
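The column names in that dput look like a data row that was read in as a header. Below is a minimal sketch of the difference `header = FALSE` makes, using inline text in place of one of my files. The first line of `csv_text` is my reconstruction from the mangled column names (I'm assuming the exponents were `e+002` and `e-001`), so treat it as an assumption rather than the real file contents.

```r
# Inline stand-in for one headerless file; first line is an assumed reconstruction
csv_text <- paste(
  "3.997270e+002,1.393858e-001",
  "4.017664e+002,1.379457e-001",
  "4.038058e+002,1.390427e-001",
  sep = "\n"
)

# Default header = TRUE: the first data row is consumed as column names
with_header <- read.csv(text = csv_text)
print(names(with_header))  # "X3.997270e.002" "X1.393858e.001"
print(nrow(with_header))   # 2

# header = FALSE: all three rows are kept and columns get default names
no_header <- read.csv(text = csv_text, header = FALSE)
print(names(no_header))    # "V1" "V2"
print(nrow(no_header))     # 3
```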
Thanks in advance!
6/24 Update: When I assign the column name "Wavelength", it seems to be added only as a character string, not as a real column name. When I dput/head the files after they go through the loop (omitting the sprintf/floor calls), only the file name (the second column) is listed. When I open the csvs in RStudio, though, the first column is properly labeled, and I'm even able to combine all the csvs, merged by "Wavelength":
list_csvs <- mget(sub("(.*)\\.CSV", "\\1", file_list))
all_csvs <- Reduce(function(x, y) merge(x, y, all = TRUE, by = "Wavelength"),
                   list_csvs, accumulate = FALSE)
Naturally I've thought about just formatting the column after this, but some of the wavelengths differ between files in the thousandths place, so I do need to format before I merge the csvs.
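Here is a minimal sketch of why the order matters. The two data frames are made up for illustration (`a`, `b`, `fileA`, `fileB`, the helper `fmt`, and the slightly shifted wavelengths in `b` are all hypothetical), but they mimic two files whose wavelengths disagree in the later decimals.

```r
# Hypothetical data: same nominal wavelengths, tiny differences in the decimals
a <- data.frame(Wavelength = c(401.7664, 403.8058), fileA = c(0.1379457, 0.1390427))
b <- data.frame(Wavelength = c(401.7659, 403.8061), fileB = c(0.2, 0.3))

# Merging on the raw values: nothing matches exactly, so every row stays separate
raw_merge <- merge(a, b, by = "Wavelength", all = TRUE)
print(nrow(raw_merge))  # 4

# Hypothetical helper: floor and zero-pad the key column before merging
fmt <- function(d) {
  d$Wavelength <- sprintf("%04d", as.integer(floor(d$Wavelength)))
  d
}

# Merging on the formatted key: rows line up by wavelength
fmt_merge <- Reduce(function(x, y) merge(x, y, by = "Wavelength", all = TRUE),
                    list(fmt(a), fmt(b)))
print(nrow(fmt_merge))  # 2
```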
I've updated the code to use colnames outside of the read.csv:
for (file in file_list) {
  f <- sub("(.*)\\.CSV", "\\1", file)
  assign(f, read.csv(file = file,
                     header = FALSE,
                     row.names = NULL))
  colnames(f) <- c("Wavelength", file)
  print(summary(f))
  print(names(f))
  #floor("Wavelength") #I'm omitting this to see the console errors
  #sprintf("%04.0f", f["Wavelength"]) #omitting this too
}
but I get the following error:
attempt to set 'colnames' on an object with less than two dimensions
Without the naming step and without sprintf/floor, I get this back from the summary and names calls for each file:
Length Class Mode
1 character character
NULL
When I try to access the first column with f[1], f[[1]], f[,1], or f[[,1]], I get error messages about 'incorrect number of dimensions'. I can clearly see in the R environment that each data frame has a length of 2. I also double-checked with .row_names_info(f) that the first column isn't being read as row names. What am I doing wrong?
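For what it's worth, here is a minimal sketch that reproduces the same messages outside the loop. It assumes that `f` only ever holds the name string while the data frame itself lives under that name; I haven't confirmed that this is what happens in my script. The name "sample1" and the values are placeholders.

```r
f <- "sample1"  # hypothetical name derived from a file name
assign(f, data.frame(V1 = c(401.7664, 403.8058), V2 = c(0.1379457, 0.1390427)))

print(class(f))       # "character": f is just the name
print(class(get(f)))  # "data.frame": the data lives under that name

# Setting colnames on the string fails
colnames_msg <- tryCatch({
  colnames(f) <- c("Wavelength", "sample1.CSV")
  "no error"
}, error = function(e) conditionMessage(e))
print(colnames_msg)

# Two-dimensional indexing on the string fails as well
index_msg <- tryCatch(f[, 1], error = function(e) conditionMessage(e))
print(index_msg)
```

If that assumption holds, the summary output of length 1 and class character would be describing the name string rather than the data frame.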