
My question is twofold. First, given these three data frames:

df1 <- data.frame(k1 = runif(6, min=0, max=100), 
             k2 = runif(6, min=0, max=100), 
             k3 = runif(6, min=0, max=100), 
             k4 = runif(6, min=0, max=100))
df2 <- data.frame(k1 = runif(6, min=0, max=100), 
              k2 = runif(6, min=0, max=100), 
              k3 = runif(6, min=0, max=100), 
              k4 = runif(6, min=0, max=100))
df3 <- data.frame(k1 = runif(6, min=0, max=100), 
              k2 = runif(6, min=0, max=100), 
              k3 = runif(6, min=0, max=100), 
              k4 = runif(6, min=0, max=100))

I would like to reformat and rename part of each data frame using this function:

samplelist<-c("k2", "k4")

draft_fxn<-function(x, obj_name){
  x.selected<-x[,c(samplelist)] #select columns of choice
  colnames(x.selected)[1:2]<-paste(obj_name, colnames(x.selected), sep="_") #rename columns so they include original data frame name
  return(x.selected)
}

#Example run and output:
df2_final<-draft_fxn(df2, "df2")
#output from:
head(df2_final[1:2],)
>     df2_k2   df2_k4
>1  5.240274 53.03423
>2  5.042926 34.78974

First question: How can I change my function so I don't have to type in both df2 and "df2"? In draft_fxn, I want to replace obj_name with the name of the input data frame, whatever it is. In my example it is "df2".

Second question: How can I loop through all of my data frames? Perhaps with something similar to this for loop?

objs<-c(df1, df2, df3)

for (file in objs){
  out<-draft_fxn(file); return(out)
} #this doesn't work though.
shu251
    You would probably be better off [keeping a list of data.frames](http://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames) from the beginning rather than having a bunch of loosely related variables lying around with data in the variable name itself. – MrFlick May 10 '17 at 05:07

2 Answers


To answer your first question: you can obtain the name of an object x using deparse(substitute(x)). So to eliminate the argument obj_name from your function, you could use

draft_fxn <- function(x){
    obj_name <- deparse(substitute(x))
    x.selected<-x[,c(samplelist)]
    colnames(x.selected)[1:2]<-paste(obj_name, colnames(x.selected), sep="_") #rename columns so they include original data frame name
    return(x.selected)
}  
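
As a minimal sketch of what deparse(substitute(x)) does on its own: it returns the expression that the caller wrote for the argument, as a string. The helper name arg_name and the small data frame below are hypothetical and only serve the illustration.

```r
# Hypothetical helper, for illustration only: returns the expression
# that was supplied for `x`, as a character string.
arg_name <- function(x) deparse(substitute(x))

# Small stand-in data frame (values are arbitrary).
df2 <- data.frame(k1 = 1:3, k2 = 4:6)

arg_name(df2)        # "df2"
arg_name(head(df2))  # "head(df2)"
```

Note that the string is whatever expression was passed, so it is only a clean object name when the function is called directly on a named variable.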

As to your second question: to perform such an operation on multiple data frames, you would usually put them in a list and then lapply the function over it. In this case, however, that does not work, because the object name is lost once the data frames are in a list, i.e. deparse(substitute(x)) returns X[[i]] instead of the name of the individual data frame. To do it in a loop, I would suggest a different approach where you pass a vector of the names of the data frames:

## Names of the relevant data frames:
objNames <- c("df1", "df2", "df3")
## Function to rename the specified columns:
renameFun <- function(xString){
    x <- get(xString)[,c(samplelist)]
    colnames(x) <- paste(xString, samplelist, sep = "_")
    x   
}

## Apply function to all data frames specified by objNames:
lapply(objNames, renameFun) 
# [[1]]
#      df1_k2    df1_k4
# 1 54.232123  2.178375
# 2 16.816784 23.586760
# 3  6.612874 16.509340
# 4 92.399588 71.133637
# 5 22.917838  8.127079
# 6 43.563411 21.118758
# 
# ...
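
If the data frames are kept in a named list from the start, as the comment on the question suggests, the names can be passed along explicitly instead of being looked up with get(). The following is only a sketch under that assumption; the helper rename_cols and the list dfs are hypothetical names, not part of the original code.

```r
set.seed(1)  # only to make the illustration reproducible
df1 <- data.frame(k1 = runif(6), k2 = runif(6), k3 = runif(6), k4 = runif(6))
df2 <- data.frame(k1 = runif(6), k2 = runif(6), k3 = runif(6), k4 = runif(6))
df3 <- data.frame(k1 = runif(6), k2 = runif(6), k3 = runif(6), k4 = runif(6))
samplelist <- c("k2", "k4")

# Keep the data frames in a named list; the list names carry the labels.
dfs <- list(df1 = df1, df2 = df2, df3 = df3)

# Hypothetical helper: select columns and prefix them with the given name.
rename_cols <- function(x, nm, cols = samplelist) {
  out <- x[, cols, drop = FALSE]
  colnames(out) <- paste(nm, cols, sep = "_")
  out
}

# Map() walks over the data frames and their names in parallel.
renamed <- Map(rename_cols, dfs, names(dfs))
names(renamed[["df2"]])  # "df2_k2" "df2_k4"
```

The result is again a named list, so each renamed data frame can be retrieved with renamed$df1, renamed$df2, and so on.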
ikop

Your function is not well specified because samplelist is defined outside the function and then used inside it. If samplelist is not defined when the function is called, the function throws an error, i.e. it is not self-contained.

Here's an alternative:

draft_fxn<-function(x, cols){
  x.selected<-data.frame(x[, cols]) #select columns of choice
  colnames(x.selected)<-paste(deparse(substitute(x)), colnames(x.selected), sep="_") #rename columns so they include original data frame name
  return(x.selected)
}

Note that the cols argument can vary: it can be a vector of column names or of column positions (positive and no larger than the number of columns in your data frame).

This returns:

> df2_final<- draft_fxn(df2, cols = c("k2", "k4"))
> head(df2_final)[1:2,]
    df2_k2    df2_k4
1 21.62533  2.256182
2 64.83556 67.705705
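
For illustration, here is a sketch of the same idea called with column positions and with a single column. The function name draft_fxn_idx is hypothetical, and drop = FALSE is an addition not present in the code above: it keeps a single selected column as a data frame so that its original name survives.

```r
# Illustrative variant; drop = FALSE is an assumption added here so that
# selecting one column still yields a data frame with its column name.
draft_fxn_idx <- function(x, cols) {
  obj_name <- deparse(substitute(x))     # name of the object passed as x
  x.selected <- x[, cols, drop = FALSE]  # select columns by name or position
  colnames(x.selected) <- paste(obj_name, colnames(x.selected), sep = "_")
  x.selected
}

df2 <- data.frame(k1 = runif(6), k2 = runif(6), k3 = runif(6), k4 = runif(6))

colnames(draft_fxn_idx(df2, cols = c(2, 4)))  # "df2_k2" "df2_k4"
colnames(draft_fxn_idx(df2, cols = "k3"))     # "df2_k3"
```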
Yannis Vassiliadis