
I have 104 trained linear models built using the caret package. They took a long time to run. The models are named lmfit_r1c1, lmfit_r2c1, lmfit_r3c1, ..., lmfit_r8c1, lmfit_r1c2, lmfit_r2c2, ..., lmfit_r8c13.

The class of the model is train

> class(lmfit_r1c1)
[1] "train"

When I print a model (say lmfit_r1c1), I get the following:

> lmfit_r1c1

Linear Regression 

30776 samples
  208 predictor

Pre-processing: centered (208), scaled (208) 
Resampling: Cross-Validated (5 fold) 
Summary of sample sizes: 24621, 24622, 24621, 24619, 24621 
Resampling results:

  RMSE       Rsquared   MAE       
  0.2327991  0.8447337  0.05227046

Tuning parameter 'intercept' was held constant at a value of TRUE

I can extract RMSE and Rsquared values by using

> lmfit_r1c1$results[c("RMSE","Rsquared")]
       RMSE  Rsquared
1 0.2327991 0.8447337

I want to do this for all the models by running a for loop 104 times.

I want to assign each trained model to a temporary variable, extract its RMSE and Rsquared values, and store them in a new data frame with two columns, one for RMSE and one for R-squared. I should end up with a data frame of dimension 104 x 2.

I have a vector v that contains

> v
  [1] "r1c1"  "r2c1"  "r3c1"  "r4c1"  "r5c1"  "r6c1"  "r7c1"  "r8c1"  "r1c2"  "r2c2"  "r3c2" 
 [12] "r4c2"  "r5c2"  "r6c2"  "r7c2"  "r8c2"  "r1c3"  "r2c3"  "r3c3"  "r4c3"  "r5c3"  "r6c3" 
 [23] "r7c3"  "r8c3"  "r1c4"  "r2c4"  "r3c4"  "r4c4"  "r5c4"  "r6c4"  "r7c4"  "r8c4"  "r1c5" 
 [34] "r2c5"  "r3c5"  "r4c5"  "r5c5"  "r6c5"  "r7c5"  "r8c5"  "r1c6"  "r2c6"  "r3c6"  "r4c6" 
 [45] "r5c6"  "r6c6"  "r7c6"  "r8c6"  "r1c7"  "r2c7"  "r3c7"  "r4c7"  "r5c7"  "r6c7"  "r7c7" 
 [56] "r8c7"  "r1c8"  "r2c8"  "r3c8"  "r4c8"  "r5c8"  "r6c8"  "r7c8"  "r8c8"  "r1c9"  "r2c9" 
 [67] "r3c9"  "r4c9"  "r5c9"  "r6c9"  "r7c9"  "r8c9"  "r1c10" "r2c10" "r3c10" "r4c10" "r5c10"
 [78] "r6c10" "r7c10" "r8c10" "r1c11" "r2c11" "r3c11" "r4c11" "r5c11" "r6c11" "r7c11" "r8c11"
 [89] "r1c12" "r2c12" "r3c12" "r4c12" "r5c12" "r6c12" "r7c12" "r8c12" "r1c13" "r2c13" "r3c13"
[100] "r4c13" "r5c13" "r6c13" "r7c13" "r8c13"

I tried something like this:

new_df <- data.frame(matrix(nrow = 104,ncol=2))
colnames(new_df)<-c("RMSE","R-squared")
for (i in 1:104) {
        assign(fit,paste0("lmfit_",v[i]))
        new_df[i,] <- fit$results[c("RMSE","Rsquared")]
}

Obviously this didn't work, because paste0 returns a character string rather than the model object.

Any help would be appreciated.

Mohamad Sahil
  • You should be using a `list`. Basically everything in [this link about data frames in lists applies to your use case for models.](https://stackoverflow.com/a/24376207/903061) – Gregor Thomas Dec 07 '17 at 02:00
  • That said, you need `get` not `assign` to make your current code work - and you don't need a temporary variable, `new_df[i,] – Gregor Thomas Dec 07 '17 at 02:03
  • Thanks a lot. `get` worked. I know I should have used a list, but I learned that a little too late. It takes a lot of time to run these models because the data is so large. You've been very helpful. Thanks again! – Mohamad Sahil Dec 07 '17 at 02:11
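
For reference, the `get` suggestion from the comments can be sketched roughly as follows. This is only an illustration under stated assumptions: the `lmfit_*` objects here are small stand-ins that mimic the `$results` data frame of a caret `train` object (with made-up random values), and `v` is shortened to three names. The second half shows one possible way to follow the "use a list" advice with `mget`, which is a base R function; the names `fits` and `new_df2` are introduced here for illustration only.

```r
set.seed(1)

# Stand-in objects (assumption): each mimics only the `$results` data frame
# of a caret `train` object. The values are random, not real model output.
v <- c("r1c1", "r2c1", "r3c1")
for (name in v) {
  assign(paste0("lmfit_", name),
         list(results = data.frame(intercept = TRUE,
                                   RMSE = runif(1),
                                   Rsquared = runif(1),
                                   MAE = runif(1))))
}

# Option 1: the original loop, with get() looking up each object by its name.
new_df <- data.frame(matrix(nrow = length(v), ncol = 2))
colnames(new_df) <- c("RMSE", "Rsquared")
for (i in seq_along(v)) {
  fit <- get(paste0("lmfit_", v[i]))
  new_df[i, ] <- fit$results[c("RMSE", "Rsquared")]
}

# Option 2: collect the existing objects into a named list once with mget(),
# then pull the two columns from each element and stack the rows.
fits <- mget(paste0("lmfit_", v))
new_df2 <- do.call(
  rbind,
  lapply(fits, function(fit) fit$results[c("RMSE", "Rsquared")])
)
```

With the real models, `v` would be the full 104-element vector from the question, and no retraining is needed because `mget` only gathers objects that already exist in the workspace.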
