I create random data in R like this:
data<-matrix(rnorm(100*5,mean=0,sd=1), 100, 5)
colnames(data) <- c("X1", "X2", "X3", "X4", "X5")
data <- as.data.frame(data)
a <- 5
b <- 0.8
c <- 100
Then I want to "play" with the correlations in the data, doing something like the following:
data[,2] <- a*data[,1] - b*rnorm(c)
data[,3] <- a*data[,1] + b*rnorm(c)
data[,4] <- a*data[,1] - b*rnorm(c)
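As a rough sanity check on this construction (a minimal sketch only: the seed, the variable `n` in place of `c`, and the `paste0` column names are assumptions made for illustration), `cor()` shows the effect. Since X2, X3 and X4 are noisy multiples of X1, their correlation with X1 should be close to a / sqrt(a^2 + b^2), while X5 stays roughly uncorrelated.

```r
set.seed(42)  # assumed seed, only to make this check reproducible

n <- 100
a <- 5
b <- 0.8

data <- as.data.frame(matrix(rnorm(n * 5), n, 5))
colnames(data) <- paste0("X", 1:5)

# Same construction as above: X2..X4 are noisy multiples of X1
data[, 2] <- a * data[, 1] - b * rnorm(n)
data[, 3] <- a * data[, 1] + b * rnorm(n)
data[, 4] <- a * data[, 1] - b * rnorm(n)

# Sample correlation matrix: X1..X4 strongly correlated, X5 not
round(cor(data), 2)

# Correlation between X1 and X2 implied by the construction
a / sqrt(a^2 + b^2)
```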
After that I run the following code (the full script, including the setup above):
data<-matrix(rnorm(100*5,mean=0,sd=1), 100, 5)
colnames(data) <- c("X1", "X2", "X3", "X4", "X5")
data <- as.data.frame(data)
a <- 5
b <- 0.8
c <- 100
data[,2] <- a*data[,1] - b*rnorm(c)
data[,3] <- a*data[,1] + b*rnorm(c)
data[,4] <- a*data[,1] - b*rnorm(c)
library(glmnet)
library(coefplot)
A <- as.matrix(data)
set.seed(1)
results <- lapply(seq_len(ncol(A)), function(i) {
list(
cvfit = cv.glmnet(A[, -i] , A[, i] , standardize = TRUE , type.measure = "mse" , nfolds = 10 , alpha = 1)
)
})
lam <- as.data.frame(`names<-`(
lapply(results, function(x) (x$cvfit$lambda.min)),
paste0("X", seq_along(results))
))
sigma<- matrix(rnorm(1*5,mean=0,sd=1), 1, 5)
colnames(sigma) <- c("X1", "X2", "X3", "X4", "X5")
as.vector(sigma)
sub1.sigma <- subset(sigma, select = sigma <= sum(lam))
sub2.sigma <- subset(sigma, select = sigma <= 2*sum(lam))
sub3.sigma <- subset(sigma, select = sigma <= 3*sum(lam))
This results in a 1x5 vector called sigma and three vectors sub1.sigma, sub2.sigma and sub3.sigma, like the following:
> sigma
X1 X2 X3 X4 X5
38.64019 624.4896 0 0 0
> sub1.sigma
X1 X3 X4 X5
1 38.64019 0 0 0
> sub2.sigma
X1 X3 X4 X5
1 38.64019 0 0 0
> sub3.sigma
X1 X3 X4 X5
1 38.64019 0 0 0
The generated data are random, and I usually use set.seed() to produce the same results. I want, if possible without modifying the main code, to run my code 100 times (with different data each time) and save the corresponding results sigma, sub1.sigma, sub2.sigma and sub3.sigma in 4 data frames in order to compare them. Is there any way to achieve that in R?
Based on the comments I managed to create the following, but it still doesn't seem to give the desired results. First of all, code[1:10] displays 10 vectors, but what do they represent? Are they the sigma of each run? And how can I make it compute the sub.sigma vectors as well?
set.seed(2021)
code <- replicate(10,{
data<-matrix(rnorm(100*5,mean=0,sd=1), 100, 5)
colnames(data) <- c("X1", "X2", "X3", "X4", "X5")
data <- as.data.frame(data)
a <- 5
b <- 0.8
c <- 100
data[,2] <- a*data[,1] - b*rnorm(c)
data[,3] <- a*data[,1] + b*rnorm(c)
data[,4] <- a*data[,1] - b*rnorm(c)
library(glmnet)
library(coefplot)
A <- as.matrix(data)
set.seed(1)
results <- lapply(seq_len(ncol(A)), function(i) {
list(
cvfit = cv.glmnet(A[, -i] , A[, i] , standardize = TRUE , type.measure = "mse" , nfolds = 10 , alpha = 1)
)
})
lam <- as.data.frame(`names<-`(
lapply(results, function(x) (x$cvfit$lambda.min)),
paste0("X", seq_along(results))
))
sigma<- matrix(rnorm(1*5,mean=0,sd=1), 1, 5)
colnames(sigma) <- c("X1", "X2", "X3", "X4", "X5")
as.vector(sigma)
sub1.sigma <- subset(sigma, select = sigma <= sum(lam))
sub2.sigma <- subset(sigma, select = sigma <= 2*sum(lam))
sub3.sigma <- subset(sigma, select = sigma <= 3*sum(lam))
}, simplify = FALSE)
code[1:10]
sigmas <- as.data.frame(do.call(rbind,lapply(code, sigma)))
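For reference, here is one possible way to restructure the attempt above; it is a sketch under stated assumptions, not a definitive solution. In the attempt, `replicate()` keeps only the value of the last expression in the braces, which is the assignment to sub3.sigma, so each element of `code` is that run's sub3.sigma and not sigma. In addition, the `set.seed(1)` inside the braces resets the random number generator on every run, so the later draws no longer differ between runs. The sketch below wraps one run in a hypothetical helper `run_once()` that returns all four objects in a list and drops the inner `set.seed(1)`. It also swaps `subset()` for NA masking: entries above the threshold become NA instead of being dropped, so every run has the same five columns and the rows can be bound together. The names `run_once`, `keep`, `collect` and `runs` are made up for illustration.

```r
library(glmnet)

# Hypothetical helper: one full run, returning everything worth keeping
run_once <- function(n = 100, a = 5, b = 0.8) {
  data <- matrix(rnorm(n * 5), n, 5)
  colnames(data) <- paste0("X", 1:5)
  data[, 2] <- a * data[, 1] - b * rnorm(n)
  data[, 3] <- a * data[, 1] + b * rnorm(n)
  data[, 4] <- a * data[, 1] - b * rnorm(n)

  # lambda.min from a cross-validated lasso of each column on the others
  lam <- vapply(seq_len(ncol(data)), function(i) {
    cv.glmnet(data[, -i], data[, i], standardize = TRUE,
              type.measure = "mse", nfolds = 10, alpha = 1)$lambda.min
  }, numeric(1))

  sigma <- matrix(rnorm(5), 1, 5, dimnames = list(NULL, colnames(data)))

  # Assumption: mask entries above k * sum(lam) with NA (instead of
  # dropping columns) so that all runs keep the same shape
  keep <- function(k) {
    s <- sigma
    s[s > k * sum(lam)] <- NA
    s
  }

  list(sigma = sigma, sub1.sigma = keep(1),
       sub2.sigma = keep(2), sub3.sigma = keep(3))
}

set.seed(2021)  # one seed for the whole experiment, none inside run_once()
runs <- replicate(10, run_once(), simplify = FALSE)

# One data frame per quantity, one row per run
collect <- function(name) {
  as.data.frame(do.call(rbind, lapply(runs, `[[`, name)))
}
sigmas      <- collect("sigma")
sub1.sigmas <- collect("sub1.sigma")
sub2.sigmas <- collect("sub2.sigma")
sub3.sigmas <- collect("sub3.sigma")
```

Changing `replicate(10, ...)` to `replicate(100, ...)` gives the 100 runs asked for. If dropping columns as `subset()` does is really required, the per-run results would have to stay in a list, because their widths can differ from run to run.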