I have a dataset named data in R with n variables, namely X1, X2, ..., Xn. At some point in my code I obtain a subset named subs from another procedure, which gives the following output:

        X1       X7       X8       X9      X11     
1    440.8065 466.9053 60.03588 374.8059 167.2424  

Note that the Xi values in subs are not the same as the values in data, and each Xi is a column header, not a row. For each Xi there is a data frame saved in the environment named VarXi (e.g. VarX1, VarX2, ..., VarXn) of this form:

      variable   coefficient
1  (Intercept) -2.111150e+03
2           X3  2.797371e-05
3           X5  5.653977e-01
4           X6  5.660470e+00
5           X7  1.003460e+01
6           X8  2.403519e+01
7          X10  3.931899e-01
8          X12  2.062661e+00
9          X13  5.430814e+00
10         X14  2.433546e-01

First, I want to create a new dataset newdata that contains only the variables displayed in subs. Second (and most importantly), for only those VarXi displayed in subs, I want to print the coefficient with the largest absolute value together with the corresponding variable from the first column, e.g. based on the example above:

  variable   coefficient
1       X8     24.03519

How can I do these things in R?

nickolakis
  • *"there are saved in the environment dataframes named `sVarXi` (e.g. `sVarX1`, `sVarX2`,..., `sVarXn`) of this form"* This is pretty terrible. Your life will be easier if you use a list of data frames instead. [See my answer here](https://stackoverflow.com/questions/17499013/how-do-i-make-a-list-of-data-frames/24376207#24376207) for details and examples. – Gregor Thomas Oct 08 '20 at 16:59
  • Thanks for your contribution. I had them in a list, but in the end I wanted some of them separately from the others. – nickolakis Oct 08 '20 at 17:02
  • unlikely to be useful to anyone in the future – IRTFM Oct 09 '20 at 01:55

1 Answer

Without a reproducible example it's hard to tell, but this is my best guess for you:

# get a list of the "sVarXi" data frames:
svar_list = mget(ls(pattern = "sVarX[0-9]+"))

# narrow down to just the ones in subs
svar_list_in_subs = svar_list[paste0("sVar", names(subs))]

# extract maximum row
max_coeffs = lapply(svar_list_in_subs, function(x) {
  x = x[x$variable != "(Intercept)", ]
  x[which.max(abs(x$coefficient)), ]
})

I'm saving svar_list_in_subs separately here just so the steps are clear - but it is duplicative and unnecessary. Once you have the sVar list, you could jump straight to:

max_coeffs_in_subs = lapply(svar_list[paste0("sVar", names(subs))], function(x) {
  x = x[x$variable != "(Intercept)", ]
  x[which.max(abs(x$coefficient)), ]
})
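
Since the question has no reproducible example, here is a minimal, self-contained sketch of both parts on toy inputs. Everything in it is an assumption for illustration: the values in `data` and `subs`, the three `sVarXi` data frames, and the `newdata` and `result` objects. It also assumes the data frames are named exactly "sVar" plus a column name of `subs`.

```r
# Toy inputs (made-up values, only for illustration)
data <- data.frame(X1 = 1:3, X2 = 4:6, X7 = 7:9)
subs <- data.frame(X1 = 440.8065, X7 = 466.9053)

sVarX1 <- data.frame(
  variable    = c("(Intercept)", "X6", "X7", "X8"),
  coefficient = c(-20.5, 5.6, 10.0, 24.0)
)
sVarX2 <- data.frame(
  variable    = c("(Intercept)", "X3"),
  coefficient = c(1.5, -0.2)
)
sVarX7 <- data.frame(
  variable    = c("(Intercept)", "X3", "X5"),
  coefficient = c(3.2, -0.75, 0.5)
)

# Part 1: keep only the columns of data whose names appear in subs
newdata <- data[, names(subs), drop = FALSE]

# Part 2: gather the sVarXi objects into a named list,
# then keep only the ones that correspond to columns of subs
svar_list <- mget(ls(pattern = "^sVarX[0-9]+$"))
svar_list_in_subs <- svar_list[paste0("sVar", names(subs))]

# For each data frame, drop the intercept and keep the row
# whose coefficient has the largest absolute value
max_coeffs <- lapply(svar_list_in_subs, function(x) {
  x <- x[x$variable != "(Intercept)", ]
  x[which.max(abs(x$coefficient)), ]
})

# Optionally stack the one-row results into a single data frame
result <- do.call(rbind, max_coeffs)
print(result)
```

The `^` and `$` anchors in the pattern are a small tightening so that objects such as `svar_list` itself are never picked up by `ls()`. The `drop = FALSE` keeps `newdata` a data frame even if `subs` has a single column.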
Gregor Thomas
  • It displayed the following `Error in abs(x$coefficient) : non-numeric argument to mathematical function` – nickolakis Oct 08 '20 at 17:20
  • That could be for a few reasons - you would get that error if you tried to take the absolute value of a factor column or a non-existent column. Perhaps one of your `coefficient` columns isn't numeric? Or perhaps the `mget` picked up data frames that don't have the `coefficient` column? Or perhaps `names(subs)` includes the name of a data frame that doesn't exist or didn't make it into `svar_list`? Hmmm, maybe we need to paste "sVar" on to the names of `subs`? I'll make that edit, but if you need more debugging help I can't do much without a reproducible example. – Gregor Thomas Oct 08 '20 at 17:34
  • You should also be able to check the intermediate results and make sure they look right. Like, if `svar_list` has stuff, but `svar_list_in_subs` doesn't, then that's where the problem is. – Gregor Thomas Oct 08 '20 at 17:35
  • I think I have located what the problem might be: in all cases `max_coeffs` returns the `(Intercept)` as the result. How can I make it ignore the intercept in the calculations? – nickolakis Oct 08 '20 at 17:45
  • Sorry, I cannot run it because of too many syntax errors. – nickolakis Oct 08 '20 at 17:59
  • Sorry, can not debug because no reproducible example to test on. – Gregor Thomas Oct 08 '20 at 18:02
  • Okay, I found missing `]` and put it in, but seriously it's hard to debug code that I can't run because there is no input. If you need more help, share a reproducible example. – Gregor Thomas Oct 08 '20 at 18:04
  • We might be getting closer to the solution. When I run `svar_list`, everything is good. When I run `svar_list_in_subs`, instead of having a list of 6 `svarX`, I have six empty lists. Is there any other command instead of `paste0` to make it display lists for only the variables displayed in `subs`? – nickolakis Oct 08 '20 at 18:24
  • What is `names(subs)` and what is `names(svar_list)`? Adding `dput(subs)` to your question would help a lot. – Gregor Thomas Oct 08 '20 at 18:32
  • Let us [continue this discussion in chat](https://chat.stackoverflow.com/rooms/222735/discussion-between-nickolakis-and-gregor-thomas). – nickolakis Oct 08 '20 at 18:39
  • So, the problem is that `ssVarX` doesn't have names. If you created it with `mget` like I showed, it should have the names of the objects that went into it, and that we can build out of the column names of `subs`. But it doesn't have any names. If you created it manually as `ssVarX = list(ssVarX1, ssVarX2, ...)` then go back and give it names, `ssVarX = list(ssVarX1 = ssVarX1, ssVarX2 = ssVarX2, ...)`, or use `mget` or some other method so that it has names. As long as you can build the names of `ssVarX` out of the column names of `subs`, the rest should work. – Gregor Thomas Oct 08 '20 at 18:53
  • I created it with the one you showed first, but it didn't include all variables, so I tried to use the `ssVarX` list that already existed in my code. – nickolakis Oct 08 '20 at 18:55
  • You can play around with the `ls(pattern = "svarX[0-9]+")` to get the list of objects to work. Maybe you need the capital V `pattern = "sVarX[0-9]+"`. Or maybe you need "ss" instead of "s"... – Gregor Thomas Oct 08 '20 at 18:57
  • I managed to make it work earlier by writing `svar_list = mget(ls(pattern = "ssVarX[0-9]+"))`, which gave the wanted results, but `svar_list_in_subs` was giving me an empty `list of 6`. Maybe `subs` doesn't have names either? Can that be the problem? – nickolakis Oct 08 '20 at 19:03
  • I made it work, thanks for all your help, it was a syntax error made by me. – nickolakis Oct 08 '20 at 19:22
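
The failure mode discussed in the comments above (indexing by name into a list that has no names) can be reproduced with a small sketch. The objects `sVarX1` and `sVarX7`, their values, and the helper names `wanted`, `missing_names`, and `numeric_ok` are hypothetical and only serve the illustration.

```r
# Toy inputs (made-up values)
sVarX1 <- data.frame(variable = c("(Intercept)", "X6"),
                     coefficient = c(-2.1, 5.66))
sVarX7 <- data.frame(variable = c("(Intercept)", "X3"),
                     coefficient = c(0.4, -1.2))
subs <- data.frame(X1 = 440.8065, X7 = 466.9053)

# The names we expect to find in the list
wanted <- paste0("sVar", names(subs))

# A list built without names cannot be indexed by name:
# every requested element comes back as NULL
unnamed <- list(sVarX1, sVarX7)
picked_unnamed <- unnamed[wanted]

# A list built with names (or with mget) works as intended
named <- list(sVarX1 = sVarX1, sVarX7 = sVarX7)
picked_named <- named[wanted]

# Two quick checks before calling lapply():
# 1. are any of the wanted names absent from the list?
missing_names <- setdiff(wanted, names(named))
# 2. is every coefficient column actually numeric?
numeric_ok <- vapply(picked_named,
                     function(x) is.numeric(x$coefficient),
                     logical(1))
```

If `missing_names` is non-empty or any entry of `numeric_ok` is `FALSE`, that points at the step to fix before computing the maxima.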