A data.table example:
library(data.table) # library() stops with an error if the package is missing; require() only warns
variable_importance <- data.frame(Overall=c(87.30483,88.59212,34.16171,35.72880,50.62831,44.76673,31.12285,43.04628,33.01750,30.72718), row.names=paste0("x.",1:10))
variable_importance # show data.frame
dt <- as.data.table(variable_importance, keep.rownames=TRUE) # new data.table, by value (copy)
#dt <- setDT(variable_importance, keep.rownames=TRUE) # by reference (variable_importance itself becomes the data.table, too)
setorder(dt, -Overall) # sort the data.table by column Overall, descending
setnames(dt, "rn", "") # blank out the column name "rn"
dt # show data.table
setDT converts variable_importance to a data.table by reference, that is, without making a copy, which is much faster on huge data sets.
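To make the copy versus by-reference distinction concrete, here is a minimal sketch. The two small data frames (df_copy, df_ref) are hypothetical toy data, not the variable_importance object from the example.

```r
library(data.table)

# Hypothetical toy data (two identical data.frames so each conversion can be shown separately)
df_copy <- data.frame(Overall = c(3, 1, 2), row.names = c("x.1", "x.2", "x.3"))
df_ref  <- data.frame(Overall = c(3, 1, 2), row.names = c("x.1", "x.2", "x.3"))

# as.data.table() returns a new object; the input is left untouched
dt_copy <- as.data.table(df_copy, keep.rownames = TRUE)
is.data.table(df_copy)  # the original is still a plain data.frame
is.data.table(dt_copy)

# setDT() converts its argument in place; no second object is needed
setDT(df_ref, keep.rownames = TRUE)
is.data.table(df_ref)   # df_ref itself is now a data.table
names(df_ref)
```

On small inputs the difference is negligible; the by-reference variant mainly pays off when the data is large enough that a full copy is costly.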
When you convert the data.frame to a data.table, you have to specify keep.rownames=TRUE to get a new column called rn that holds the original row names, because data.table does not keep row names and simply numbers the rows automatically.
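A short sketch of what keep.rownames does, using a hypothetical two-row data frame. The last call assumes a reasonably recent data.table version, where keep.rownames also accepts a character string to use as the column name; the name "variable" is only an illustration.

```r
library(data.table)

# Hypothetical toy data
df <- data.frame(Overall = c(10, 20), row.names = c("x.1", "x.2"))

dt_default <- as.data.table(df)                              # row names are dropped
dt_rn      <- as.data.table(df, keep.rownames = TRUE)        # row names kept in column "rn"
dt_named   <- as.data.table(df, keep.rownames = "variable")  # assumed: custom name in recent versions

names(dt_default)
names(dt_rn)
names(dt_named)
```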
Normally, when working with data.table, you should not assign empty column names, because you refer to columns by name. It is better practice to create a new column called id.
setnames(dt, "", "rn") # give the column its name back so we can work with it
dt[,id:=as.integer(substr(rn, start=3, stop=nchar(rn)))] # extract numbers from rownames
dt[,rn:=NULL] # delete column rn
setcolorder(dt, c("id","Overall")) # reorder columns
dt # show data.table
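If the blank column name is not needed at any point, the same result can be reached without renaming back and forth. The following is one possible sketch under that assumption; df and dt2 are hypothetical names with toy values, and sub() is used here in place of substr() to strip the "x." prefix.

```r
library(data.table)

# Hypothetical toy data with row names of the form "x.<number>"
df <- data.frame(Overall = c(34.2, 88.6, 50.6), row.names = paste0("x.", 1:3))

dt2 <- as.data.table(df, keep.rownames = TRUE)  # row names go into column "rn"
dt2[, id := as.integer(sub("^x\\.", "", rn))]   # strip the "x." prefix, keep the number
dt2[, rn := NULL]                               # drop the helper column
setcolorder(dt2, c("id", "Overall"))            # id first
setorder(dt2, -Overall)                         # sort by Overall, descending
dt2
```

Using a regular expression anchored on the prefix avoids hard-coding character positions, which would matter if the prefix length ever changed.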