Give factors numerical value [R]

Question

I want to predict a numerical variable. I have a couple of factors, and for each of those factors I have a numerical equivalent. Ideally I would assign that numerical equivalent to the factor and use it in the prediction. Is this possible? If not, I guess I will need to replace the factors with their numerical equivalents. What is the best way to do so?

An Example:

df = data.frame(f=c("a","b","a","c"),v=c(2,4,2,6))
lookup = data.frame(name=c("a","b","c"),v=c(1,2,3))

What I would like to get

df2 = data.frame(f=c(1,2,1,3),v=c(2,4,2,6))
cor(df2$f,df2$v) # will be 1

How do you mean, the factors have numerical equivalents? Factors are categories. When you say prediction, what do you mean? — TARehman, Jul 15 '14 at 16:11
R treats factors as categorical variables and numeric values as continuous variables. The two types of variables often have different statistical methods associated with them and the interpretation of a model differs by variable type. You really should decide what type of analysis is appropriate for your data first. — MrFlick, Jul 15 '14 at 16:14
I added an example to make it clearer. The letters are what I have; the numbers in the lookup table are average values I calculated earlier and would like to use now. — nik, Jul 15 '14 at 16:43

score 1 · Accepted Answer · answered Jul 15 '14 at 21:53

You can use merge to join the lookup table onto df:

# Join on df$f == lookup$name; the clashing "v" columns become v.x (from df) and v.y (from lookup)
df2 <- merge(df, lookup, by.x = "f", by.y = "name")
cor(df2$v.x, df2$v.y)

Or, if your data sets are big, use data.table:

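library(data.table)
# Convert both to data.tables by reference and key them on the join columns
setkey(setDT(df), f)
setkey(setDT(lookup), name)
# Keyed join: for each row of lookup, pull the matching rows of df
df2 <- df[lookup]
# Columns 2 and 3 are the two value columns; [[ ]] extracts them as plain vectors
cor(df2[[2]], df2[[3]])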

David Arenburg


score 0 · Answer 2 · answered Jul 15 '14 at 21:23

Does this help?

cor(lookup$v[match(df$f,lookup$name)],df$v)
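The one-liner above can be unpacked into steps. This is only a sketch on the question's example data; the column name f_num is made up for illustration. match() gives, for each value of df$f, its row position in lookup$name, and indexing lookup$v by those positions yields the numeric equivalent.

```r
df <- data.frame(f = c("a", "b", "a", "c"), v = c(2, 4, 2, 6))
lookup <- data.frame(name = c("a", "b", "c"), v = c(1, 2, 3))

# Position of each df$f value within lookup$name (factors are compared as character)
idx <- match(df$f, lookup$name)

# Store the looked-up value as a new numeric column (f_num is an illustrative name)
df$f_num <- lookup$v[idx]

# Values of f that are missing from the lookup table would come back as NA
stopifnot(!anyNA(df$f_num))

cor(df$f_num, df$v)
```

Keeping the numeric equivalent as its own column leaves the original factor available, so either one can be used as a predictor later.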

Jörg Mäder


Thanks, that works as well, but only if a single column is needed for identification. I need more (even though that was not included in my example). – nik Jul 16 '14 at 08:45
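For the case where more than one column identifies the lookup value, one possible approach is merge() with vectors of key columns. The data below is hypothetical: the second key column (g in df, group in lookup) and the value column num are invented for illustration and are not part of the original example.

```r
# Hypothetical data: the numeric equivalent depends on two columns
df <- data.frame(f = c("a", "b", "a", "c"),
                 g = c("x", "x", "y", "y"),
                 v = c(2, 4, 2, 6))
lookup <- data.frame(name  = c("a", "a", "b", "c"),
                     group = c("x", "y", "x", "y"),
                     num   = c(1, 1.5, 2, 3))

# by.x / by.y take vectors of key columns, paired in order (f with name, g with group).
# all.x = TRUE keeps rows of df that have no match; their num would be NA.
df2 <- merge(df, lookup,
             by.x = c("f", "g"), by.y = c("name", "group"),
             all.x = TRUE)

cor(df2$num, df2$v)
```

The result keeps the key column names from df (f and g). Note that merge() sorts the output by the key columns, so the row order of df2 can differ from df.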
