11

Formulas are a very useful feature of R's statistical and graphical functions. Like everyone, I am a user of these functions. However, I have never written a function that takes a formula object as an argument. Could someone help me, either by linking to a readable introduction to this side of R programming or by giving a self-contained example?

Karolis Koncevičius
gappy

1 Answer

7

You can use model.matrix() and model.frame() to evaluate the formula:

lm1 <- lm(log(Volume) ~ log(Girth) + log(Height), data=trees)
print(lm1)

form <- log(Volume) ~ log(Girth) + log(Height)

# use model.matrix
mm <- model.matrix(form, trees)
lm2 <- lm.fit(as.matrix(mm), log(trees[,"Volume"]))
print(coefficients(lm2))

# use model.frame, need to add intercept by hand
mf <- model.frame(form, trees)
lm3 <- lm.fit(as.matrix(data.frame("Intercept"=1, mf[,-1])), mf[,1])
print(coefficients(lm3))

which yields

Call:
lm(formula = log(Volume) ~ log(Girth) + log(Height), data = trees)

Coefficients:
(Intercept)   log(Girth)  log(Height)
      -6.63         1.98         1.12

(Intercept)  log(Girth) log(Height)
     -6.632       1.983       1.117  
Intercept  log.Girth. log.Height.
     -6.632       1.983       1.117
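
To wrap this in a function of your own that takes a formula argument, a minimal sketch might look like the following. The name fit_ols is hypothetical and used only for illustration; it combines model.frame(), model.response() and model.matrix() so that the intercept does not have to be added by hand.

```r
# Hypothetical helper (not part of base R): least-squares fit from a formula.
fit_ols <- function(formula, data) {
  # Evaluate the variables named in the formula inside `data`;
  # transformations such as log() are applied at this step.
  mf <- model.frame(formula, data = data)

  # Left-hand side of the formula, already evaluated.
  y <- model.response(mf)

  # Design matrix built from the terms stored on the model frame;
  # the intercept column is included automatically.
  X <- model.matrix(attr(mf, "terms"), data = mf)

  # Low-level fitter, as in the lm.fit() calls above.
  fit <- lm.fit(X, y)
  fit$coefficients
}

print(fit_ols(log(Volume) ~ log(Girth) + log(Height), trees))
```

On the trees data this should give the same coefficients as the lm() call, with the column names produced by model.matrix().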
Dirk Eddelbuettel
    Thanks, very interesting. I also understand why glmnet or other packages may not offer this capability: glmnet uses sparse matrices from the Matrix package, which model.matrix() may not handle. – gappy Aug 19 '09 at 18:24
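
On the sparse-matrix point, the Matrix package provides sparse.model.matrix(), which builds a sparse design matrix from a formula. Here is a minimal sketch, assuming Matrix is installed (it is normally shipped with R as a recommended package). Whether a particular modelling package accepts the result is not something this example checks.

```r
library(Matrix)

form <- log(Volume) ~ log(Girth) + log(Height)

# Sparse analogue of model.matrix(); returns a sparse matrix class from Matrix.
X_sparse <- sparse.model.matrix(form, data = trees)

# Dense design matrix for comparison.
X_dense <- model.matrix(form, data = trees)

# Compare dimensions and the largest elementwise difference.
print(dim(X_sparse))
print(max(abs(as.matrix(X_sparse) - X_dense)))
```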