I have a question about using quadratic (second-order) predictors with GLMs in R. I have three predictor variables (x, y, z) and a response variable (let's call it ozone). x, y, and z are not quadratic predictors yet, so I square them: x2 <- x^2 (same for y and z).

Now I understand that if I wanted to model ozone based on these predictor variables, I would use the poly() or polym() function.

However, when it comes to using interaction terms between these three variables, that's where I get lost. For example, if I wanted to model the interaction between the quadratic predictors of x and y, I believe I would type something like this:

ozone ~ x + x2 + y + y2 + x*y + x2*y + x*y2 + x2*y2 (I hope this is right)

My first question: is there an easier way of entering this? With three terms that's a lot of typing. My second question: why does the coefficient flip sign when I use a quadratic predictor? When I fit only the predictor x, the coefficient is positive, but when I use the quadratic predictor, the coefficient almost always ends up being negative.
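One possible reading of the sign change can be sketched with simulated data. Everything below is an assumption made for illustration (the data, the true coefficients, and the object names `fit_lin`, `fit_raw`, `fit_orth`); it is not the asker's data or model. In a model with both `x` and `x^2`, the coefficient on `x` is the slope at `x = 0` and the squared term describes curvature, so neither needs to match the sign or size of the slope from a straight-line fit.

```r
# Hypothetical data: a concave (hump-shaped) relationship that still
# trends upward overall across the observed range of x.
set.seed(1)
x <- runif(200, 0, 10)
ozone <- 2 + 3 * x - 0.2 * x^2 + rnorm(200)

# glm() defaults to the gaussian family, i.e. an ordinary linear model.
fit_lin  <- glm(ozone ~ x)            # straight line only
fit_raw  <- glm(ozone ~ x + I(x^2))   # raw quadratic terms
fit_orth <- glm(ozone ~ poly(x, 2))   # orthogonal quadratic terms

coef(fit_lin)   # slope of the best straight line
coef(fit_raw)   # slope at x = 0, plus curvature
coef(fit_orth)  # same curve, expressed on an orthogonal basis

# The two quadratic fits are the same model in different parameterisations.
all.equal(fitted(fit_raw), fitted(fit_orth))
```

Comparing the printed coefficients shows that the raw and orthogonal fits describe the same curve with different numbers, so individual coefficients in a polynomial model are best read together rather than one at a time.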

Leo Ohyama
  • `(poly(x, 2) + poly(y, 2))^2` or `(poly(x, 2)*poly(y, 2))` – user20650 Nov 12 '16 at 18:18
  • It might be worthwhile to think a bit on exactly what you mean by "interaction" here. – Hong Ooi Nov 12 '16 at 18:19
  • Am I thinking of interactions wrong? – Leo Ohyama Nov 12 '16 at 18:20
  • You have continuous predictor variables. Typically when talking about interactions, at least one predictor is assumed to be categorical, so we can talk about different trends or slopes per category. – Hong Ooi Nov 12 '16 at 18:32
  • I don't think that's correct @HongOoi. You can have an interaction between two continuous variables. In this case, it would mean that the relationship between `x` and `Ozone` varies for different values of `y` and the relationship between `y` and `Ozone` varies for different values of `x`. – eipi10 Nov 12 '16 at 18:48
  • @eipi10 yes, that's technically correct. But typically when we want to fit that kind of relationship, we call it a _surface_ rather than an _interaction_ (eg response surfaces, smooth surfaces). OP's use of the word "interaction", as well as using polynomials and multiplying them together, suggests a possible misunderstanding. – Hong Ooi Nov 12 '16 at 19:15
  • It depends who "we" is. I, like @eipi10, would refer to the interactions between continuous predictors without changing terminologies. I'd suggest that the OP might be better off with `poly(x,y,z,degree=n)` ... where `n` might be up to 6 (!). But what model the OP *should* fit is heading into CrossValidated territory ... – Ben Bolker Nov 12 '16 at 19:41
  • @BenBolker ...... as in the author behind the ecological application with R?! – Leo Ohyama Nov 12 '16 at 20:20
  • Yes, *that* @BenBolker. – eipi10 Nov 12 '16 at 20:53
  • I would suggest that @user20650 should post their solution as an answer, and a follow-up question about whether this is a good *statistical* solution could be asked by OP on [CrossValidated](http://stats.stackexchange.com) (with a link to this question) ... (hopefully with a little more context about the actual scientific problem) – Ben Bolker Nov 12 '16 at 21:53
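
The formula shorthands suggested in the comments above can be compared in a minimal sketch. The data frame `dat`, its simulated values, and the `fit_*` names are hypothetical and exist only to make the example runnable; only `x` and `y` are used, to keep it short.

```r
# Hypothetical data standing in for the asker's variables.
set.seed(42)
n <- 200
dat <- data.frame(x = runif(n), y = runif(n))
dat$ozone <- 1 + 2 * dat$x - dat$x^2 + dat$y +
  0.5 * dat$x * dat$y + rnorm(n, sd = 0.1)

# Two equivalent shorthands from the comments: main effects of each
# quadratic polynomial plus all of their pairwise products.
fit_hat  <- glm(ozone ~ (poly(x, 2) + poly(y, 2))^2, data = dat)
fit_star <- glm(ozone ~ poly(x, 2) * poly(y, 2), data = dat)

# The same model written with raw terms; I() protects ^ inside a formula.
fit_hand <- glm(ozone ~ (x + I(x^2)) * (y + I(y^2)), data = dat)

# A multivariate polynomial keeps only terms of total degree <= 2
# (x, x^2, y, y^2, x*y), so it is a smaller model than the ones above.
fit_total <- glm(ozone ~ poly(x, y, degree = 2), data = dat)

length(coef(fit_star))   # intercept + 2 + 2 + 4 product terms
length(coef(fit_total))  # intercept + 5 terms
all.equal(fitted(fit_hat), fitted(fit_star))
all.equal(fitted(fit_star), fitted(fit_hand), tolerance = 1e-6)
```

The first three fits give the same fitted values because they span the same set of terms; they differ only in how the coefficients are parameterised. Whether the full product model or the smaller total-degree model is the better statistical choice is a separate question from the formula syntax.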

0 Answers