So I have an original plot, which looks like this: [image: original plot]

I have successfully found a transformed regression model for this data set, and the last thing I want to do is draw the new model's line on the original data, only I am not quite sure how to do that. The transformation I used was taking the square root of y and squaring x. I know I am somehow supposed to invert the transformations to apply them, but I am not quite sure how to do that in ggplot code, as in the past I have typically just used geom_smooth to draw regression lines automatically.

  • Does this answer your question? [Fitting a quadratic curve in ggplot](https://stackoverflow.com/questions/42764028/fitting-a-quadratic-curve-in-ggplot) – CzechInk Sep 29 '20 at 02:03

1 Answer

You can specify the method & formula in geom_smooth():

library(ggplot2)

# simulated example data: a quadratic trend plus noise
set.seed(42)  # make the random noise reproducible
df <- data.frame(x = 1:10,
                 y = (1:10)^2 + rnorm(10))

ggplot(data = df, aes(y = y, x = x)) +
  geom_point() +
  geom_smooth(method = "lm", 
              formula = y ~ x + I(x^2),
              se = FALSE)

For more complex custom functions it can be easier to generate predictions and use geom_line() to connect the dots:

# fit the model
mod <- lm(y ~ x + I(x^2), data = df)

# generate predictions
preds <- data.frame(x = seq(min(df$x), max(df$x), length.out = 50))
preds$y_hat <- predict(mod, newdata = preds)

# plot it
ggplot() +
  geom_point(data = df, aes(y = y, x = x)) +
  geom_line(data = preds, aes(y = y_hat, x = x), color = "red")
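
The transformation described in the question (square root of y, regressed on x squared) fits the same predict-then-plot pattern: fit on the transformed scale, predict on a grid of x, then square the predictions to get back to the original y scale. What follows is only a sketch under assumptions: the data are simulated here, and the names `df_t`, `mod_t`, `preds_t` and `sqrt_y_hat` are made up for illustration and are not from the question.

```r
library(ggplot2)

# hypothetical data in which sqrt(y) is roughly linear in x^2
set.seed(1)
df_t <- data.frame(x = 1:10)
df_t$y <- (2 + 0.3 * df_t$x^2 + rnorm(10, sd = 0.5))^2

# fit on the transformed scale: sqrt(y) as a linear function of x^2
mod_t <- lm(sqrt(y) ~ I(x^2), data = df_t)

# predict on a fine grid of x; these predictions are on the sqrt(y) scale
preds_t <- data.frame(x = seq(min(df_t$x), max(df_t$x), length.out = 50))
preds_t$sqrt_y_hat <- predict(mod_t, newdata = preds_t)

# invert the response transformation: square to return to the original y scale
preds_t$y_hat <- preds_t$sqrt_y_hat^2

# original points plus the back-transformed fitted curve
p <- ggplot() +
  geom_point(data = df_t, aes(x = x, y = y)) +
  geom_line(data = preds_t, aes(x = x, y = y_hat), color = "red")
p
```

Only the response needs inverting by hand. The x transformation is handled inside the formula by `I(x^2)`, so `predict()` takes x on its original scale. Note that squaring a prediction of sqrt(y) does not give exactly the conditional mean of y, so treat the curve as a visual summary of the transformed fit.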

The model you suggest in the comments contains the response (Distance, or y in this example) on both the left-hand and right-hand sides of the equation, which complicates this approach. Because y then also acts as a predictor, the prediction at any given value of x changes with the value of y you supply, so there is no single curve to draw. One option you could consider is to encode the predictions (y_hat below) as a fill color over a grid of x and y values with geom_tile(). You could do this as:

# your model you mentioned in the comments
mod <- lm(y ~ x + I(x^2) + I(sqrt(y)), data = df)

# get predictions over a grid of x and y values
preds <- expand.grid(x = seq(min(df$x), max(df$x), length.out = 50),
                     y = seq(min(df$y), max(df$y), length.out = 50))
preds$y_hat <- predict(mod, newdata = preds)

# plot it
ggplot() +
  geom_point(data = df, aes(y = y, x = x)) +
  geom_tile(data = preds, aes(y = y, x = x, fill = y_hat), alpha = .7) +
  theme_bw()

CzechInk