
I am having trouble producing a figure in R using ggplot2. No stats are needed; I just need a visual representation of my data. I have 7 participants, and I want to plot a line for each participant through a scatterplot. The slope and shape of the line differ between participants, but on average the relationship is roughly exponential.

I have used the code below, but I am only getting linear fits. When I change the method to loess, the lines are too wiggly. Can someone please help me make this more presentable? Essentially I'm after a line of best fit for each participant, but I still need to be able to use the argument fullrange = FALSE.

Furthermore, should I be using stat_smooth or geom_smooth? Is there a difference?

ggplot(data, aes(x=x, y=y, group = athlete)) +
  geom_point() + 
  stat_smooth(method = "lm", se=FALSE, fullrange = FALSE)

ggplot for data

Thanks in advance for any help!

Rui Barradas
  • Welcome to SO! Please include a [reproducible example](https://stackoverflow.com/q/5963269/6478701) with your question so people can help you more easily. – RoB Nov 15 '19 at 09:06
  • Can you post sample data? Please edit **the question** with the output of `dput(data)`. Or, if it is too big with the output of `dput(head(data, 30))`. – Rui Barradas Nov 15 '19 at 09:10
  • See if [this SO question](https://stackoverflow.com/questions/37329074/geom-smooth-and-exponential-fits) helps. Note the use of the `formula` argument. – Rui Barradas Nov 15 '19 at 09:15

2 Answers


I don't have your data, so I'll just do this with the mpg dataset.

As you've noted, you can use geom_smooth() and specify a method such as "loess". Note that you can pass arguments through to the fitting function, just as you would when calling it directly.

With loess, the smoothing parameter is span. You can play around with this until you're happy with the results.

library(ggplot2)

data(mpg)
g <- ggplot(mpg, aes(x = displ, y = hwy, color = class)) + geom_point()

# A larger span averages over more neighbouring points, giving a smoother curve
g + geom_smooth(se = FALSE, method = "loess", span = 0.8) + ggtitle("span 0.8")
g + geom_smooth(se = FALSE, method = "loess", span = 1) + ggtitle("span 1")
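
If the curves are still too wiggly at a large span, another knob to try is the degree of the local polynomial. This is a minimal sketch, assuming `degree = 1` is forwarded to `loess()` through geom_smooth's `method.args`; the value 1 and the grouping by `drv` are illustrative choices only, not something taken from the question:

```r
library(ggplot2)

# mpg ships with ggplot2; one smoother per drive type (3 groups)
p <- ggplot(mpg, aes(x = displ, y = hwy, colour = drv)) +
  geom_point() +
  geom_smooth(
    method = "loess",
    formula = y ~ x,
    span = 1,                        # wide window -> smoother curve
    method.args = list(degree = 1),  # locally linear instead of locally quadratic
    se = FALSE,
    fullrange = FALSE                # each curve stops at its own group's x range
  )
p
```

On the geom_smooth versus stat_smooth question: both accept the same arguments here, so either should work for this plot.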


RoB

There is, to my knowledge, no built-in method for achieving this, but you can do it with some manual plotting. First, since you expect an exponential relationship, it might make sense to run a linear regression using log(y) as the response (I'll be using u and v, in order not to confuse them with the x and y aesthetics in the graph):

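library(tidyverse)  # provides tibble(), mutate(), %>% and ggplot2

# Simulate three groups with exponential rates -0.5, -1 and -2 plus noise
tb1 = tibble(
  u = rep(runif(100, 0, 5), 3),
  a = c(rep(-.5, 100), rep(-1, 100), rep(-2, 100)),
  v = exp(a*u + rnorm(3*100, 0, .1))
) %>% mutate(a = as.factor(a))

# Linear model on the log scale: common intercept, one slope per group
lm1 = lm(log(v) ~ a:u, tb1)
summary(lm1)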

gives you:

Call:
lm(formula = log(v) ~ a:u, data = tb1)

Residuals:
      Min        1Q    Median        3Q       Max 
-0.263057 -0.069510 -0.001262  0.062407  0.301033 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.013696   0.012234   -1.12    0.264    
a-2:u       -1.996670   0.004979 -401.04   <2e-16 ***
a-1:u       -1.001412   0.004979 -201.14   <2e-16 ***
a-0.5:u     -0.495636   0.004979  -99.55   <2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 0.1002 on 296 degrees of freedom
Multiple R-squared:  0.9984,    Adjusted R-squared:  0.9983 
F-statistic: 6.025e+04 on 3 and 296 DF,  p-value: < 2.2e-16

Under "Coefficients" you can find the intercept and the "slopes" for the curves (actually the exponential factors). You can see that they closely match the factors we used for generating the data.
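
If each curve should also get its own intercept, one possible variation is to add the grouping factor as a main effect and read the per-group values off `coef()`. This is a sketch under assumptions: the data frame `d`, the seed, and the column names `log_intercept`, `rate` and `scale` are made up for illustration, and the model differs from `lm1` above, which uses a single common intercept:

```r
set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Simulated data in the same spirit as tb1 (hypothetical stand-in)
d <- data.frame(
  u = rep(runif(100, 0, 5), 3),
  a = factor(rep(c(-0.5, -1, -2), each = 100))
)
true_rate <- as.numeric(as.character(d$a))
d$v <- exp(true_rate * d$u + rnorm(300, 0, 0.1))

# One intercept and one rate per group: log(v) = b0[a] + b1[a] * u
fit <- lm(log(v) ~ 0 + a + a:u, data = d)
cf <- coef(fit)

# Collect the coefficients into one row per group
per_group <- data.frame(
  a = levels(d$a),
  log_intercept = unname(cf[paste0("a", levels(d$a))]),
  rate = unname(cf[paste0("a", levels(d$a), ":u")])
)
per_group$scale <- exp(per_group$log_intercept)  # v is roughly scale * exp(rate * u)
per_group
```

The `rate` column is the slope on the log scale, and `scale` is the intercept back-transformed to the original scale of v.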

To plot the fitting curves, you can use the "predicted" values, produced from your linear model using predict:

ggplot(tb1, aes(u, v, colour=a)) +
  geom_point() +
  geom_line(data=tb1 %>% mutate(v = exp(predict(lm1))))

If you want to have the standard error ribbons, it's a little more work, but still possible:

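# Fitted values and their standard errors on the log scale
p1 = predict(lm1, se.fit=TRUE)
tb2 = tibble(
  u = tb1$u,
  a = tb1$a,
  v = exp(p1$fit),
  # approximate 95% band, back-transformed to the original scale
  vmin = exp(p1$fit - 1.96*p1$se.fit),
  vmax = exp(p1$fit + 1.96*p1$se.fit)
)
ggplot(tb2, aes(u, v, colour=a)) +
  geom_ribbon(aes(fill=a, ymin=vmin, ymax=vmax), colour=NA, alpha=.25) +
  geom_line(linewidth=.5) +  # use size=.5 on ggplot2 older than 3.4.0
  geom_point(data=tb1)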

produces:

smoothed exp regression
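
A possible alternative to predicting by hand, along the lines of the `formula` hint in the comments on the question, is to let geom_smooth fit a GLM with a log link for each group. This is only a sketch: the data frame `d`, the athlete labels and the rates are invented stand-ins for the question's data, and a Gaussian GLM with a log link makes different error assumptions than a linear model on log(y), so the curves will not match the ones above exactly:

```r
library(ggplot2)

set.seed(1)  # arbitrary seed, only so the sketch is repeatable

# Hypothetical stand-in for the question's data: 3 "athletes", exponential decay
d <- data.frame(
  x = rep(runif(100, 0, 5), 3),
  athlete = factor(rep(c("A", "B", "C"), each = 100))
)
rate <- c(A = -0.5, B = -1, C = -2)[as.character(d$athlete)]
d$y <- exp(rate * d$x + rnorm(300, 0, 0.1))

p <- ggplot(d, aes(x = x, y = y, colour = athlete)) +
  geom_point() +
  geom_smooth(
    method = "glm",
    formula = y ~ x,
    method.args = list(family = gaussian(link = "log")),  # E[y] = exp(b0 + b1 * x)
    se = FALSE,
    fullrange = FALSE  # each curve is drawn only over its own group's x range
  )
p
```

This fit needs strictly positive y values to find starting values, so it may fail on data containing zeros or negatives.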

Igor F.
  • Thanks Igor, this is exactly what I was after. A final question: any idea how to get the intercept and slope for each of these curves? – Alannah McKay Nov 18 '19 at 07:04
  • @Alannah: I updated the answer and corrected a misconception in it. Hope it helps. If it does, please consider accepting the answer. – Igor F. Nov 18 '19 at 09:16