9

I'm using R 3.3.1 (64-bit) on Windows 10. I have an x-y dataset that I've fit with a 2nd order polynomial. I'd like to solve that best-fit polynomial for x at y=4, and plot drop-down lines from y=4 to the x-axis.

This will generate the data in a dataframe v1:

v1 <- structure(list(x = c(-5.2549, -3.4893, -3.5909, -2.5546, -3.7247, 
-5.1733, -3.3451, -2.8993, -2.6835, -3.9495, -4.9649, -2.8438, 
-4.6926, -3.4768, -3.1221, -4.8175, -4.5641, -3.549, -3.08, -2.4153, 
-2.9882, -3.4045, -4.6394, -3.3404, -2.6728, -3.3517, -2.6098, 
-3.7733, -4.051, -2.9385, -4.5024, -4.59, -4.5617, -4.0658, -2.4986, 
-3.7559, -4.245, -4.8045, -4.6615, -4.0696, -4.6638, -4.6505, 
-3.7978, -4.5649, -5.7669, -4.519, -3.8561, -3.779, -3.0549, 
-3.1241, -2.1423, -3.2759, -4.224, -4.028, -3.3412, -2.8832, 
-3.3866, -0.1852, -3.3763, -4.317, -5.3607, -3.3398, -1.9087, 
-4.431, -3.7535, -3.2545, -0.806, -3.1419, -3.7269, -3.4853, 
-4.3129, -2.8891, -3.0572, -5.3309, -2.5837, -4.1128, -4.6631, 
-3.4695, -4.1045, -7.064, -5.1681, -6.4866, -2.7522, -4.6305, 
-4.2957, -3.7552, -4.9482, -5.6452, -6.0302, -5.3244, -3.9819, 
-3.8123, -5.3085, -5.6096, -6.4557), y = c(0.99, 0.56, 0.43, 
2.31, 0.31, 0.59, 0.62, 1.65, 2.12, 0.1, 0.24, 1.68, 0.09, 0.59, 
1.23, 0.4, 0.36, 0.49, 1.41, 3.29, 1.22, 0.56, 0.1, 0.67, 2.38, 
0.43, 1.56, 0.07, 0.08, 1.53, -0.01, 0.12, 0.1, 0.04, 3.42, 0.23, 
0, 0.34, 0.15, 0.03, 0.19, 0.17, 0.2, 0.09, 2.3, 0.07, 0.15, 
0.18, 1.07, 1.21, 3.4, 0.8, -0.04, 0.02, 0.74, 1.59, 0.71, 10.64, 
0.64, -0.01, 1.06, 0.81, 4.58, 0.01, 0.14, 0.59, 7.35, 0.63, 
0.17, 0.38, -0.08, 1.1, 0.89, 0.94, 1.52, 0.01, 0.1, 0.38, 0.02, 
7.76, 0.72, 4.1, 1.36, 0.13, -0.02, 0.13, 0.42, 1.49, 2.64, 1.01, 
0.08, 0.22, 1.01, 1.53, 4.39)), .Names = c("x", "y"), class = "data.frame", row.names = c(NA, 
-95L))

Here's the code to plot y vs x, plot the best fit polynomial, and draw a line at y=4.

> attach(v1)
> # simple x-y plot of the data
> plot(x,y, pch=16)
> # 2nd order polynomial fit
> fit2 <- lm(y~poly(x,2,raw=TRUE))
> summary(fit2)
> # generate range of numbers for plotting polynomial
> xx <- seq(-8,0, length=50)
> # overlay best fit polynomial
>lines(xx, predict(fit2, data.frame(x=xx)), col="blue")
> # add horizontal line at y=4
> abline(h=4, col="red")
>

It's obvious from the plot that y=4 at x of around -2 and -6.5, but I'd like to actually solve the regression polynomial for those values.

Ideally, I'd like lines that drop down from the red-blue line intersections to the x-axis (i.e plot vertical ablines that terminate at the two y=4 solutions). If that's not possible, I'd be happy with good old vertical ablines that go all the way up the plot, so long as they at the proper x solution values.

This graph represents parts that will be out-of-spec when y>4, so I want to use the drop-down lines to highlight the range of x values that will produce in-spec parts.

zx8754
  • 42,109
  • 10
  • 93
  • 154
jeffgoblue
  • 227
  • 1
  • 2
  • 9

5 Answers5

10

You can use the quadratic formula to calculate the values:

betas <- coef(fit2)    # get coefficients
betas[1] <- betas[1] - 4    # adjust intercept to look for values where y = 4

# note degree increases, so betas[1] is c, etc.
betas
##             (Intercept) poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)2 
##               8.7555833               6.0807302               0.7319848 

solns <- c((-betas[2] + sqrt(betas[2]^2 - 4 * betas[3] * betas[1])) / (2 * betas[3]), 
           (-betas[2] - sqrt(betas[2]^2 - 4 * betas[3] * betas[1])) / (2 * betas[3]))

solns
## poly(x, 2, raw = TRUE)1 poly(x, 2, raw = TRUE)1 
##               -1.853398               -6.453783 

segments(solns, -1, solns, 4, col = 'green')    # add segments to graph

plot with segments

Much simpler (if you can find it) is polyroot:

polyroot(betas)
## [1] -1.853398+0i -6.453783+0i

Since it returns a complex vector, you'll need to wrap it in as.numeric if you want to pass it to segments.

alistaire
  • 38,696
  • 4
  • 60
  • 94
8

I absolutely understand that there is an analytical solution for this simple quadratic polynomial. The reason I show you numerical solution is that you ask this question in regression setting. Numerical solution may always be your solution in general, when you have more complicated regression curve.

In the following I will use uniroot function. If you are not familiar with it, read this short answer first: Uniroot solution in R.


enter image description here

This is the plot produced with your code. You are almost there. This is a root finding problem, and you may numerically use uniroot. Let's define a function:

f <- function (x) {
  ## subtract 4
  predict(fit2, newdata = data.frame(x = x)) - 4
  }

From the figure, it is clear that there are two roots, one inside [-7, -6], the other inside [-3, -1]. We use uniroot to find both:

x1 <- uniroot(f, c(-7, -6))$root
#[1] -6.453769

x2 <- uniroot(f, c(-3, -1))$root
#[1] -1.853406

Now you can drop a vertical line from these points down to x-axis:

y1 <- f(x1) + 4  ## add 4 back
y2 <- f(x2) + 4  

abline(h = 0, col = 4)  ## x-axis
segments(x1, 0, x1, y1, lty = 2)
segments(x2, 0, x2, y2, lty = 2)

enter image description here

Community
  • 1
  • 1
李哲源
  • 59,090
  • 15
  • 146
  • 206
6

You have a quadratic equation

0.73198 * x^2 + 6.08073 * x + 12.75558 = 4
OR
0.73198 * x^2 + 6.08073 * x + 8.75558 = 0

You can just use the quadratic formula to solve this analytically. R gives the two roots:

(-6.08073 + sqrt(6.08073^2 -4*0.73198 * 8.75558)) / (2 * 0.73198)
[1] -1.853392
(-6.08073 - sqrt(6.08073^2 -4*0.73198 * 8.75558)) / (2 * 0.73198)
[1] -6.453843

abline(v=c(-1.853392, -6.453843))

Image with drop down lines

G5W
  • 32,266
  • 10
  • 31
  • 60
4

Here is one more solution, based on this

attach(v1)
fit2 = lm(y~poly(x,2,raw=TRUE))
xx = seq(-8,0, length=50)

vector1 = predict(fit2, data.frame(x=xx)) 
vector2= replicate(length(vector1),4)

# Find points where vector1 is above vector2.
above = vector1 > vector2

# Points always intersect when above=TRUE, then FALSE or reverse
intersect.points = which(diff(above)!=0)    

# Find the slopes for each line segment.
vector1.slopes = vector1[intersect.points+1] - vector1[intersect.points]
vector2.slopes = vector2[intersect.points+1] - vector2[intersect.points]

# Find the intersection for each segment.
x.points = intersect.points + ((vector2[intersect.points] - vector1[intersect.points]) / (vector1.slopes-vector2.slopes))
y.points = vector1[intersect.points] + (vector1.slopes*(x.points-intersect.points))

#Scale x.points to the axis value of xx
x.points = xx[1] + ((x.points - 1)/(49))*(xx[50]-xx[1])

plot(xx, y = vector1, type= "l", col = "blue")
points(x,y,pch = 20)
lines(x = c(x.points[1],x.points[1]), y = c(0,y.points[1]), col='red')
lines(x = c(x.points[2],x.points[2]), y = c(0,y.points[2]), col='red')

enter image description here

Community
  • 1
  • 1
d.b
  • 29,772
  • 5
  • 24
  • 63
4

Many solutions are already proposed, here is another one.

As obvious, we are interested to find the x values that satisfy the polynomial (quadratic) equation a_0 + a_1.x + a_2.x^2 = 4, where a_0, a_1, a_2 are the coefficients of the fitted polynomial. We can rewrite the equation as a standard quadratic equation ax^2+bx+c=0 and find the roots using Sridhar's formula using the coefficients of the fitted polynomial with polynomial regression as follows:

enter image description here

a <- fit2$coefficients[3]
b <- fit2$coefficients[2]
c <- fit2$coefficients[1] - 4

as.numeric((-b + sqrt(b^2-4*a*c)) / (2*a))
#[1] -1.853398
as.numeric((-b-+ sqrt(b^2-4*a*c)) / (2*a))
#[1] -6.453783

We can use some numerical methods such as Newton-Raphson to find the roots as well (although there are faster numerical methods but this will solve our purpose and it's quite fast too, takes ~160 ms on my machine), as we can see from the following code, the numerical and the theoretical solutions agree.

a <- fit2$coefficients  # fitted quadratic polynomial coefficients

f <- function(x) {
  as.numeric(a[1] + a[2]*x + a[3]*x^2-4)
}

df <- function(x) {
  as.numeric(a[2] + 2*a[3]*x)
} 

Newton.Raphson <- function(x0) {
  eps <- 1e-6
  x <- x0
  while(TRUE) {
    x <- x0 - f(x0) / df(x0)
    if (abs(x - x0) < eps) {
      return(x0)
    }
    x0 <- x
  }
}

t1 <- Sys.time()
x1 <- Newton.Raphson(-10)
x2 <- Newton.Raphson(10)
x1
#[1] -6.453783
x2
#[1] -1.853398
s2
print(paste('time taken to compute the roots:' ,Sys.time() - t1))
#[1] "time taken to compute the roots: 0.0160109996795654"
points(x1, 4, pch=19, col='green')
points(x2, 4, pch=19, col='green')
abline(v=x1, col='green')
abline(v=x2, col='green')

enter image description here

Sandipan Dey
  • 17,896
  • 2
  • 31
  • 42