Find the two points on the x-axis corresponding to the intersections for a cdf

Question

The corresponding r code is given below.

theta <- seq(0,1, length = 10)
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

Then I graphed the data as follows :

mytheta <- CD_theta(5, theta, 10)
df <- data.frame(theta = theta, mytheta = mytheta)
ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta")

The corresponding graph is given below.

As you can see there are two horizontal lines (graphed in red) and the two vertical lines (graphed in black). I need to find the two points on the x-axis corresponding to the intersections for a H(theta).

I used the locator() function in r to compute the two x intercepts for a single iteration. I would like iterate the above for 1000 times, so it is really tedious approach find them separately.

are there any other r functions can be used find these two x intercept points?

Thank you in advance.

bVa · Accepted Answer · 2018-09-21T13:51:09.647

Here is a numerical approach using optimize function:

library(reprex)

theta <- seq(0,1, length = 10)
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

# Create a function to compute the difference between the "y" you want 
# and the "y" computed with CD_theta function 
# then try to minimize the output of this new function : 
# get the theta value corresponding to this "y"

my_fn <- function(theta_loc, y, x, n) { 
  # the function to optimize
  abs(y - CD_theta(x, theta_loc, n)) # dont forget abs (absolute)
}

# Then use optimize function to compute the theta value 
# between a given interval : c(0,1) in this case
# Note that you can directly modify here the values of y, x and n
v1 <- optimize(my_fn, c(0, 1), y = 0.025, x = 5, n = 10)$`minimum` 
v2 <- optimize(my_fn, c(0, 1), y = 0.975, x = 5, n = 10)$`minimum` 

# print the results
v1 # 0.025
#> [1] 0.2120079
v2 # 0.975
#> [1] 0.7879756

Created on 2018-09-21 by the reprex package (v0.2.0).

score 1 · Answer 2 · answered Sep 21 '18 at 12:46

Increasing the discretisation of your curve a bit, this gets fairly simple:

theta <- seq(0,1, length = 100) # increase length here for more precision on point locations
CD_theta <- function(x, p, n){
  1 - pbinom(x, size = n, prob = p) + 1 / 2 * dbinom(x, size = n, prob = p)
}

mytheta <- CD_theta(5, theta, 10)
df <- data.frame(theta = theta, mytheta = mytheta)
ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta")

points <- data.frame(x=c(theta[which.min(abs(mytheta - .975))], # find which point is the nearer
                         theta[which.min(abs(mytheta - .025))]),
                     y=c(.975,.025))

ggplot(df, aes(x = theta, y = mytheta)) +
  geom_line(size = 1, col = "steelblue") +
  ylab("H(Theta)") +
  xlab("Theta") + 
  geom_point(data=points,aes(y=y, x=x), size=5, col="red")

score 1 · Answer 3 · answered Sep 21 '18 at 13:19

If you want to find the exact Theta and HTheta values, independent of the grid size (here N = 10), apply uniroot to the CD_theta function.

CD_theta <- function(x, p, n) {
    1  -  pbinom (x, size = n, prob = p) + 
    1/2 * dbinom(x, size = n, prob = p)
}

u1 = uniroot(function(p) CD_theta(5, p, 10) - 0.025, c(0, 1))
u2 = uniroot(function(p) CD_theta(5, p, 10) - 0.975, c(0, 1))
(Theta1 = u1$root)  # 0.2119934
(Theta2 = u2$root)  # 0.7880066

But if the discretization (with N = 10) is important for you, then perform a linear interpolation on this function between grid points.

theta <- seq(0, 1, length = 10)
mytheta <- CD_theta(5, theta, 10)
f <- approxfun(theta, mytheta, method = "linear", 0.0, 1.0)

u1 = uniroot(function(p) f(p) - 0.025, c(0, 1))
u2 = uniroot(function(p) f(p) - 0.975, c(0, 1))
(Theta1 = u1$root)  # 0.2015638
(Theta2 = u2$root)  # 0.7984362

Find the two points on the x-axis corresponding to the intersections for a cdf

3 Answers3