Questions tagged [ecdf]

Empirical Cumulative Distribution Function in statistics

For a definition, see its Wikipedia page.

In R, the built-in function ecdf takes a vector of samples and returns its ECDF as a step function. It is also easy to write one ourselves, as shown in this example: How to derive an ecdf function?
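As a rough illustration of writing an ECDF by hand, here is a minimal sketch in R. The names my_ecdf, F_hand and F_base are hypothetical and chosen only for this example. It is not the code from the linked question, just one plausible way to do it, assuming numeric samples and the usual definition F(t) = (number of samples <= t) / n.

```r
# Minimal hand-rolled ECDF (illustrative sketch; my_ecdf is a hypothetical name).
# Returns a function F where F(t) is the proportion of samples <= t.
my_ecdf <- function(samples) {
  sorted <- sort(samples)   # sort() drops NA values by default
  n <- length(sorted)
  # findInterval() counts how many sorted values are <= each t
  function(t) findInterval(t, sorted) / n
}

x <- c(3, 1, 2, 2, 5)
F_hand <- my_ecdf(x)   # hand-written version
F_base <- ecdf(x)      # built-in version from the stats package

F_hand(c(0, 2, 4.9, 5))  # 0.0 0.6 0.8 1.0
F_base(c(0, 2, 4.9, 5))  # same values from the built-in
```

Both functions are vectorized over their argument, so they can be evaluated on a whole grid of points at once.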

131 questions
1
vote
1 answer

How to find the multivariate empirical cumulative distribution function (CDF) in R?

I have two correlated variables x and y, and I wonder how to find their empirical joint CDF in R? Also, how can we find probabilities like: P(X<=2 and Y<=3), P(X>=2 and Y>=3), P(X>=3 and Y<=2), P(X<=3 and Y>=2); P(X<=2 or Y<=3), P(X>=3 or Y>=2),…
Yang Yang
  • 971
  • 3
  • 17
  • 41
1
vote
0 answers

R memory puzzle on ECDF environments

I have a massive list of ECDF objects. Similar to:
    vals <- rnorm(10000)
    x <- ecdf(vals)
    ecdfList <- lapply(1:10000, function(i) ecdf(vals))
    save(ecdfList, file='mylist.rda')
    class(ecdfList[[1]])
    [1] "ecdf" "stepfun" "function"
Let's quit the…
dave gibbs
  • 41
  • 4
1
vote
0 answers

Even display of unevenly spaced numbers on x/y coordinates

Would you advise on how I could make an even display of unevenly spaced numbers on a graph? For example, considering the code below: BREAKS = c(0, 0.1, 1, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 200, 300, 400, 500) a <- seq(0,100,0.1) b <-…
Bogdan
  • 113
  • 8
1
vote
1 answer

R - calculate probability and flip x/y axis of cumulative curve (ECDF)

In R I plot a cumulative curve using the ecdf function to show area vs. elevation. By default the elevation is plotted on the x axis and the area on the y axis, where elevation is given in total values (e.g. 1000-3000 m) and the area in probability…
the_chimp
  • 165
  • 1
  • 15
1
vote
0 answers

Matlab `quantile` doesn't interpolate between sample values on ECDF?

According to Matlab's help, quantile interpolates linearly between points on the empirical cumulative distribution function (ECDF). Importantly, the points interpolated between are the mid-points of the risers at each step. I'm finding the actual…
user36800
  • 1,649
  • 1
  • 12
  • 22
1
vote
1 answer

R - Object not found error when using ddply

I'm applying ddply to the following data frame. The point is to apply the ecdf function to the yearly_test_count values of rows that have the same country. > head(test) country yearly_test_count download_speed 1 AU 1 2.736704 2 …
Tara Sutjarittham
  • 347
  • 1
  • 5
  • 17
1
vote
1 answer

Plot several graphs using ggplot() and facet_grid()

I am wondering how to plot several graphs on one screen using ggplot() and facet_grid(), because I need to repeat this process several times for different statistical variables. I have two data frames containing observations and another one…
FernRay
  • 77
  • 7
1
vote
1 answer

In ggplot, adding legend labels to manual color scale causes two legends to appear

Using the code below I can generate the graph I want, but when I try to change the legend labels in the scale_color_manual section, a second legend appears for only the linetype variable Original code: set.seed(124) DF <-…
traggatmot
  • 1,201
  • 3
  • 22
  • 42
1
vote
1 answer

Building an empirical cumulative distribution function and data interpolation in R

Here's an example data frame I'm working with:
    level   Income     cumpop
    1       17995.50   0.028405
    2       20994.75   0.065550
    3       29992.50   0.876185
    4       41989.50   2.364170
    5       53986.50   4.267305
    6       65983.50   6.323390
    7       …
1
vote
1 answer

How do I vectorize the ecdf function in R?

I have a data frame that looks like this: set.seed(42) data <- runif(1000) utility <- sample(c("abc","bcd","cde","def"),1000,replace=TRUE) stage <- sample(c("vwx","wxy","xyz"),1000,replace=TRUE) x <- data.frame(data,utility,stage) head(x) …
Jonathan
  • 417
  • 4
  • 15
1
vote
1 answer

Interpolation between ECDF curves

I have 6 curves that describe the ECDF of the number of tickets bought at a fixed value. Now I want to interpolate to create curves between them, following the next formula. For example, to estimate the ECDF at a price of 10k, it should be guided by…
1
vote
1 answer

Filling cross over under a Cumulative Frequency plot using ggplot in R

I am trying to plot two Cumulative Frequency curves in ggplot, and shade the cross over at a certain cut off. I haven't been using ggplot for long, so I was hoping someone might be able to help me with this one. The plot without filled regions,…
Kate2808
  • 119
  • 2
  • 7
1
vote
0 answers

Matlab survival curve log rank test

I am trying to do survival analysis in matlab and want to calculate log rank test scores among several curves. I found a possible code to do log rank here. But based on its description, it can only do log rank test between two groups. What if I have…
lolibility
  • 2,117
  • 4
  • 21
  • 39
1
vote
1 answer

Add a grid to an "ecdfplot" in R

I am using the latticeExtra library "ecdfplot" to plot my error. I want to add gridlines. The following does not seem to work: ecdfplot(err) grid(ny=10) It gives the following (gridless) result: I really would love to give a "graphical summary"…
EngrStudent
  • 1,633
  • 24
  • 39
1
vote
1 answer

Linear histogram matching of two rasters (Landsat slc-off images) in R

I have two rasters (Landsat slc-off images) in R. Both are missing some data, but the gap locations are completely offset. As an example, I create two rasters r1 and r2 below. r1 <- raster(system.file("external/test.grd", package="raster")) r1_mat…
shekeine
  • 1,395
  • 8
  • 21