Questions tagged [ecdf]

Empirical Cumulative Distribution Function in statistics

For the definition, please see its Wikipedia page.

In R, the built-in function ecdf takes a vector of samples and returns its ECDF. It is also easy to write one ourselves, as shown in this example: How to derive an ecdf function?
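To illustrate how little is needed to build an ECDF by hand, here is a minimal sketch in plain Python using only the standard library. The name make_ecdf is hypothetical and chosen for this illustration; it is not taken from the linked question or from any library. The sketch assumes the usual definition, where the ECDF at x is the fraction of samples less than or equal to x.

```python
from bisect import bisect_right


def make_ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x.

    make_ecdf is a hypothetical helper name used only for this sketch.
    """
    ordered = sorted(samples)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample")

    def ecdf(x):
        # bisect_right returns how many sorted samples are <= x
        return bisect_right(ordered, x) / n

    return ecdf


F = make_ecdf([20, 40, 60, 80, 100])
print(F(60))  # fraction of samples that are <= 60
```

Sorting once up front means each later evaluation is a binary search rather than a full pass over the data, which matters when the function is evaluated at many points.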

131 questions
10
votes
2 answers

Reliably retrieve the reverse of the quantile function

I have read other posts (such as here) on getting the "reverse" of quantile -- that is, to get the percentile that corresponds to a certain value in a series of values. However, the answers don't give me the same value as quantile for the same…
9
votes
3 answers

R: Plotting one ECDF on top of another in different colors

I have a couple of cumulative empirical density functions which I would like to plot on top of each other in order to illustrate differences in the two curves. As was pointed out in a previous question, the function to draw the ECDF is simply…
JD Long
  • 55,115
  • 51
  • 188
  • 278
7
votes
1 answer

In R ggplot2, include stat_ecdf() endpoints (0,0) and (1,1)

I'm trying to use stat_ecdf() to plot cumulative successes as a function of a rank score created by a predictive model. #libraries require(ggplot2) require(scales) # fake data for reproducibility set.seed(123) n <- 200 df <- data.frame(model_score=…
C8H10N4O2
  • 15,256
  • 6
  • 74
  • 113
7
votes
1 answer

Python Empirical distribution function (ecdf) implementation

I am aware of statsmodels.tools.tools.ECDF but since the calculation of an empirical cumulative distribution function (ECDF) is pretty straightforward and I want to minimise dependencies in my project, I want to code it manually. In a given list()…
Zhubarb
  • 8,409
  • 17
  • 65
  • 100
6
votes
3 answers

What is the fastest way to obtain frequencies of integers in a vector?

Is there a simple and fast way to obtain the frequency of each integer that occurs in a vector of integers in R? Here are my attempts so far: x <- floor(runif(1000000)*1000) print('*** using TABLE:') system.time(as.data.frame(table(x))) print('***…
Museful
  • 5,715
  • 2
  • 36
  • 53
5
votes
3 answers

How to plot reverse (complementary) ecdf using ggplot?

I currently use stat_ecdf to plot my cumulative frequency graph. Here is the code I used cumu_plot <- ggplot(house_total_year, aes(download_speed, colour = ISP)) + stat_ecdf(size=1) However I want the ecdf to be…
Tara Sutjarittham
  • 347
  • 1
  • 5
  • 17
5
votes
4 answers

How to plot multiple ECDF's on one plot in different colors in R

I am trying to plot 4 ecdf functions on one plot but can't seem to figure out the proper syntax. If I have 4 functions "A, B, C, D", what would be the proper syntax in R to get them plotted on the same chart with different colors? Thanks!
Jason
  • 93
  • 1
  • 1
  • 7
5
votes
2 answers

how to specify color of lines and points in ecdf ggplot2

I have a set of data that is tough to visualize, but I think an ECDF with a couple of points and lines added to it will do the trick. I am able to plot things the way that I want; my problem is coloring things correctly. I have the following code,…
RyanStochastic
  • 3,683
  • 4
  • 14
  • 22
4
votes
2 answers

R Highlight point on ecdf line graph

I'm creating a frequency plot using ggplot and the stat_ecdf function. I would like to add the Y-value to the graph for specific X-values, but just can't figure out how. geom_point or geom_text seem like likely options, but as stat_ecdf automatically…
Gerard
  • 139
  • 2
  • 11
4
votes
2 answers

How to smooth ecdf plots in r

I have a df with 5 variables, head(df,15) junc N1.ir N2.ir W1.ir W2.ir W3.ir 1 pos$chr1:3197398 0.000000 0.000000 0.000000 0.000000 0.000000 2 pos$chr1:3207049 0.000000 0.000000 0.000000 0.000000 0.000000 3 …
4
votes
1 answer

quantile vs ecdf results

I am trying to use ecdf, but I am not sure if I am doing it right. My ultimate purpose is to find what quantile corresponds to a specific value. As an example: sample_set <- c(20, 40, 60, 80, 100) # Now I want to get the 0.75 quantile: quantile(x =…
Max_IT
  • 492
  • 4
  • 14
4
votes
2 answers

How to draw multiple CDF plots of vectors with different number of rows

I want to draw the CDF plot of multiple variables in the same graph. The lengths of the variables are different. To simplify the detail, I use the following example code: library("ggplot2") a1 <- rnorm(1000, 0, 3) a2 <- rnorm(1000, 1, 4) a3 <-…
Excalibur
  • 411
  • 6
  • 19
3
votes
0 answers

ECDF plot in ggplot2 without expanding count variable

I have a dataframe which looks like Height Count 173 2 184 3 193 1 Usually, to plot an empirical cumulative distribution function, one: 1) expands the dataframe by using e.g. splitstackshape's expandRows function to obtain the…
Jackk
  • 155
  • 5
3
votes
1 answer

get the derivative of an ECDF

Is it possible to differentiate an ECDF? Take the one obtained in the following example. set.seed(1) a <- sort(rnorm(100)) b <- ecdf(a) plot(b) I would like to take the derivative of b in order to obtain its probability density…
MaxPlank
  • 125
  • 2
  • 9
3
votes
1 answer

Input to fit a power-law to degree distribution of a network

I would like to use R to test whether the degree distribution of a network behaves like a power-law with scale-free property. Nonetheless, I've read different people doing this in many different ways, and one confusing point is the input one should…
rafa.pereira
  • 10,729
  • 4
  • 59
  • 88