Questions tagged [ecdf]

Empirical Cumulative Distribution Function in statistics

For a definition, please see its Wikipedia page.

In R, the built-in function ecdf takes a vector of samples and returns its ECDF. It is also easy to build one ourselves, as shown in this example: How to derive an ecdf function?
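As a rough illustration of building an ECDF by hand, here is a minimal sketch in Python using only the standard library. The helper name `make_ecdf` is hypothetical and not taken from the linked example. It assumes the usual definition, F(x) = (number of samples <= x) / n.

```python
from bisect import bisect_right


def make_ecdf(samples):
    """Return a function F where F(x) is the fraction of samples <= x."""
    sorted_samples = sorted(samples)
    n = len(sorted_samples)
    if n == 0:
        raise ValueError("need at least one sample")

    def F(x):
        # bisect_right gives the count of sorted values that are <= x
        return bisect_right(sorted_samples, x) / n

    return F


F = make_ecdf([3, 1, 2, 2])
values = [F(x) for x in (0, 1, 2, 2.5, 3, 10)]
print(values)
```

The returned function is a right-continuous step function that jumps at each distinct sample value. Under the same definition, the complementary CDF at x is 1 - F(x).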

131 questions
3
votes
2 answers

How to plot a Complementary Cumulative Distribution Function (CCDF) in R (preferably in ggplot)?

Here is my code and my output (CDF): install.packages("ggplot2") library(ggplot2) chol <- read.table(url("http://assets.datacamp.com/blog_assets/chol.txt"), header = TRUE) df <- data.frame(x = chol$AGE) ggplot(df, aes(x)) + stat_ecdf() I'd like…
Übel Yildmar
  • 411
  • 7
  • 21
3
votes
1 answer

Empirical CDF function `ecdf` does not work for an "xts" time series

I am trying to plot the Empirical CDF of the daily returns distribution of S&P500 data. Below is the code I am trying to use. But as soon as I try to plot the ECDF, the graph doesn't look anything like a CDF graph. Please help me understand what I…
Deb
  • 195
  • 1
  • 2
  • 9
3
votes
0 answers

unexpected endpoint behavior in ggplot2::stat_ecdf()

I have some data for a gain chart. (I have percentiles of modeled scores for all the target outcomes.) > dput(data) structure(list(obs_set = structure(c(1L, 1L, 2L, 2L, 2L, 2L, 1L, 1L, 1L, 2L, 1L, 1L, 2L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,…
C8H10N4O2
  • 15,256
  • 6
  • 74
  • 113
3
votes
1 answer

Reverse x axis in ecdf plot using ggplot

How can I reverse the x-axis of an ecdf plot using the ggplot() function (not the qplot() function)? The following code does not work: test1 <- structure(list(ID = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 9L, 10L, 51L, 52L, 53L, 54L, 55L, 56L, 57L, 58L, 59L,…
user30314
  • 143
  • 1
  • 10
3
votes
3 answers

Log Log Probability Chart in R

I'm sure this is easy, but I've been tearing my hair out trying to find out how to do this in R. I have some data that I am trying to fit to a power law distribution. To do this, you need to plot the data on a log-log cumulative probability chart.…
3
votes
2 answers

How do I extract ecdf values out of ecdfplot()

If I use the ecdfplot() function of the latticeExtra package, how do I get the actual values calculated, i.e. the y-values which correspond to the ~x|g input? I've been looking at ?ecdfplot but there's no description of it. For the usual high-level…
Druss2k
  • 305
  • 1
  • 5
  • 14
2
votes
1 answer

How to simulate data from a logit model

I have a logistic regression, and I would like to generate simulated data from the logit curve. My code is below: #Begin Code require(gld) runs<-100 num.trees<-500 p<-0.5 trial.1<-rgl(num.trees,1859.75592, 0.02179,…
jtgarcia
  • 21
  • 3
2
votes
2 answers

How to find the x value at an intersection point of axhline on a seaborn ecdf plot?

I have an ecdf plot like this: penguins = sns.load_dataset("penguins") fig, ax = plt.subplots(figsize = (10,8)) sns.ecdfplot(data=penguins, x="bill_length_mm", hue="species") ax.axhline(.25, linestyle = '--', color ='#cfcfcf', lw = 2, alpha =…
JaySabir
  • 156
  • 9
2
votes
0 answers

Plot ECDF without loading all data in memory

I need to plot the ECDF of some data. I found out I could do it with ecdf = sm.distributions.ECDF(sample) x = np.linspace(min(sample), max(sample)) y = ecdf(x) plt.step(x, y) using the matplotlib and statsmodels Python packages. My problem is…
2
votes
1 answer

Plot ecdf and density in the same plot and zoom in to specific part

I want to plot the density and ecdf in the same plot using ggplot2. I wrote some code here: library(ggplot2) library(reshape) set.seed(101) var1 = rnorm(1000, 0.5) var2 = rnorm(100000,0.5) combine = melt(data.frame("var1" = var1,"var2"=…
user3978632
  • 263
  • 3
  • 13
2
votes
1 answer

CDF beyond range of values in R ggplot2

I am trying to plot the CDF using ggplot2 in R and I get the following plot. But the min and max values of the data are 1947 and 2017. I do not want the line to be plotted beyond the range [1947, 2017]. ggplot(df, aes(x=year)) + stat_ecdf(geom="line")…
Dinesh
  • 2,104
  • 2
  • 24
  • 44
2
votes
1 answer

How do I scale the axes to the larger vector when plotting two ecdfs for comparison in R?

Initially I start out with 2 vectors (subsets of my data). I run ecdf on both, plot them in the same plot for ease of comparison. All of that is fine but what I need to know is how to make the function work universally for any pair of vectors, so I…
m9000
  • 35
  • 7
2
votes
1 answer

Create a ggplot2 stat_ecdf plot with standard error shading

I have data from three doses of a treatment, with three replicates per dose: df <-…
dan
  • 4,868
  • 4
  • 35
  • 85
2
votes
2 answers

Plot ECDF data with ggplot2

I have normalized data to plot as an ecdf, but I couldn't change the line shape, color, or legend info. My data is: EDCF.df <- structure(list(Length = c(11431L, 138250L, 109935L, 7615L, 5221L, 8741L, 9460L, 3102L, 2662L, 12286L, 5097L,…
eabanoz
  • 251
  • 2
  • 16
2
votes
2 answers

Calculate a percentile of dataframe column efficiently

This question is an extension to the StackOverflow question asked and answered here. My circumstances are different in that I want to calculate the percentile of each value within a vector of 50,000 (or more!) values. For example -- df <-…
AQS
  • 23
  • 1
  • 3