
Is there an exact or good approximate expression for the expectation, variance or other moments of the maximum of $n$ independent, identically distributed gaussian random variables where $n$ is large?

If $F$ is the cumulative distribution function for a standard gaussian and $f$ is the probability density function, then the CDF for the maximum is (from the study of order statistics) given by

$$F_{\rm max}(x) = F(x)^n$$

and the PDF is

$$f_{\rm max}(x) = n F(x)^{n-1} f(x)$$

so it's certainly possible to write down integrals which evaluate to the expectation and other moments, but it's not pretty. My intuition tells me that the expectation of the maximum would be proportional to $\log n$, although I don't see how to go about proving this.
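The integral for the expectation can at least be evaluated numerically. The sketch below is an illustration only: it applies a plain trapezoidal rule to $x\, f_{\rm max}(x)$ using Python's standard library, and the function name `expected_max`, the integration window and the step count are assumptions chosen for the example. It prints the result next to $\log n$ and $\sqrt{2 \log n}$ so the growth rates can be compared.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian: mean 0, standard deviation 1


def expected_max(n, lo=-10.0, hi=10.0, steps=20_000):
    """Approximate E[max of n i.i.d. N(0,1)] as the integral of x * n * F(x)^(n-1) * f(x).

    Composite trapezoidal rule on [lo, hi]; the integrand is negligible
    outside this window for the sample sizes used below.
    """
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        x = lo + k * h
        weight = 0.5 if k in (0, steps) else 1.0  # trapezoid end-point weights
        density = n * STD_NORMAL.cdf(x) ** (n - 1) * STD_NORMAL.pdf(x)
        total += weight * x * density
    return total * h


if __name__ == "__main__":
    print("n, E[max], log n, sqrt(2 log n)")
    for n in (10, 100, 1_000, 10_000):
        print(
            n,
            round(expected_max(n), 4),
            round(math.log(n), 4),
            round(math.sqrt(2 * math.log(n)), 4),
        )
```

For $n = 2$ the exact value is $1/\sqrt{\pi}$, which gives a quick sanity check on the quadrature.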

Chris Taylor
  • I presume you are interested in the large $n$ regime ? – Sasha Dec 06 '11 at 21:26
  • @Sasha yes, I'll edit to include that – Chris Taylor Dec 06 '11 at 21:38
  • You might be interested in this related question: [Does exceptionalism persist as sample size gets large?](http://math.stackexchange.com/questions/24743/does-exceptionalism-persist-as-sample-size-gets-large) – Mike Spivey Dec 07 '11 at 05:28
  • Note: the answers to [this related question](http://cstheory.stackexchange.com/questions/14530/balls-and-bins-analysis-in-the-m-gg-n-regime-gaps) on cstheory.stackexchange are useful in answering your question. – Neal Young Dec 04 '12 at 23:31
  • More generally, the expectation and variance of the range depends on how fat the tail of your distribution is. For the variance, it is $O(n^{-B})$ where $B$ depends on your distribution ($B = 2$ for uniform, $B = 1$ for Gaussian, and $B = 0$ for exponential.) – Vincent Granville May 24 '19 at 23:30

2 Answers

How precise an answer are you looking for? Giving (upper) bounds on the maximum of i.i.d Gaussians is easier than precisely characterizing its moments. Here is one way to go about this (another would be to combine a tail bound on Gaussian RVs with a union bound).

Let $X_i$ for $i = 1,\ldots,n$ be i.i.d. $\mathcal{N}(0,\sigma^2)$, and define

$$ Z = \max_{i} X_i. $$

By Jensen's inequality, for any $t > 0$,

$$\exp \{t\mathbb{E}[ Z] \} \leq \mathbb{E} \exp \{tZ\} = \mathbb{E} \max_i \exp \{tX_i\} \leq \sum_{i = 1}^n \mathbb{E} [\exp \{tX_i\}] = n \exp \{t^2 \sigma^2/2 \}$$

where the last equality follows from the definition of the Gaussian moment generating function (a bound for sub-Gaussian random variables also follows by this same argument).

Rewriting this,

$$\mathbb{E}[Z] \leq \frac{\log n}{t} + \frac{t \sigma^2}{2} $$
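The bound holds for every $t > 0$, so $t$ is a free parameter that can be tuned. As a short worked step (the name $g$ for the right-hand side is introduced here for illustration only), the right-hand side is minimized where its derivative vanishes:

$$ \begin{aligned} g(t) &= \frac{\log n}{t} + \frac{t \sigma^2}{2}, \\ g'(t) &= -\frac{\log n}{t^2} + \frac{\sigma^2}{2} = 0 \quad\Longrightarrow\quad t^\ast = \frac{\sqrt{2 \log n}}{\sigma}, \\ g(t^\ast) &= \frac{\sigma \sqrt{2 \log n}}{2} + \frac{\sigma \sqrt{2 \log n}}{2} = \sigma \sqrt{2 \log n}. \end{aligned} $$

Since $g''(t) = 2 \log n / t^3 > 0$ for $t > 0$ and $n > 1$, this stationary point is a minimum.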

Now, set $t = \frac{\sqrt{2 \log n}}{\sigma}$ to get

$$\mathbb{E}[Z] \leq \sigma \sqrt{ 2 \log n} $$
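As a rough numerical check of this bound, the following sketch estimates $\mathbb{E}[Z]$ by Monte Carlo and prints it next to $\sigma\sqrt{2\log n}$. It is an illustration under stated assumptions, not part of the proof: the helper names `mean_of_max` and `jensen_bound`, the trial count and the fixed seed are all choices made for the example.

```python
import math
import random


def mean_of_max(n, sigma=1.0, trials=2_000, seed=0):
    """Monte Carlo estimate of E[max_i X_i] for n i.i.d. N(0, sigma^2) draws."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same estimate
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, sigma) for _ in range(n))
    return total / trials


def jensen_bound(n, sigma=1.0):
    """Upper bound sigma * sqrt(2 log n) derived above."""
    return sigma * math.sqrt(2.0 * math.log(n))


if __name__ == "__main__":
    for n in (10, 100, 1_000):
        print(n, round(mean_of_max(n, sigma=2.0), 3), round(jensen_bound(n, sigma=2.0), 3))
```

The estimate should sit below the bound for each $n$, with a gap that reflects how loose the union-style step $\max_i \leq \sum_i$ is.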

Sivaraman
  • The reason Sivaraman set $t = \sqrt{2\log n}/\sigma$ is that this is the point at which the upper bound is at a minimum. You can see this by taking the derivative of the bound with respect to $t$ and setting it to zero. – SigmaX Nov 02 '14 at 17:15
  • I find it interesting that this doesn't need the independence assumption. – Arun Dec 10 '14 at 18:12
  • Can we similarly prove the lower bound? I've been trying to use this hint in one of my exercises: $P(Z\geq t) = 1- P(X_1 \leq t)^n$. – pikachuchameleon Mar 25 '16 at 15:16
  • This uses the Cramér-Chernoff method. For completeness and reference, the proof provided above appears as a special case in Pascal Massart: "Concentration inequalities and model selection", p. 17f, http://link.springer.com/10.1007/978-3-540-48503-2 – user32849 Jul 07 '17 at 15:56
  • May I ask what is the variance of $Z$ in that case? – nullgeppetto Dec 14 '18 at 03:04
  • empirically, with 1.3-1.5 instead of 2 it seems to be a _very_ accurate estimate of the mean for large n! – Ben Usman Jun 02 '19 at 04:06
  • Can we also show a lower bound? – Daniel Xiang Jun 20 '19 at 18:27
  • Here's a proof of a lower bound: http://www.gautamkamath.com/writings/gaussian_max.pdf – Uthsav Chitra Jun 26 '19 at 02:10
  • @Arun maybe because the independence gives the worst case upper bound? – John Jiang Jul 29 '20 at 00:04

The $\max$-central limit theorem (Fisher-Tippett-Gnedenko theorem) can be used to provide a decent approximation when $n$ is large. See this example on the reference page for the extreme value distribution in Mathematica.

The $\max$-central limit theorem states that $F_\max(x) = \left(\Phi(x)\right)^n \approx F_{\text{EV}}\left(\frac{x-\mu_n}{\sigma_n}\right)$, where $F_{\text{EV}}(x) = \exp(-\exp(-x))$ is the cumulative distribution function for the extreme value distribution, and $$ \mu_n = \Phi^{-1}\left(1-\frac{1}{n} \right) \qquad \qquad \sigma_n = \Phi^{-1}\left(1-\frac{1}{n} \cdot \mathrm{e}^{-1}\right)- \Phi^{-1}\left(1-\frac{1}{n} \right) $$ Here $\Phi^{-1}(q)$ denotes the inverse cdf of the standard normal distribution.
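To get a feel for the quality of this approximation, the sketch below evaluates both sides at a few points using only Python's standard library. The function names `exact_max_cdf` and `gumbel_max_cdf` are made up for this illustration; $\mu_n$ and $\sigma_n$ are computed exactly as defined above.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: Phi is .cdf, Phi^{-1} is .inv_cdf


def exact_max_cdf(x, n):
    """Exact CDF of the maximum of n i.i.d. standard normals: Phi(x)^n."""
    return STD_NORMAL.cdf(x) ** n


def gumbel_max_cdf(x, n):
    """Extreme value approximation exp(-exp(-(x - mu_n) / sigma_n))."""
    mu_n = STD_NORMAL.inv_cdf(1.0 - 1.0 / n)
    sigma_n = STD_NORMAL.inv_cdf(1.0 - 1.0 / (math.e * n)) - mu_n
    return math.exp(-math.exp(-(x - mu_n) / sigma_n))


if __name__ == "__main__":
    n = 1_000
    for x in (2.5, 3.0, 3.5, 4.0, 4.5):
        print(x, round(exact_max_cdf(x, n), 4), round(gumbel_max_cdf(x, n), 4))
```

By construction the two curves nearly coincide at $x = \mu_n$ and $x = \mu_n + \sigma_n$; away from those points the agreement is only approximate and improves slowly with $n$.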

The mean of the maximum of a normal sample of size $n$, for large $n$, is well approximated by the mean $\mu_n + \gamma \sigma_n$ of the extreme value distribution with the parameters above: $$ \begin{aligned} m_n &= (1-\gamma)\, \Phi^{-1}\left(1-\frac{1}{n}\right) + \gamma\, \Phi^{-1}\left(1-\frac{1}{e n}\right) \\ &= \sqrt{\log \left(\frac{n^2}{2 \pi \log \left(\frac{n^2}{2\pi} \right)}\right)} \cdot \left(1 + \frac{\gamma}{2 \log (n)} + o \left(\frac{1}{\log (n)} \right) \right) \end{aligned}$$ where $\gamma$ is the Euler-Mascheroni constant.
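A minimal sketch of this approximation in Python follows. It relies on the standard fact that an extreme value (Gumbel) distribution with location $\mu_n$ and scale $\sigma_n$ has mean $\mu_n + \gamma\sigma_n$; the names `gumbel_parameters` and `approx_mean_of_max` are hypothetical helpers introduced for the example.

```python
import math
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant
PHI_INV = NormalDist().inv_cdf    # inverse cdf of the standard normal


def gumbel_parameters(n):
    """Return (mu_n, sigma_n) as defined in the max-central limit theorem above."""
    mu_n = PHI_INV(1.0 - 1.0 / n)
    sigma_n = PHI_INV(1.0 - 1.0 / (math.e * n)) - mu_n
    return mu_n, sigma_n


def approx_mean_of_max(n):
    """Approximate E[max of n i.i.d. N(0,1)] by the Gumbel mean mu_n + gamma * sigma_n."""
    mu_n, sigma_n = gumbel_parameters(n)
    return mu_n + EULER_GAMMA * sigma_n


if __name__ == "__main__":
    for n in (10, 100, 1_000, 10_000, 100_000):
        print(n, round(approx_mean_of_max(n), 4), round(math.sqrt(2 * math.log(n)), 4))
```

The second printed column is the cruder $\sqrt{2\log n}$ scale, included only for comparison; the Gumbel-based value is noticeably smaller for moderate $n$.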

Thomas Ahle
Sasha
  • +1. See also Section 10.5 ("The Asymptotic Distribution of the Extreme") in David and Nagaraja's *Order Statistics*. They explicitly discuss the normal distribution on page 302. – Mike Spivey Dec 06 '11 at 22:35
  • @MikeSpivey Yes, I meant $\max$-central limit theorem. I have edited the post to make that precise. Thank you. – Sasha Dec 06 '11 at 22:51
  • Doesn't the inverse cdf have domain $[0,1]$? – Geoffrey Irving Dec 30 '12 at 07:01
  • @GeoffreyIrving Thanks for catching this. It is a typo. – Sasha Dec 30 '12 at 14:39
  • (+1) Two comments: (1) The somewhat nonstandard use of $Q$ for the inverse normal is a little unfortunate given that it *is* a standard notation in some contexts for the *upper-tail distribution* of the standard normal $\mathbb P(Z \geq z)$. I would suggest $\Phi^{-1}$ instead. (2) As you know, convergence in distribution doesn't imply convergence of moments, in general; but, in the case of extreme values of iid random variables it does (curiously enough). This was proved in [Pickands (1968)](http://projecteuclid.org/euclid.aoms/1177698320). – cardinal Dec 30 '12 at 16:19
  • An example illustrating the second point can be found [here](http://math.stackexchange.com/questions/153293/does-convergence-in-distribution-implies-convergence-of-expectation). – Sasha Dec 17 '14 at 05:37
  • Unless I misunderstood something, the first line of your expression for $m_n$ is negative – Glen_b Jan 16 '17 at 02:29
  • The first expression for $m_n$ should be $(1-\gamma)\Phi^{-1}(1-1/n) + \gamma\Phi^{-1}(1-1/(en))$, which is the mean of the extreme value distribution with the given parameters $\mu_n$ and $\sigma_n$. – Twan van Laarhoven Jan 09 '18 at 16:46
  • Can we say something about the speed of convergence? The $1+O(1/\log n)$ convergence for $m_n$ looks nice, but to use that as an approximation we need to assume $F_{\max_n}(x) = F_{\text{EV}}(\frac{x-\mu_n}{\sigma_n})$ rather than just $\approx$. – Thomas Ahle Jul 14 '20 at 16:39
  • Nice answer! I was wondering whether there is a standard approximation of this kind for $F_{\min}(x)$, the CDF of the minimum of $\{X_1, \dots, X_n\}$ with $X_i \sim_{iid} \mathcal{N}(0,1)$? We know that $F_{\min}(x)= 1 - (1 - \Phi(x))^n$, and we note from your answer that for large $n$, $\Phi(x)^n \approx \exp\left(-\exp\left(-\frac{x- \mu_n}{\sigma_n}\right)\right)$. So to approximate $F_{\min}(x)$, should we just approximate $\Phi(x) = \left(\exp\left(-\exp\left(-\frac{x- \mu_n}{\sigma_n}\right)\right)\right)^{1/n}$ and then plug this into $F_{\min}(x)= 1 - (1 - \Phi(x))^n$? – Learning Math Aug 21 '20 at 20:24