
I'm a beginner in mathematics and there is one thing that I've been wondering about recently. The formula for the normal distribution is:

$$f(x)=\frac{1}{\sqrt{2\pi\sigma^2}}\,e^{-\frac{(x-\mu)^2}{2\sigma^2}}.$$

But what are $e$ and $\pi$ doing there? $\pi$ is about circles, for example the ratio of a circle's circumference to its diameter. $e$ is mostly about exponential functions, specifically about the fact that $\frac{\mathrm{d}}{\mathrm{d}x} e^x = e^x$.

I'm sure proofs and articles on this are available, but could someone perhaps shed some light on it and explain, in more informal language, what these constants stand for here?

I'm very curious to know as those numbers have very different meanings as far as I'm concerned.

Arnav Borborah
pimvdb
  • It's probably worth bearing in mind that, as $e^{2\pi i}=1$ (see http://en.wikipedia.org/wiki/Euler%27s_identity), the numbers $e$ and $2\pi$ are actually very closely linked. – George Lowther Mar 22 '11 at 21:51
  • $\pi$ and $e$ have the same meaning as usual, i.e. $\pi \approx 3.1415926$ and $e \approx 2.718$. – Alex Becker Mar 22 '11 at 21:52
  • @George Lowther: Thanks very much for your link. @Alex: I see, but I was actually asking what the connection is between a circle/the number $e$ and the normal distribution. – pimvdb Mar 22 '11 at 22:01
  • @George: I don't think that Euler's identity is of any help here. See also my answer below. – vonjd Aug 07 '11 at 15:11

9 Answers


So I think you want to know "why" $\pi$ and $e$ appear here based on an explanation that goes back to circles and natural logarithms, which are the usual contexts in which one first sees these.

If you see $\pi$, you think there's a circle hidden somewhere. And in fact there is. As has been pointed out, in order for this expression to give a probability density you need $\int_{-\infty}^\infty f(x) \: dx = 1$. (I'm not sure how much you know about integrals -- this just means that the area between the graph of $f(x)$ and the $x$-axis is 1.) But it turns out that this can be derived from $\int_{-\infty}^\infty e^{-x^2} dx = \sqrt{\pi}$.

And it turns out that this is true because the square of this integral is $\pi$. Now, why should the square of this integral have anything to do with circles? Because it's the total volume between the graph of $e^{-(x^2+y^2)}$ (as a function $g(x,y)$ of two variables) and the $xy$-plane. And of course $x^2+y^2$ is just the square of the distance of $(x,y)$ from the origin -- so the volume I just mentioned is rotationally symmetric. (If you know about multiple integration, see the Wikipedia article "Gaussian integral", under the heading "brief proof" to see this volume worked out.)
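This fact is easy to check numerically. Here is a quick Python sanity check (an illustration, not part of the original argument): a midpoint-rule approximation of the integral over $[-10, 10]$, where the neglected tails are smaller than $e^{-100}$.

```python
import math

# Midpoint-rule approximation of the Gaussian integral of exp(-x^2)
# over [-10, 10]; the tails outside this interval are negligible.
n, a, b = 200_000, -10.0, 10.0
h = (b - a) / n
total = sum(math.exp(-((a + (i + 0.5) * h) ** 2)) for i in range(n)) * h

print(total)               # ≈ 1.7724538509...
print(math.sqrt(math.pi))  # ≈ 1.7724538509...
```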

As for where $e$ comes from -- perhaps you've seen that the normal probability density can be used to approximate the binomial distribution. In particular, the probability that if we flip $n$ independent coins, each of which has probability $p$ of coming up heads, we'll get $k$ heads is $$ {n \choose k} p^{k} (1-p)^{n-k} $$ where ${n \choose k} = n!/(k! (n-k)!)$. And then there's Stirling's approximation, $$ n! \approx \sqrt{2\pi n} (n/e)^{n}. $$ So if you can see why $e$ appears here, you see why it appears in the normal.

Now, we can take logs of both sides of $n! = 1 \cdot 2 \cdot \ldots \cdot n$ to get $$ \log (n!) = \log 1 + \log 2 + \cdots + \log n $$ and we can approximate the sum by an integral, $$ \log (n!) \approx \int_{1}^{n} \log t \: dt. $$ But the indefinite integral here is $t \log t - t$, and so we get the definite integral $$ \log (n!) \approx n \log n - n. $$ Exponentiating both sides gives $n! \approx (n/e)^n$.

This is off by a factor of $\sqrt{2\pi n}$ but at least explains the appearance of $e$ -- because there are logarithms in the derivation. This often occurs when we deal with probabilities involving lots of events, because we have to find products of many terms; we have a well-developed theory for sums of very large numbers of terms (basically, integration) which we can plug into by taking logs.
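If you want to see Stirling's approximation at work, here is a small numerical comparison (a sketch added for illustration, not part of the answer's derivation):

```python
import math

# Compare n! with Stirling's approximation sqrt(2*pi*n) * (n/e)^n.
# The cruder (n/e)^n from the log-integral argument misses the
# sqrt(2*pi*n) factor but captures the dominant growth.
for n in (5, 10, 20):
    exact = math.factorial(n)
    stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
    print(n, exact, round(stirling, 2), round(exact / stirling, 5))
```

The ratio of $n!$ to the Stirling value approaches 1 as $n$ grows (it is roughly $1 + 1/(12n)$), while dropping the $\sqrt{2\pi n}$ factor leaves a relative error that grows with $n$.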

hlapointe
Michael Lugo

One of the important operations in (continuous) probability is the integral. $e$ shows up there just because it's convenient. If you rearrange it a little you get $$ {1 \over \sqrt{2\pi \sigma^2}} (e^{1 \over 2\sigma^2})^{-(x-\mu)^2},$$ which makes it clear that the $e$ is just a convenient number that makes the initial constant relatively straightforward; using some other number in place of $e$ just rescales $\sigma$ in some way.
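That rescaling claim is easy to verify: since $b^t = e^{t \ln b}$, a density written with base $b$ and scale parameter $s$ is identical to the usual one with $\sigma = s/\sqrt{\ln b}$. A minimal check (the specific numbers are arbitrary):

```python
import math

# Using base b instead of e just rescales sigma: since b^t = e^{t*ln(b)},
# b^{-(x-mu)^2 / (2 s^2)} equals e^{-(x-mu)^2 / (2 sigma^2)}
# with sigma = s / sqrt(ln b).
b, s, mu, x = 2.0, 1.0, 0.0, 0.7
base_b = b ** (-((x - mu) ** 2) / (2 * s ** 2))

sigma = s / math.sqrt(math.log(b))
base_e = math.exp(-((x - mu) ** 2) / (2 * sigma ** 2))

print(base_b, base_e)  # identical up to floating-point rounding
```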

The $\pi$ is a little tougher to explain; the fact you just have to "know" (because it requires multivariate calculus to prove) is that $\int_{-\infty}^{\infty} e^{-x^2} dx = \sqrt{\pi}$. This is called the Gaussian integral, because Gauss came up with it. It's also why this distribution (with $\mu = 0, \sigma^2 = 1/2$) is called the Gaussian distribution. So that's why $\pi$ shows up in the constant, so that no matter what values you use for $\sigma$ and $\mu$, $\int_{-\infty}^{\infty} f(x) dx = 1$.
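As a numerical illustration of that last point (a Python sketch, not from the original answer), the density integrates to 1 no matter which $\mu$ and $\sigma$ you pick:

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and st. dev. sigma."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

# Midpoint-rule check that the total area is 1 for several (mu, sigma);
# integrating over mu +/- 10 sigma captures all but a negligible tail.
for mu, sigma in ((0.0, 1.0), (2.0, 0.5), (-3.0, 4.0)):
    a, b, n = mu - 10 * sigma, mu + 10 * sigma, 100_000
    h = (b - a) / n
    area = sum(normal_pdf(a + (i + 0.5) * h, mu, sigma) for i in range(n)) * h
    print(mu, sigma, round(area, 8))  # ≈ 1.0 each time
```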

Paul Z
  • Derivation of this formula is given at http://math.stackexchange.com/questions/9286/proving-int-0-infty-e-x2-dx-frac-sqrt-pi2/9292#9292 and much info is in Wikipedia at http://en.wikipedia.org/wiki/Gaussian_distribution – Ross Millikan Mar 22 '11 at 22:09

Let's consider the more general form $$ f(x) = C \alpha^{(x-\mu)^2}. $$ For that to be a probability distribution, we need $$ \int_{-\infty}^\infty f(x) \, dx = 1. $$ This gives us one constraint. Given this, the mean will always be $\mu$, by symmetry. If we want a variance of $\sigma^2$ then we need $$ \int_{-\infty}^\infty f(x) (x-\mu)^2 \, dx = \sigma^2. $$ This gives us another constraint, and allows us to solve for $C,\alpha$, and we get the formula that you mentioned.

More verbosely, if we ignore $C$ then the second equation reads $$ \int_{-\infty}^\infty \alpha^{(x-\mu)^2} (x-\mu)^2 \, dx = \sigma^2 \int_{-\infty}^\infty \alpha^{(x-\mu)^2} \, dx. $$ In other words, $$ \int_{-\infty}^\infty \alpha^{(x-\mu)^2} [(x-\mu)^2 - \sigma^2] \, dx = 0.$$ Clearly $\alpha$ doesn't depend on $\mu$, so we can put $\mu=0$: $$ \int_{-\infty}^\infty \alpha^{x^2} [x^2 - \sigma^2] \, dx = 0.$$ From this you can calculate that $\alpha = \exp(-1/(2\sigma^2))$. This calculation implicitly uses the fact that the indefinite integral of $\exp(x)$ is $\exp(x)$.
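One can check numerically that this value of $\alpha$ really does produce variance $\sigma^2$ (a quick sketch, with $C$ fixed implicitly by normalizing; the particular $\sigma$ is arbitrary):

```python
import math

# With alpha = exp(-1/(2*sigma^2)), the normalized density
# C * alpha^{x^2} (taking mu = 0) should have variance sigma^2.
sigma = 1.5
alpha = math.exp(-1 / (2 * sigma ** 2))

a, b, n = -15 * sigma, 15 * sigma, 200_000
h = (b - a) / n
xs = [a + (i + 0.5) * h for i in range(n)]

mass = sum(alpha ** (x * x) for x in xs) * h          # this is 1/C
var = sum((alpha ** (x * x)) * x * x for x in xs) * h / mass

print(var, sigma ** 2)  # both ≈ 2.25
```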

Having found $\alpha$, we can find $C$ by computing another integral. There's a trick for doing this that avoids using complex residues. This trick computes the square of $$ \int_{-\infty}^\infty \exp(-x^2/2) \, dx $$ by doing a polar change of variables. The factor $2\pi$ then comes out as the length of the interval of angles, basically the circumference of a unit circle.

Yuval Filmus
  • Why can we conclude that $\alpha$ doesn't depend on $\mu$, and how can you calculate the integral $\int_{-\infty}^{\infty} \alpha^{x^{2}}(x^2-\sigma^2) \, dx = 0$? Thanks for your explanation; it really helped me understand. – hlapointe Nov 20 '15 at 19:33
  • If we fix $C$ then the value of $\alpha$ that works doesn't depend on $\mu$, as you can see by the substitution $x':=x-\mu$. Regarding the integral, try integration by parts and/or substitution. – Yuval Filmus Nov 20 '15 at 19:53

There is nothing special about $e$ here - it just affects the horizontal scaling of the function and makes the calculations more convenient.

The interesting part is $\pi$. Every time it shows up you can suspect that circles are involved somewhere (even if deeply hidden within the structure). And indeed this is the case here!

To be precise, we are talking not about $\pi$ but about $\sqrt{\pi}$ here. So the circles show up when you square the function, which means going up one dimension. It follows that in order to see the circles we must look at the two-dimensional normal curve:

surface plot

When you look at the contour lines you will see the circles!

surface and contours

This will become even clearer when you have a look at the contour plot!

contour plot

Sources:
(1) The plots were created with WolframAlpha.
(2) Full details about the derivation can be found in this wonderful (and easy to follow) book: Strange curves, counting rabbits, and other mathematical explorations By Keith M. Ball, p. 100-105.

J. M. ain't a mathematician
vonjd
  • This is also interesting. What I also found surprising is that the complete volume under ${a}^{-(x^2+y^2)}$ is $\pi/\ln(a)$ - so $\pi$ and $e$ do have an interesting relation to each other. – pimvdb Aug 07 '11 at 15:05
  • Yes, this division by the logarithm is due to the horizontal rescaling I mentioned above. That a logarithm shows up is just a consequence of "bringing down" the exponent. The natural logarithm is a convenient closed form. But it is true: $e$ keeps showing up - I have to think about that but at the moment I think that is because of its deep connection to differentiation/integration in general, and that is what we are doing here after all. So I don't think that this special curve really connects $\pi$ and $e$. But always interesting to think about these basic ideas that are so fundamental! – vonjd Aug 07 '11 at 15:28
  • A further indication that there is no special connection between the two constants via this formula is when you use the general base $a$ but don't square the exponents: Integration won't give you $\pi$ because it is not rotationally symmetric any more (just plot it) - but $e$ still crops up (via the natural log) due to the integration operation. – vonjd Aug 07 '11 at 16:43

De Morgan and the Actuary

De Morgan was explaining to an actuary what was the chance that a certain proportion of some group of people would at the end of a given time be alive; and quoted the actuarial formula, involving $\pi$, which, in answer to a question, he explained stood for the ratio of the circumference of a circle to its diameter. His acquaintance, who had so far listened to the explanation with interest, interrupted him and exclaimed, 'My dear friend, that must be a delusion, what can a circle have to do with the number of people alive at a given time?'

-- W.W.R. Ball
Mathematical Recreations and Problems (1896), 180; See also De Morgan's Budget of Paradoxes (1872), 172.


GEdgar

S. P. Thompson:

Once when lecturing in class he [Lord Kelvin] used the word 'mathematician' and then, interrupting himself, asked his class: 'Do you know what a mathematician is?' Stepping to his blackboard he wrote upon it: $$\int_{-\infty}^\infty e^{-x^2}\,dx=\sqrt{\pi}$$ Then putting his finger on what he had written, he turned to his class and said, 'a mathematician is one to whom that is as obvious as that twice two makes four is to you.'


vonjd

$e$ and $\pi$ often show up in mathematics in a variety of areas. At times there's an intuitive and logical explanation, and at other times there aren't.

One interesting thing about the Gaussian, though: its Fourier transform is itself also a Gaussian. Since the frequency domain describes cyclic/periodic behavior, this says something about how any system that follows a Gaussian distribution behaves over time.

In particular, a Gaussian process is stationary in that a set of samples taken from one period of time should resemble a sample taken from a different period of time.

The Wikipedia article on the normal distribution says:

More generally, a normal distribution results from exponentiating a quadratic function (...): $f(x)=e^{a x^2 + b x + c}$

... where $a$ ends up being negative. What you're looking at is a function that calculates a probability, not a function that generates the random variable itself. Though I give it without any motivation, an expression in the Wikipedia article is somewhat illuminating (from the section near the end on sampling from the Gaussian distribution):

$\begin{align*}X&=\sqrt{-2 \ln(U)} \cos(2\pi V)\\Y&=\sqrt{-2\ln(U)} \sin(2\pi V)\end{align*}$

... where $U$ and $V$ are uniformly distributed on $(0,1]$.

This reveals the cyclic components I mentioned, and the radicand uses $\ln(x)$, the inverse of $e^x$.

Looking at a plot of the function for $X$ above (with a few terms removed) shows that the samples almost always have values near zero, with a very small portion of them diverging to $\pm\infty$. This is important because a Gaussian process will exhibit behavior near that of its mean value most of the time, and the argument I'm making supports this statement.
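The pair of formulas above is the Box-Muller transform, and you can run it directly to see this behavior (a sketch for illustration; the seed and sample count are arbitrary):

```python
import math
import random

def box_muller(rng):
    """One pair of independent standard normal samples via Box-Muller."""
    u = 1.0 - rng.random()          # in (0, 1], avoids log(0)
    v = rng.random()
    r = math.sqrt(-2.0 * math.log(u))
    return r * math.cos(2 * math.pi * v), r * math.sin(2 * math.pi * v)

rng = random.Random(0)
samples = [x for _ in range(50_000) for x in box_muller(rng)]

# The empirical mean and variance should be near 0 and 1,
# and most samples land close to the mean.
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(round(mean, 3), round(var, 3))  # close to 0 and 1
```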

Therefore, my interpretation (and I obviously give it without any proof) is this:

  • $e$ shows up as a consequence of the samples being damped by the $\ln(U)$ term.
  • $\pi$ shows up because the samples exhibit cyclic behavior.

There are technical reasons they show up, as pointed out by others, probably far more than have been listed. However, I'm a firm believer that math isn't just about being able to give a proof for something, but rather about understanding what the math actually describes and then being able to apply the concepts behind the proof to arrive at the result.

J. M. ain't a mathematician
Brian Vandenberg

I will supplement the derivation of the formula $$ \int_{\mathbb{R}}e^{-x^2} \, dx = \sqrt{\pi},$$ which, I think at least, is the center of your question. We start with $$ \left(\int_{\mathbb{R}}e^{-x^2} \, dx\right)^2 = \int_{\mathbb{R}}e^{-x^2} \, dx \int_{\mathbb{R}}e^{-y^2} \, dy = \int_{\mathbb{R}^2}e^{-(x^2+y^2)} \, dx \, dy. $$ Now we switch to polar coordinates: $$\int_{\mathbb{R}^2}e^{-(x^2+y^2)} \, dx \, dy = \int_{0}^{2 \pi}\underbrace{\int_{0}^{\infty} e^{-r^2} r \, dr}_{=1/2}\; d\theta = \pi.$$
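As a numerical cross-check of the two pieces (a sketch, not part of the derivation): the inner radial integral is $1/2$, and the angular factor $2\pi$ turns it into $\pi$.

```python
import math

# Midpoint-rule check of the inner radial integral
#   ∫_0^∞ e^{-r^2} r dr = 1/2,
# truncated at r = 12 where the integrand is vanishingly small.
n, b = 200_000, 12.0
h = b / n
inner = sum(math.exp(-(((i + 0.5) * h) ** 2)) * (i + 0.5) * h for i in range(n)) * h

print(inner)                # ≈ 0.5
print(2 * math.pi * inner)  # ≈ 3.14159... ≈ π
```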

Marc Palm
(Images only: "Gaussian Normal Distribution Proof", a derivation presented as pictures.)