Given the formula for the density of the multivariate Gaussian:

$$f_Y(x)=\frac{1}{\sqrt{(2\pi)^n|\boldsymbol\Sigma|}} \exp\left(-\frac{1}{2}({x}-{m})^T{\boldsymbol\Sigma}^{-1}({x}-{m}) \right)$$

Can anybody tell me the rationale behind using the determinant in the square root term in the denominator? In the univariate case you divide by the standard deviation, so why do you have to use the determinant in the multivariate case? What is the relation between a measure of deviation and the determinant of the covariance matrix?

eager2learn

3 Answers


Because we want the integral of the density over $\mathbb{R}^n$ to equal 1.

More precisely, to compute $\int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}({x}-{m})^T{\boldsymbol\Sigma}^{-1}({x}-{m}) \right)dx$, we first make the change of variables $y = \Sigma^{-1/2}(x-m)$, which gives

$$\int_{\mathbb{R}^n}\exp\left(-\frac{1}{2}({x}-{m})^T{\boldsymbol\Sigma}^{-1}({x}-{m}) \right)dx = {\sqrt{|\Sigma|}}\int_{\mathbb{R}^n}\exp\left(-\dfrac{1}{2}|y|^2\right)dy$$

The remaining integral $\int_{\mathbb{R}^n}\exp\left(-\dfrac{1}{2}|y|^2\right)dy$ is a product of $n$ one-dimensional Gaussian integrals and equals ${\sqrt{(2\pi)^n}}$, so the left-hand side equals $\sqrt{(2\pi)^n|\Sigma|}$. Dividing by this number is what makes the density integrate to 1, and that is why $\sqrt{|\Sigma|}$ appears.
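The normalization claim can be checked numerically. The following is a minimal sketch, not part of the original answer: the mean `m`, the covariance `Sigma`, and the grid settings are hypothetical values chosen for illustration, and the integral is approximated by a plain Riemann sum in two dimensions.

```python
import numpy as np

# Hypothetical example values (assumptions, not from the thread).
m = np.array([1.0, -0.5])
Sigma = np.array([[2.0, 0.6],
                  [0.6, 1.0]])
Sigma_inv = np.linalg.inv(Sigma)

# Uniform grid centred on m, wide enough that the tails are negligible.
half_width, points = 12.0, 801
offsets = np.linspace(-half_width, half_width, points)
step = offsets[1] - offsets[0]
x1, x2 = np.meshgrid(m[0] + offsets, m[1] + offsets, indexing="ij")
d = np.stack([x1 - m[0], x2 - m[1]], axis=-1)  # d[i, j] = x - m at node (i, j)

# Quadratic form (x - m)^T Sigma^{-1} (x - m) at every grid node.
quad = np.einsum("...i,ij,...j->...", d, Sigma_inv, d)

# Riemann sum of the unnormalised integrand exp(-quad / 2).
numeric = np.exp(-0.5 * quad).sum() * step**2

# Value predicted by the change of variables: sqrt((2 pi)^n |Sigma|) with n = 2.
closed_form = np.sqrt((2 * np.pi) ** 2 * np.linalg.det(Sigma))

print(numeric, closed_form)
```

The two printed numbers should agree closely, which is consistent with $\sqrt{(2\pi)^n|\Sigma|}$ being the constant to divide by.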

Petite Etincelle
  • Thanks for the answer. Can you explain why you can factor out the square root of the determinant from the integral? – eager2learn Oct 18 '14 at 15:46
  • @eager2learn roughly speaking, if $y = \Sigma^{-1/2}(x-m)$, i.e. $x = \Sigma^{1/2}y + m$, then $dx = |\Sigma^{1/2}|\,dy = \sqrt{|\Sigma|}\,dy$. This is the rule for a change of variables in several dimensions: the volume element picks up the determinant of the Jacobian, see [here](http://en.wikipedia.org/wiki/Integration_by_substitution) – Petite Etincelle Oct 18 '14 at 15:50
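
As a small worked instance of this change-of-variables rule (the numbers are illustrative assumptions, not from the thread), take $n = 2$ and $\Sigma = \operatorname{diag}(4, 9)$, so that $\Sigma^{1/2} = \operatorname{diag}(2, 3)$. Then $x = \Sigma^{1/2}y + m$ reads $x_1 = 2y_1 + m_1$, $x_2 = 3y_2 + m_2$, and

$$dx_1\,dx_2 = \left|\det\begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}\right| dy_1\,dy_2 = 6\,dy_1\,dy_2 = \sqrt{36}\,dy_1\,dy_2 = \sqrt{|\Sigma|}\,dy_1\,dy_2.$$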

In the diagonal case, the determinant of $\Sigma$ is the product of its diagonal entries, and each diagonal entry is the variance $\sigma_i^2$ of the $i$th coordinate. Hence dividing by $\sqrt{|\Sigma|}$ amounts to dividing by the product of the standard deviations $\sigma_i$, as desired.

Put differently, $\sqrt{(2\pi)^n|\Sigma|}$ is the product of the one-dimensional normalization factors $\sqrt{2\pi\sigma_i^2}$, hence it is the correct normalization factor.

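The general, non-diagonal, case is analogous: rotating to an orthonormal eigenbasis of $\Sigma$ (a change of variables whose Jacobian determinant has absolute value 1) makes the covariance diagonal, with the eigenvalues of $\Sigma$ on the diagonal, and their product is again $|\Sigma|$.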
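
The diagonal argument is easy to check numerically. The sketch below is an illustration under stated assumptions: the standard deviations in `sigmas` are hypothetical, and the rotation `Q` is a random orthogonal matrix used only to show that a rotated (non-diagonal) covariance has the same determinant, and therefore the same normalization factor.

```python
import numpy as np

# Hypothetical standard deviations of three independent coordinates.
sigmas = np.array([0.5, 2.0, 3.0])
n = sigmas.size
Sigma_diag = np.diag(sigmas**2)

# Product of the one-dimensional normalisation factors sqrt(2 pi sigma_i^2).
product_of_1d = np.prod(np.sqrt(2 * np.pi * sigmas**2))

# Multivariate normalisation factor sqrt((2 pi)^n |Sigma|).
joint_diag = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma_diag))

# Rotate the diagonal covariance with an orthogonal matrix Q (from a QR
# factorisation of a random matrix); the determinant is unchanged.
rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Sigma_rot = Q @ Sigma_diag @ Q.T
joint_rot = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma_rot))

print(product_of_1d, joint_diag, joint_rot)
```

All three printed values should coincide up to floating-point rounding.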

Did

Here's one way to look at it. Write $u = \boldsymbol{\Sigma}^{-1/2}x$ and $\ell = \boldsymbol{\Sigma}^{-1/2}m$. Since $1/\sqrt{|\boldsymbol\Sigma|} = |\boldsymbol{\Sigma}^{-1/2}|$, \begin{align} f_Y(x) \, dx & =\frac{1}{\sqrt{(2\pi)^n}} \exp\left(-\frac{1}{2}\Big({\boldsymbol\Sigma}^{-1/2}(x-m)\Big)^T \Big( {\boldsymbol\Sigma}^{-1/2} (x-m) \Big) \right) \, |\boldsymbol{\Sigma}^{-1/2}| \, dx \\[12pt] & = \frac{1}{\sqrt{(2\pi)^n}} \exp\left( -\frac 1 2 (u-\ell)^T(u-\ell) \right) \, du \end{align}

An infinitely small volume $dx = dx_1\cdots dx_n$ is transformed by multiplication by the matrix $\boldsymbol{\Sigma}^{-1/2}$ into a region of volume $du = |\boldsymbol{\Sigma}^{-1/2}|\,dx_1\cdots dx_n$. That is what determinants of matrices do: they give the factor by which a linear map scales volumes.

For this to make sense it is necessary to know that $\boldsymbol{\Sigma}$ and its inverse are positive-definite symmetric matrices and that every such matrix has a positive-definite symmetric square root. That follows from the spectral theorem of linear algebra.
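
The square root promised by the spectral theorem can be built explicitly from an eigendecomposition. The following is a minimal sketch with a hypothetical positive-definite `Sigma`, mean `m`, and test point `x`; it uses `numpy.linalg.eigh`, which handles symmetric matrices, and checks the properties used in this answer.

```python
import numpy as np

# Hypothetical symmetric positive-definite covariance matrix.
Sigma = np.array([[2.0, 0.6, 0.2],
                  [0.6, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

# Spectral theorem: Sigma = Q diag(eigvals) Q^T with Q orthogonal.
eigvals, Q = np.linalg.eigh(Sigma)
assert np.all(eigvals > 0), "Sigma must be positive definite"

# Symmetric positive-definite square root and its inverse.
root = Q @ np.diag(np.sqrt(eigvals)) @ Q.T
inv_root = Q @ np.diag(1.0 / np.sqrt(eigvals)) @ Q.T

# Volume scaling factor |Sigma^{-1/2}| versus 1 / sqrt(|Sigma|).
det_inv_root = np.linalg.det(inv_root)
expected = 1.0 / np.sqrt(np.linalg.det(Sigma))

# The substitution turns the quadratic form into a squared Euclidean norm.
m = np.array([1.0, 0.0, -1.0])   # hypothetical mean
x = np.array([0.3, 2.0, 0.5])    # hypothetical evaluation point
u_minus_l = inv_root @ (x - m)
quad_original = (x - m) @ np.linalg.inv(Sigma) @ (x - m)
quad_whitened = u_minus_l @ u_minus_l

print(det_inv_root, expected)
print(quad_original, quad_whitened)
```

Each printed pair should match up to rounding, which mirrors the two facts used above: $|\boldsymbol{\Sigma}^{-1/2}| = 1/\sqrt{|\boldsymbol\Sigma|}$ and $(x-m)^T\boldsymbol\Sigma^{-1}(x-m) = (u-\ell)^T(u-\ell)$.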

Michael Hardy