
Show that the entropy of the multivariate Gaussian $N(x|\mu,\Sigma)$ is given by \begin{align} H[x] = \frac12\ln|\Sigma| + \frac{D}{2}(1 + \ln(2\pi)) \end{align} where $D$ is the dimensionality of $x$.

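As a quick sanity check of the target formula, the one-dimensional case $D = 1$, $\Sigma = \sigma^2$ recovers the familiar univariate result:

\begin{align} H[x] = \frac12\ln\sigma^2 + \frac12(1 + \ln(2\pi)) = \frac12\ln(2\pi e\sigma^2). \end{align}
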
My solution.

The entropy of the normal distribution:

\begin{align}
H[x] &= -\int_{-\infty}^{+\infty}N(x|\mu,\Sigma)\ln N(x|\mu,\Sigma)\, dx &&\text{by definition of (differential) entropy}\\
&= -E[\ln N(x|\mu,\Sigma)]\\
&= -E\left[\ln\left((2\pi)^{-\frac{D}{2}} |\Sigma|^{-\frac12} e^{-\frac12(x - \mu)^T\Sigma^{-1}(x - \mu)}\right)\right] &&\text{definition of the multivariate Gaussian}\\
&= \frac{D}{2}\ln(2\pi) + \frac12\ln |\Sigma| + \frac12E[(x - \mu)^T\Sigma^{-1}(x - \mu)] &&\text{the log of a product is the sum of the logs.}
\end{align}

Consider the third term:

\begin{align}
\frac12E[(x - \mu)^T\Sigma^{-1}(x - \mu)] &= \frac12E[x^T\Sigma^{-1}x - x^T\Sigma^{-1}\mu - \mu^T\Sigma^{-1}x + \mu^T\Sigma^{-1}\mu]\\
&= \frac12E[x^T\Sigma^{-1}x] - \frac12E[2\mu^T\Sigma^{-1}x] + \frac12E[\mu^T\Sigma^{-1}\mu] &&\text{since $x^T\Sigma^{-1}\mu = \mu^T\Sigma^{-1}x$ ($\Sigma^{-1}$ is symmetric)}\\
&= \frac12E[x^T\Sigma^{-1}x] - \mu^T\Sigma^{-1}E[x] + \frac12\mu^T\Sigma^{-1}\mu\\
&= \frac12E[x^T\Sigma^{-1}x] - \mu^T\Sigma^{-1}\mu + \frac12\mu^T\Sigma^{-1}\mu &&\text{since $E[x] = \mu$}\\
&= \frac12E[x^T\Sigma^{-1}x] - \frac12\mu^T\Sigma^{-1}\mu
\end{align}

How can I simplify the term $E[x^T\Sigma^{-1}x]$?

Andreo
  1. When working with Gaussians, usually it makes for easier integrals if one leaves terms like $(x- \mu)$ alone instead of pulling $\mu$ out. 2. Assuming you follow $1$, try the substitution $z = \Sigma^{-1/2} (x - \mu)$ in the integral $\int_{\mathbb{R}^D} (x - \mu)^T \Sigma^{-1} (x-\mu) \exp \left( -\tfrac12 (x - \mu)^T \Sigma^{-1} (x-\mu) \right) \,\mathrm{d} x$ (note that $\Sigma$ is positive definite, which makes life kinda easy here). – stochasticboy321 Nov 25 '16 at 05:26
  • Thanks! I got it: $\Sigma^{-1} = \sum_{i=1}^D \frac{1}{\lambda_i} e_i e_i^T$. Then $(x - \mu)^T\Sigma^{-1}(x - \mu) = \sum_{i=1}^D \frac{1}{\lambda_i} (x - \mu)^T e_i e_i^T (x - \mu) = \sum_{i=1}^D \frac{y_i^2}{\lambda_i}$, where $y_i = e_i^T (x - \mu)$ is a scalar. Then we can switch to the $y_i$ coordinates, and after simplification the expectation comes out to just $D$ (a numerical check of this decomposition is sketched after these comments). – Andreo Nov 26 '16 at 07:39
  • @Andreo What are your $\lambda_i$? It seems like you're claiming that $\Sigma^{-1}$ is diagonal. – Eric Auld Feb 14 '18 at 03:51
  • @EricAuld $\lambda_i$ is an eigenvalue. $\Sigma^{-1}$ can be factorized as $U \Lambda U^T$, where $\Lambda$ is a diagonal matrix of eigenvalues and the columns of $U$ are the eigenvectors. – Andreo Feb 15 '18 at 21:42
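
A minimal numerical sketch of the decomposition described in the comments above, assuming NumPy is available (the covariance matrix and the vector below are made-up examples, not from the thread):

```python
import numpy as np

D = 4
rng = np.random.default_rng(1)
A = rng.standard_normal((D, D))
Sigma = A @ A.T + D * np.eye(D)      # a made-up positive-definite covariance
x_minus_mu = rng.standard_normal(D)  # an arbitrary (x - mu) vector

# Sigma = U diag(lambda_i) U^T, so Sigma^{-1} = sum_i (1/lambda_i) e_i e_i^T
lam, U = np.linalg.eigh(Sigma)       # eigenvalues lam, eigenvectors as columns of U
y = U.T @ x_minus_mu                 # y_i = e_i^T (x - mu)

quad_direct = x_minus_mu @ np.linalg.inv(Sigma) @ x_minus_mu
quad_eigen = np.sum(y**2 / lam)      # sum_i y_i^2 / lambda_i
print(quad_direct, quad_eigen)       # the two values agree
```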

1 Answer


It's better to simplify the term $\mathbb{E}[(x-\mu)^T \Sigma^{-1}(x-\mu)]$ directly:

$$ \begin{align} \mathbb{E}[(x-\mu)^T \Sigma^{-1}(x-\mu)] &= \mathbb{E}[\mathrm{tr}((x-\mu)^T \Sigma^{-1}(x-\mu))]\\ &= \mathbb{E}[\mathrm{tr}(\Sigma^{-1}(x-\mu)(x-\mu)^T)]\\ &= \mathrm{tr}(\mathbb{E}[\Sigma^{-1}(x-\mu)(x-\mu)^T])\\ &= \mathrm{tr}(\Sigma^{-1}\mathbb{E}[(x-\mu)(x-\mu)^T])\\ &= \mathrm{tr}(\Sigma^{-1}\Sigma)\\ &= \mathrm{tr}(I)=D \end{align} $$
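
Substituting this back into the decomposition from the question gives $H[x] = \frac{D}{2}\ln(2\pi) + \frac12\ln|\Sigma| + \frac{D}{2} = \frac12\ln|\Sigma| + \frac{D}{2}(1 + \ln(2\pi))$, as required. Below is a minimal numerical sketch of this result, assuming NumPy and SciPy are available (the covariance matrix is a made-up example):

```python
import numpy as np
from scipy.stats import multivariate_normal

D = 3
rng = np.random.default_rng(0)
A = rng.standard_normal((D, D))
Sigma = A @ A.T + D * np.eye(D)  # a made-up positive-definite covariance
mu = np.zeros(D)

# Closed-form entropy from the derivation above
H_closed = 0.5 * np.log(np.linalg.det(Sigma)) + 0.5 * D * (1 + np.log(2 * np.pi))

# SciPy's built-in differential entropy for the same Gaussian
H_scipy = multivariate_normal(mean=mu, cov=Sigma).entropy()

# Monte Carlo check that E[(x - mu)^T Sigma^{-1} (x - mu)] = D
X = rng.multivariate_normal(mu, Sigma, size=200_000)
quad = np.einsum('ni,ij,nj->n', X - mu, np.linalg.inv(Sigma), X - mu)

print(H_closed, H_scipy)  # the two entropy values agree
print(quad.mean())        # close to D
```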

Kiuhnm
  • Thanks, I added your snippet and the whole proof [here](https://statproofbook.github.io/P/mvn-dent). – Joram Soch Aug 26 '20 at 04:12
  • Could you help formulate a closed-form analytical solution for **weighted multivariate Gaussian entropy**? https://math.stackexchange.com/questions/3905805/how-to-formulate-and-simplify-weighted-differential-entropy-for-gaussian-random – develarist Nov 13 '20 at 16:04