I was reading the Wikipedia article on the $n$-dimensional and functional generalizations of the Gaussian integral. In particular, I would like to understand how the following equations are derived: $$ \begin{eqnarray} & {} \quad \int x^{k_1}\cdots x^{k_{2N}} \, \exp\left( - \frac 1 2 \sum_{i,j=1}^{n}A_{ij} x_i x_j \right) \, d^nx \\ & = \sqrt{\frac{(2\pi)^n}{\det A}} \, \frac{1}{2^N N!} \, \sum_{\sigma \in S_{2N}}(A^{-1})^{k_{\sigma(1)}k_{\sigma(2)}} \cdots (A^{-1})^{k_{\sigma(2N-1)}k_{\sigma(2N)}} \end{eqnarray} $$


$$ \begin{eqnarray} \int f(\vec x) \, \exp\left( - \frac 1 2 \sum_{i,j=1}^{n}A_{ij} x_i x_j \right) d^nx = \sqrt{(2\pi)^n\over \det A} \, \left. \exp\left({1\over 2}\sum_{i,j=1}^{n}(A^{-1})_{ij}{\partial \over \partial x_i}{\partial \over \partial x_j}\right)f(\vec{x})\right|_{\vec{x}=0} \end{eqnarray} $$

The problem is that I can't find a book or online source that gives a complete derivation of these; any reference would be highly appreciated!

Another thing I am not sure about is what the complete symmetrization in the first equation means. If someone could give a short explanation (or a more detailed equation) of what the following expression means, that would be great:

$$ \begin{eqnarray} \sum_{\sigma \in S_{2N}}(A^{-1})^{k_{\sigma(1)}k_{\sigma(2)}} \cdots (A^{-1})^{k_{\sigma(2N-1)}k_{\sigma(2N)}} \end{eqnarray} $$

Many thanks !!

  • The first one doesn't seem to make a lot of sense; you could just combine all the powers of $x$ into one -- it looks like you're missing some indices there? – joriki Mar 30 '12 at 11:40
  • @joriki: that's a good question I copied it straight from wikipedia - I don't really understand the notation either that's why I would love to derive it myself – harlekin Mar 30 '12 at 11:46
  • It seems the first equation just uses superscripts instead of subscripts as indices. If you interpret all the superscripts as indices, you can derive the first equation from the second one using $f(\vec x)=x^{k_1}\cdots x^{k_{2N}}$. – joriki Mar 30 '12 at 12:07
  • Regarding your second question, I'm not sure how you were able to correctly identify this as "complete symmetrization" without knowing what that means :-) It means that the sum runs over all permutations of the numbers from $1$ to $2N$, so you get $(2N)!$ terms, each with one possible assignment of the numbers $1$ to $2N$ to the values $\sigma(1)$ to $\sigma(2N)$. – joriki Mar 30 '12 at 12:11
  • aaah ok that makes sense .. now I understand the superscripts in the complete symmetrization denote the entry in the matrix $A^{-1}$. Again I copied the expression from Wikipedia where it says that this is called a complete symmetrization. Sorry if I sounded as if I knew - I didn't. – harlekin Mar 30 '12 at 12:50
  • Well I checked that Wikipedia article before I wrote that and it doesn't say anything about complete symmetrization -- are you referring to a different Wikipedia article that you haven't told us about yet? – joriki Mar 30 '12 at 13:11
  • @joriki just checked it - you're right it doesn't say it there, I must have picked this up from another source, I was looking at various sites but decided to only state the Wikipedia here as it would have cluttered the question otherwise. – harlekin Mar 30 '12 at 17:02

3 Answers


The presentation here is typical of those used to model and motivate the infinite dimensional Gaussian integrals encountered in quantum field theory. I will use subscripts instead of superscripts to indicate components.

I. Wick's theorem

First consider the integral $$Z_0 = \int d^n x \exp\left(-\frac{1}{2} x^\mathrm{T} A x\right),$$ where $d^n x = \prod_i d x_i$ and $A$ is symmetric and positive definite. Diagonalize $A$ with an orthogonal transformation. Letting $D = M^\mathrm{T} A M = \mathrm{diag}(\lambda_1,\ldots,\lambda_n)$ and $z = M^\mathrm{T} x$, and noting that the Jacobian determinant is $1$, we find $$\begin{eqnarray} Z_0 &=& \int d^n z \exp\left(-\frac{1}{2} z^\mathrm{T} D z\right) \\ &=& \prod_i \int d z_i \exp\left(-\frac{1}{2} \lambda_i z_i^2\right) \\ &=& \prod_i \sqrt{\frac{2\pi}{\lambda_i}} \\ &=& \sqrt{\frac{(2\pi)^n}{\det A}}. \end{eqnarray}$$
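As a quick numerical sanity check (a sketch, assuming numpy/scipy are available and using an arbitrarily chosen positive definite $A$), one can compare a two-dimensional instance of the integral with the closed form:

```python
# Numerical check of Z_0 = sqrt((2*pi)^n / det A) for n = 2 and an
# arbitrary symmetric positive definite matrix A.
import numpy as np
from scipy.integrate import dblquad

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])  # symmetric, positive definite

# Integrand exp(-1/2 x^T A x); the Gaussian decays fast, so finite limits suffice.
integrand = lambda y, x: np.exp(-0.5 * (A[0, 0]*x*x + 2*A[0, 1]*x*y + A[1, 1]*y*y))

Z0_numeric, _ = dblquad(integrand, -10, 10, lambda x: -10, lambda x: 10)
Z0_formula = np.sqrt((2*np.pi)**2 / np.linalg.det(A))

print(Z0_numeric, Z0_formula)  # agree to high precision
```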

Add a source term $$Z_J = \int d^n x \exp\left(-\frac{1}{2} x^\mathrm{T} A x + J^\mathrm{T} x\right).$$ Complete the square to eliminate the cross term. Letting $x = y + A^{-1}J$, we find $$\begin{eqnarray} Z_J &=& \int d^n y \exp\left(-\frac{1}{2} {y}^\mathrm{T} A y + \frac{1}{2} J^\mathrm{T}A^{-1}J\right) \\ &=& \sqrt{\frac{(2\pi)^n}{\det A}} \exp\left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right). \end{eqnarray}$$ There is always a factor of $\sqrt{\frac{(2\pi)^n}{\det A}}$. For convenience, let's define $$\langle x_{k_1} \cdots x_{k_{2N}}\rangle = \frac{1}{Z_0} \int d^n x \ x_{k_1} \cdots x_{k_{2N}} \exp\left(-\frac{1}{2} x^\mathrm{T} A x\right).$$ (Notice that $\langle x_{k_1} \cdots x_{k_{2N+1}}\rangle = 0$ since the integral is odd.) Roughly speaking, these are free field scattering amplitudes. This integral can be found by taking derivatives of $Z_J$, $$\langle x_{k_1} \cdots x_{k_{2N}}\rangle = \left.\frac{1}{Z_0} \frac{\partial}{\partial J_{k_1}} \cdots \frac{\partial}{\partial J_{k_{2N}}} Z_J\right|_{J=0}.$$ (Notice that every factor of $\partial/\partial J_{k_i}$ brings down one factor of $x_{k_i}$ in the integral $Z_J$.) Using this formula it is a straightforward exercise to work out, for example, that $$\begin{eqnarray} \langle x_{k_1} x_{k_{2}}\rangle &=& \left.\frac{\partial}{\partial J_{k_1}} \frac{\partial}{\partial J_{k_{2}}} \exp\left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right)\right|_{J=0} \\ &=& \frac{\partial}{\partial J_{k_1}} \frac{\partial}{\partial J_{k_{2}}} \frac{1}{2} J^\mathrm{T}A^{-1}J \\ &=& \frac{1}{2} ( A^{-1}_{k_1 k_2} + A^{-1}_{k_2 k_1}) \\ &=& A^{-1}_{k_1 k_2}, \end{eqnarray}$$ which agrees with the formula for $N=1$. This is the "free field propagator." (In the last step we have used the fact that $A^{-1}$ is symmetric.) 
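Since $\exp(-\frac12 x^\mathrm{T}Ax)/Z_0$ is the density of a centered Gaussian with covariance $A^{-1}$, the propagator formula $\langle x_{k_1} x_{k_2}\rangle = A^{-1}_{k_1 k_2}$ can be spot-checked by sampling (a sketch with an arbitrary $2\times 2$ matrix; the tolerance reflects Monte Carlo error):

```python
# Monte Carlo check of <x_i x_j> = (A^{-1})_{ij}: samples of the Gaussian
# with covariance A^{-1} have exactly these second moments.
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
Ainv = np.linalg.inv(A)

rng = np.random.default_rng(0)
samples = rng.multivariate_normal(mean=np.zeros(2), cov=Ainv, size=200_000)

second_moments = samples.T @ samples / len(samples)  # empirical <x_i x_j>
print(second_moments)  # approximately equal to Ainv
```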
It is possible to see by inspection that the formula for general $N$ is $$\langle x_{k_1}\cdots x_{k_{2N}}\rangle = \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}(A^{-1})_{k_{\sigma(1)}k_{\sigma(2)}} \cdots (A^{-1})_{k_{\sigma(2N-1)}k_{\sigma(2N)}}.$$ In fact $$\begin{eqnarray} \langle x_{k_1}\cdots x_{k_{2N}}\rangle &=& \left.\frac{\partial}{\partial J_{k_1}} \cdots \frac{\partial}{\partial J_{k_{2N}}} \exp \left(\frac{1}{2} J^\mathrm{T}A^{-1}J \right) \right|_{J=0} \\ &=& \frac{\partial}{\partial J_{k_1}} \cdots \frac{\partial}{\partial J_{k_{2N}}} \frac{1}{N!} \left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right)^{N} \\ &=& \frac{1}{2^N N!} \frac{\partial}{\partial J_{k_1}} \cdots \frac{\partial}{\partial J_{k_{2N}}} \left(J^\mathrm{T}A^{-1}J\right)^{N}. \end{eqnarray}$$ Note that the derivative $\partial/\partial J_{k_1}$ will operate on all possible $J$s. Likewise for the other derivatives. Thus, we will get a sum over all $(2N)!$ possible permutations of the $k_i$. These permutations are denoted by $\sigma$ and they live in the symmetric group $S_{2N}$. Thus, we arrive at the result, known as Wick's theorem. (If this seems too vague, it is straightforward to find the result by induction on $N$. We have the formula for $N=1$ above.)

Let's unwind the scattering amplitude for $N=2$. We have $$\langle x_{k_1}x_{k_2}x_{k_3}x_{k_4}\rangle = \frac{1}{2^2 2!} \sum_{\sigma \in S_{4}}(A^{-1})_{k_{\sigma(1)}k_{\sigma(2)}} (A^{-1})_{k_{\sigma(3)}k_{\sigma(4)}}.$$ There are $4! = 24$ permutations in $S_4$ and thus $24$ terms in the sum. For example $(12)$ takes $1$ to $2$ and $2$ to $1$. Not all permutations give independent terms. In fact, the degeneracy is $2^N N!$, since $A^{-1}$ is symmetric and the order of the $A^{-1}$s doesn't matter. Thus, there will only be $$\frac{(2N)!}{2^N N!} = (2N-1)!!$$ independent terms. For $N=2$ there are three independent terms, $$\langle x_{k_1}x_{k_2}x_{k_3}x_{k_4}\rangle = A^{-1}_{k_1 k_2}A^{-1}_{k_3 k_4} + A^{-1}_{k_1 k_3}A^{-1}_{k_2 k_4} + A^{-1}_{k_1 k_4}A^{-1}_{k_2 k_3}.$$
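The $N=2$ counting above is easy to confirm by brute force (a sketch with an arbitrary symmetric $A$ and an arbitrary choice of indices): enumerating all $24$ permutations in $S_4$ and dividing by $2^2\,2!$ reproduces the three independent pairings.

```python
# Enumerate all permutations in S_4 to confirm that the symmetrized sum,
# divided by 2^2 * 2!, reduces to the three independent pairings.
import itertools
import numpy as np

A = np.array([[2.0, 0.5, 0.1],
              [0.5, 1.0, 0.2],
              [0.1, 0.2, 1.5]])
B = np.linalg.inv(A)  # B plays the role of A^{-1}

k = (0, 1, 2, 0)  # arbitrary indices k_1..k_4 (0-based)

wick_sum = sum(B[k[s[0]], k[s[1]]] * B[k[s[2]], k[s[3]]]
               for s in itertools.permutations(range(4))) / (2**2 * 2)

three_pairings = (B[k[0], k[1]] * B[k[2], k[3]]
                  + B[k[0], k[2]] * B[k[1], k[3]]
                  + B[k[0], k[3]] * B[k[1], k[2]])

print(wick_sum, three_pairings)  # identical up to rounding
```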

II. Central identity

Consider $$I_J = \frac{1}{Z_0} \int d^n x \exp\left(-\frac{1}{2}x^\mathrm{T}A x + J^\mathrm{T}x \right) f(x).$$ We are interested in $I_0$. The presence of the source allows us to take $f$ out of the integral if we replace its argument with $\partial/\partial J$, $$\begin{eqnarray} I_0 &=& \left.f(\partial_J) \frac{1}{Z_0} \int d^n x \exp\left(-\frac{1}{2}x^\mathrm{T}A x + J^\mathrm{T}x \right)\right|_{J=0} \\ &=& \left.f(\partial_J) \exp\left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right)\right|_{J=0}. \end{eqnarray}$$ This is a typical trick. In fact, it is equivalent to what Anthony Zee calls the "central identity of quantum field theory." Usually $f(x) = \exp[-V(x)]$, where $V(x)$ is the potential. There is a nice graphical interpretation of the formula in the form of Feynman diagrams. The process of calculating $$\left.e^{-V(\partial_J)} \exp\left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right)\right|_{J=0}$$ is equivalent to "tying together" the propagators (the $A^{-1}$) with vertices represented by the operator $-V(\partial_J)$. Of course, there is a lot more to it than that!

Now we wish to show $$I_0 = \left.\exp\left(\frac{1}{2}\partial_x^\mathrm{T} A^{-1}\partial_x\right) f(x)\right|_{x=0}.$$ If we consider a Taylor expansion for $f(\partial_J)$ and $f(x)$ we can see this is equivalent to $$\left.\partial_{J_{k_1}}\cdots \partial_{J_{k_{2N}}} \exp\left(\frac{1}{2} J^\mathrm{T}A^{-1}J\right)\right|_{J=0} = \left.\exp\left(\frac{1}{2}\partial_x^\mathrm{T} A^{-1}\partial_x\right) x_{k_1}\cdots x_{k_{2N}}\right|_{x=0}.$$ The left hand side is $\langle x_{k_1}\cdots x_{k_{2N}}\rangle$, by previous arguments. But $$\begin{eqnarray} \left.\exp\left(\frac{1}{2}\partial_x^\mathrm{T} A^{-1}\partial_x\right) x_{k_1}\cdots x_{k_{2N}}\right|_{x=0} &=& \frac{1}{2^N N!} \left(\partial_x^\mathrm{T} A^{-1}\partial_x\right)^N x_{k_1}\cdots x_{k_{2N}} \\ &=& \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}(A^{-1})_{k_{\sigma(1)}k_{\sigma(2)}} \cdots (A^{-1})_{k_{\sigma(2N-1)}k_{\sigma(2N)}} \\ &=& \langle x_{k_1}\cdots x_{k_{2N}}\rangle. \end{eqnarray}$$ For example, for $N=1$, $$\begin{eqnarray} \left.\exp\left(\frac{1}{2}\partial_x^\mathrm{T} A^{-1}\partial_x\right) x_{k_1} x_{k_{2}}\right|_{x=0} &=& \left(\frac{1}{2} \partial_x^\mathrm{T} A^{-1}\partial_x\right) x_{k_1} x_{k_2} \\ &=& \frac{1}{2} A^{-1}_{ij}(\delta_{i k_1}\delta_{j k_2} + \delta_{i k_2}\delta_{j k_1}) \\ &=& \frac{1}{2} (A^{-1}_{k_1 k_2} + A^{-1}_{k_2 k_1}) \\ &=& A^{-1}_{k_1 k_2}. \end{eqnarray}$$
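The $N=1$ computation above can also be checked symbolically (a sketch using sympy; the matrix $B$ stands for $A^{-1}$ and is symmetrized by hand, mirroring the symmetry of $A^{-1}$):

```python
# Symbolic check of the N=1 case: applying (1/2) sum_ij B_ij d_i d_j to
# x_{k1} x_{k2} and setting x = 0 returns B_{k1 k2}.
import sympy as sp

n = 2
x = sp.symbols('x0 x1')
B = sp.Matrix(sp.symarray('B', (n, n)))
B = (B + B.T) / 2  # symmetrize, since A^{-1} is symmetric

f = x[0] * x[1]  # the monomial x_{k1} x_{k2} with k1 = 0, k2 = 1
op_applied = sp.Rational(1, 2) * sum(B[i, j] * sp.diff(f, x[i], x[j])
                                     for i in range(n) for j in range(n))
result = op_applied.subs({xi: 0 for xi in x})
print(sp.simplify(result - B[0, 1]))  # 0
```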


Many texts on quantum field theory deal with such finite dimensional integrals on the way to treating the infinite dimensional case. See, for example,

A. Zee, Quantum Field Theory in a Nutshell.

J. Zinn-Justin, Quantum Field Theory and Critical Phenomena.


I will complement user26872's fantastic answer with an induction proof of $$ \langle x_{k_1}\cdots x_{k_{2N}}\rangle = \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}A^{-1}_{k_{\sigma(1)},k_{\sigma(2)}} \cdots A^{-1}_{k_{\sigma(2N-1)},k_{\sigma(2N)}}. $$

First, define the normalized moments $$ \langle x_{1}\cdots x_{2N}\rangle = \frac{1}{Z_0}\int d^n x \, x_1 \cdots x_{2N} \exp\left(-\frac12 x^T A x \right), $$ with $Z_0$ the Gaussian normalization from user26872's answer. As shown there, we can express the integral by adding a source term and completing the square, giving $$ \langle x_1 \cdots x_{2N} \rangle = \left. \frac{\partial}{\partial J_1} \cdots \frac{\partial}{\partial J_{2N}} \exp\left(\frac{1}{2} J^T A^{-1} J\right) \right|_{J=0}. $$ Let us use this differentiation formula to compute some base cases by hand: $$ \begin{align} \langle 1 \rangle &= 1 &(N=0) \\ \langle x_{1}x_{2} \rangle &= A^{-1}_{1,2} &(N=1) \\ \langle x_{1}x_{2}x_{3}x_{4}\rangle &= A^{-1}_{1,2}A^{-1}_{3,4} + A^{-1}_{1,3}A^{-1}_{2,4} + A^{-1}_{1,4}A^{-1}_{2,3} &(N=2)&. \end{align} $$ The pattern seems to be a sum over products of pairings. We can express it neatly as $$ \langle x_{1}\cdots x_{2N}\rangle = \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}A^{-1}_{\sigma(1),\sigma(2)} \cdots A^{-1}_{\sigma(2N-1),\sigma(2N)}, $$ where $\sum_{\sigma \in S_{2N}}$ is the sum over all permutations of $1, \dots, 2N$ and $\sigma(i)$ is the $i$-th element of the permutation $\sigma$ (for example, if $\sigma = 3142$, then $\sigma(1) = 3$). The prefactors $1/2^N$ and $1/N!$ account for the symmetry $A^{-1}_{i,j} = A^{-1}_{j,i}$ and the arbitrary order of $A^{-1}$-factors in the product, so the overall sum is stripped of any prefactor, as in the base cases.

Let us prove this with induction. Perform the two final differentiations to obtain $$ \langle x_1 \cdots x_{2N} \rangle = \left. \partial_{1} \cdots \partial_{2N-2} \left( A^{-1}_{2N-1,2N} + \sum_{i=1}^{2N} \sum_{j=1}^{2N} J_{i} J_{j} A^{-1}_{2N-1,i} A^{-1}_{2N,j} \right) \exp\left(\frac{1}{2} J^T A^{-1} J \right) \right|_{J=0}. $$

The first term is already in the right form for induction with $N-1$. To use induction on the second term, we should first move the $J$-s to the left of the $\partial$-s with the product rule (or the commutator $[\partial_i, J_j] = \partial_{i} J_{j} - J_j \partial_i = \delta_{ij}$, if you will):

$$ \partial_1 \cdots \partial_{2N-2} J_i J_j F(J) = \begin{cases} \partial_1 \cdots (2 J_i + J_i ^2 \partial_i) \cdots \partial_{2N-2} F(J) & (i=j) \\ \partial_1 \cdots (1 + J_i \partial_i) \cdots (1 + J_j \partial_j) \cdots \partial_{2N-2} F(J) & (i \neq j) \end{cases} $$ The only term that survives $J=0$ is the first term in the expansion of the $i \neq j$ case. We then get $$ \begin{align} \left \langle x_1 \cdots x_{2N} \right \rangle &= \left. A^{-1}_{2N-1,2N} \partial_1 \cdots \partial_{2N-2} \exp\left(\frac{1}{2} J^T A^{-1} J \right) \right|_{J=0} \\ &+ \sum_{i=1}^{2N-2} \sum_{\substack{j=1 \\ j \neq i}}^{2N-2} \left. A^{-1}_{2N-1,i} A^{-1}_{2N,j} \partial_1 \cdots \hat{\partial}_i \cdots \hat{\partial}_j \cdots \partial_{2N-2} \exp\left(\frac{1}{2} J^T A^{-1} J \right) \right|_{J=0}, \end{align} $$ where $\partial_1 \cdots \hat{\partial}_i \cdots \partial_{2N-2}$ means "the product $\partial_1 \cdots \partial_i \cdots \partial_{2N-2}$ without the factor $\partial_i$" (think of $\hat{\phantom{x}}$ as a "throw-out-operator"). We can now use induction with $N-1$ on the first term and $N-2$ on the second term, so

$$ \left \langle x_1 \cdots x_{2N} \right \rangle = A^{-1}_{2N-1,2N} \left \langle x_1 \cdots x_{2N-2} \right \rangle + \sum_{i=1}^{2N-2} \sum_{\substack{j=1 \\ j \neq i}}^{2N-2} A^{-1}_{2N-1,i} A^{-1}_{2N,j} \left \langle x_1 \cdots \hat{x}_i \cdots \hat{x}_j \cdots x_{2N-2} \right \rangle. $$

Notice how the first term pairs only the "new" indices with each other, while the second term pairs them with the "old" indices. Together, they pair all indices - new and old - in all possible ways. Looking back at our induction hypothesis for $\langle x_1 \cdots x_{2N} \rangle$, we realise that we can extend the two terms from sums over the symmetric groups $S_{2N-2}$ and $S_{2N-4}$ to one sum over the symmetric group $S_{2N}$.

  • The first term is unchanged under the $N$ placements of the factor $A^{-1}_{2N-1,2N}$ among the other $A^{-1}$-factors and under the $2$ index changes $2N-1 \leftrightarrow 2N$. This produces a factor $1/2N$ that combines with $1/2^{N-1} (N-1)!$ to an overall factor $1/2^N N!$.
  • The second term is unchanged under the $N(N-1)$ placements of the factors $A^{-1}_{2N-1,i}$ among the other $A^{-1}$-factors and $A^{-1}_{2N,j}$ and under the $2^2$ index changes $2N-1 \leftrightarrow i$ and $2N \leftrightarrow j$. This produces a factor $1/2^2 N(N-1)$ that combines with $1/2^{N-2} (N-2)!$ to an overall factor $1/2^N N!$.

Finally, then, we can conclude that $$ \begin{align} \langle x_{1}\cdots x_{2N}\rangle &= \frac{1}{2^N N!} \left( \underbrace{\sum_{\sigma \in S_{2N}}A^{-1}_{\sigma(1),\sigma(2)} \cdots A^{-1}_{\sigma(2N-1),\sigma(2N)}}_\text{constrained to new-new pairs} + \underbrace{\sum_{\sigma \in S_{2N}}A^{-1}_{\sigma(1),\sigma(2)} \cdots A^{-1}_{\sigma(2N-1),\sigma(2N)}}_\text{constrained to not-new-new pairs} \right) \\ &= \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}A^{-1}_{\sigma(1),\sigma(2)} \cdots A^{-1}_{\sigma(2N-1),\sigma(2N)}. \end{align} $$

Here, we used $1, \dots, 2N$ to concisely label the $x_i$ variables. There is nothing in the proof that changes if we index the indices instead ($x_i \rightarrow x_{k_i}$), so we also have the more general result $$ \langle x_{k_1}\cdots x_{k_{2N}}\rangle = \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}A^{-1}_{k_{\sigma(1)},k_{\sigma(2)}} \cdots A^{-1}_{k_{\sigma(2N-1)},k_{\sigma(2N)}}. $$
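The recursion can also be checked numerically (a sketch with an arbitrary symmetric $A$). The code below uses the standard single-contraction form of the recursion - pair the last variable with each remaining one - which the two-term recursion above reduces to after one more step, and compares it with the brute-force symmetrized sum over $S_{2N}$:

```python
# Check that the Wick recursion matches the full permutation sum.
import itertools
import math
import numpy as np

A = np.array([[2.0, 0.5],
              [0.5, 1.0]])
B = np.linalg.inv(A)  # the propagator A^{-1}

def moment(indices):
    """<x_{i_1}...x_{i_m}> via single Wick contractions of the last index."""
    if not indices:
        return 1.0
    if len(indices) % 2 == 1:
        return 0.0
    *rest, last = indices
    return sum(B[rest[p], last] * moment(tuple(rest[:p] + rest[p+1:]))
               for p in range(len(rest)))

def moment_bruteforce(indices):
    """Full symmetrized sum over S_{2N}, divided by 2^N N!."""
    m = len(indices)
    total = sum(math.prod(B[indices[p[2*a]], indices[p[2*a+1]]]
                          for a in range(m // 2))
                for p in itertools.permutations(range(m)))
    return total / (2**(m // 2) * math.factorial(m // 2))

idx = (0, 1, 0, 1)  # i.e. <x_0 x_1 x_0 x_1>
print(moment(idx), moment_bruteforce(idx))
```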


This is my first time answering on stack exchange.

Proof of : $\int f(\vec x) \, \exp\left( - \frac 1 2 \sum_{i,j=1}^{n}A_{ij} x_i x_j \right) d^nx = \sqrt{(2\pi)^n\over \det A} \, \left. \exp\left({1\over 2}\sum_{i,j=1}^{n}(A^{-1})_{ij}{\partial \over \partial x_i}{\partial \over \partial x_j}\right)f(\vec{x})\right|_{\vec{x}=0}$

I found a silly simple proof that actually gives a generalization of the above result.

To make it all the more amusing I will give a Gauss style proof where, as Niels Abel cleverly said, like the fox, I will efface the tracks in the sand with my tail.

Consider a probability density $\mu$ that takes as input $\textbf{x}=(x_1,x_2,...,x_n)$ and define:

$$F(\textbf{y})=\int{\mu(\textbf{x})f(\textbf{x}+\textbf{y})d^n\textbf{x}}=\langle f(\textbf{x}+\textbf{y})\rangle,$$

where $\langle ... \rangle$ represents the average with respect to $\mu$. In the original problem, $\mu$ is Gaussian.

Next, we use the fact that the generator of translations $\textbf{x}\rightarrow\textbf{x}+\textbf{a}$ is given by

$$ P_\textbf{a}=\textbf{a}\cdot\nabla $$

where I used $P$ like in quantum mechanics for that extra mystique. By the definition of a generator, this implies that

$$f(\textbf{x}+\textbf{a})=e^{P_\textbf{a}}f(\textbf{x})=e^{\textbf{a}\cdot\nabla}f(\textbf{x}),$$

which is just a fancy way of writing an all-order Taylor expansion.

Now the magical part,

$$\int{\mu(\textbf{x})f(\textbf{x}+\textbf{y})d^n\textbf{x}}=\int{\mu(\textbf{x})e^{\textbf{x}\cdot\nabla_y}f(\textbf{y})d^n\textbf{x}}=\langle e^{\textbf{x}\cdot\nabla_y}\rangle f(\textbf{y})=K(\nabla_\textbf{y})f(\textbf{y}),$$

where $K$ is by definition the moment generating function of $\mu$.

Using the moment generating function of the centered Gaussian with covariance $A^{-1}$, namely $K(\textbf{t})=\exp\left(\frac{1}{2}\textbf{t}^{T}A^{-1}\textbf{t}\right)$, so that $K(\nabla_\textbf{y})=\exp\left(\frac{1}{2}\nabla_\textbf{y}^{T}A^{-1}\nabla_\textbf{y}\right)$, and setting $\textbf{y}=0$ gives the formula above.
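In one dimension the translation/moment-generating-function trick is easy to verify symbolically (a sketch with sympy; here $s_2$ plays the role of $A^{-1}$, so $K(t)=e^{s_2 t^2/2}$, and the operator series truncates on polynomials):

```python
# Check that <f(x+y)> for x ~ N(0, s2) equals K(d/dy) f(y) with
# K(t) = exp(s2 * t^2 / 2), for the polynomial f(y) = y^3.
import sympy as sp

y, s2 = sp.symbols('y s2', positive=True)
f = y**3  # any polynomial works; the operator series then truncates

# Left side: <(x+y)^3> for x ~ N(0, s2): odd moments vanish and <x^2> = s2.
lhs = y**3 + 3*y*s2

# Right side: the (truncating) series for exp(s2/2 * d^2/dy^2) f(y).
rhs = sum((s2/2)**m / sp.factorial(m) * sp.diff(f, y, 2*m) for m in range(3))

print(sp.simplify(lhs - rhs))  # 0
```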

Proof of : $$ \frac{1}{2^N N!} \left(\partial_x^\mathrm{T} A^{-1}\partial_x\right)^N x_{k_1}\cdots x_{k_{2N}} = \frac{1}{2^N N!} \sum_{\sigma \in S_{2N}}(A^{-1})_{k_{\sigma(1)}k_{\sigma(2)}} \cdots (A^{-1})_{k_{\sigma(2N-1)}k_{\sigma(2N)}} $$

The following is independent from the proof above and simply provides extra steps in the derivation of @user26872.

We start by writing explicitly the left hand side as a product of sums:

$$\left(\partial_x^\mathrm{T} A^{-1}\partial_x\right)^N= \left(\sum_{i,j}A^{-1}_{i,j}\partial_{x_i}\partial_{x_j}\right)^N$$

Next, we consider the following notation:

  • The set of integers from 1 to N: $\{1,...,N\}=[N]$
  • The set of all functions from set $A$ to set $B$: $A\rightarrow B$

The formula for a product of sums may then be written as (explanation of formula below):

$$\left(b_{1,1}+b_{1,2}+\ldots + b_{1,M}\right)\ldots\left(b_{N,1}+\ldots + b_{N,M}\right)=\prod^N_{k=1} \sum^M_{r=1} b_{k,r}=\sum_{f \, \in \, [N]\rightarrow[M]}\prod^N_k b_{k,f(k)}$$

In words: for each factor in the product (so for each index $k$), pick one term from that sum (one value $\hat r$, which we call $\hat r=f(k)$), and multiply all of the collected terms (so $\prod_k b_{k,f(k)}$). Finally, to get all of the terms of the expansion, sum over all such choices (so over all functions $f$).
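This expansion is easy to check by brute force for small $N$ and $M$ (a Python sketch; `itertools.product` enumerates all functions $f:[N]\rightarrow[M]$ as tuples $(f(1),\dots,f(N))$):

```python
# Check that prod_k sum_r b[k][r] equals the sum over all functions
# f: [N] -> [M] of prod_k b[k][f(k)], for random coefficients.
import itertools
import math
import numpy as np

rng = np.random.default_rng(1)
N, M = 3, 4
b = rng.standard_normal((N, M))

lhs = math.prod(b[k].sum() for k in range(N))
rhs = sum(math.prod(b[k, f[k]] for k in range(N))
          for f in itertools.product(range(M), repeat=N))
print(lhs, rhs)
```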

In the case of a double sum we have:

$$\prod^N_{k=1} \sum^{M_1}_{r=1}\sum^{M_2}_{s=1} b_{k,r,s}=\sum_{\substack{f_1 \, \in \, [N]\rightarrow [M_1] \\ f_2 \, \in \, [N]\rightarrow [M_2]}}\prod^N_k b_{k,f_1(k),f_2(k)}$$

if $[M_1]=[M_2]=[M]$, then we may consider the change of notation $f_1(k)=f(1,k)$ and $f_2(k)=f(2,k)$ which leads to :

$$\prod^N_{k=1} \sum^{M}_{r=1}\sum^{M}_{s=1} b_{k,r,s}=\sum_{f \, \in \, [2]\times[N]\rightarrow [M]}\prod^N_k b_{k,f(1,\,k),f(2,\,k)}$$

To check that this last statement is true we show that both sums are over the same set.

Writing $\mathrm{Card}$ for the cardinality of a set, recall that the cardinality of the set of all functions from set $A$ to set $B$ is $\textrm{Card}\left(B\right)^{\textrm{Card}\left(A\right)}$, and so

$$\mathrm{Card}\left[\left([N]\rightarrow[M]\right)\times\left([N]\rightarrow[M]\right)\right]=M^N M^N=\left(M^2\right)^N=M^{2 N}$$

Furthermore, $\mathrm{Card}\left([2]\times[N]\right)=2 N$, such that

$$\mathrm{Card}\left([2]\times[N]\rightarrow[M]\right)=M^{\mathrm{Card}\left([2]\times[N]\right)}=M^{2N}$$

And so both sets have equal cardinality and both sums are equal.

Notice that the cardinality $M^{2N}$ leads to a third formulation of the sum as this is also the cardinality of the set $[2N]\rightarrow[M]$. We thus stack both copies of $[N]$, that is $\left(1,[N]\right)$ and $\left(2,[N]\right)$ and consider all functions from this new stacked set to the set $[M]$. We then arrive at the crucial formula of this post:

$$\prod^N_{k=1} \sum^M_{r=1}\sum^M_{s=1} b_{k,r,s}=\sum_{f \, \in \, [2N]\rightarrow [M]}\prod^N_{k=1} b_{k,f(k),f(k+N)}$$
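The crucial formula can likewise be checked by brute force (a sketch; each $f:[2N]\rightarrow[M]$ is represented as a tuple of length $2N$, with $f(k)$ the first index and $f(k+N)$ the second):

```python
# Check that prod_k sum_{r,s} b[k][r][s] equals the sum over f: [2N] -> [M]
# of prod_k b[k][f(k)][f(k+N)], for random coefficients.
import itertools
import math
import numpy as np

rng = np.random.default_rng(2)
N, M = 2, 3
b = rng.standard_normal((N, M, M))

lhs = math.prod(b[k].sum() for k in range(N))
rhs = sum(math.prod(b[k, f[k], f[k + N]] for k in range(N))
          for f in itertools.product(range(M), repeat=2 * N))
print(lhs, rhs)
```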

For the original problem we have $b_{k,r,s}=A^{-1}_{r,s}\partial_{x_r}\partial_{x_s}$, with $M=n$ the dimension of the integral, which leads to:

$$\left(\sum_{i,j}A^{-1}_{i,j}\partial_{x_i}\partial_{x_j}\right)^N=\sum_{f \, \in \, [2N]\rightarrow [M]}\prod^N_{k=1} A^{-1}_{f(k),f(k+N)}\partial_{x_{f(k)}}\partial_{x_{f(k+N)}}$$

Reordering the partial derivatives and applying this operator on the product $ x_{k_1} \cdots x_{k_{2N}} $ then leads to

$$\left(\sum_{i,j}A^{-1}_{i,j}\partial_{x_i}\partial_{x_j}\right)^N\, x_{k_1} \cdots x_{k_{2N}}=\sum_{f \, \in \, [2N]\rightarrow [M]}\prod^N_{k=1} A^{-1}_{f(k),f(k+N)}\prod^{2N}_{k=1}\partial_{x_{f(k)}}\; x_{k_1} \cdots x_{k_{2N}}$$

As $\partial_{x_i} x_j=\delta_{i,j}$, the only nonzero terms of the sum are those for which

$$\left\{ f(1),...,f(2N) \right\}=\left\{ k_1,..., k_{2N} \right\}$$

so that $f$ is a bijection from the set $[2N]$ to $\left\{k_1,..., k_{2N} \right\}$. Hence, for every $f$ there exists a unique $\sigma\in S_{2N}$ such that $f(i)=k_{\sigma(i)}$. The sum over $f$ may then be replaced by a sum over $\sigma$, which leads to

$$\left(\partial_x^\mathrm{T} A^{-1}\partial_x\right)^N x_{k_1}\cdots x_{k_{2N}} = \sum_{\sigma \in S_{2N}}(A^{-1})_{k_{\sigma(1)}k_{\sigma(N+1)}} \cdots (A^{-1})_{k_{\sigma(N)}k_{\sigma(2N)}}$$

Finally, for any $\alpha\in S_{2N}$, $\sum_{\sigma}=\sum_{\sigma\circ\alpha}$, where $\circ$ represents composition, as this is simply a rearrangement of terms. Taking $\alpha$ such that

$$\alpha(2k-1)=k, \qquad \alpha(2k)=N+k, \qquad k=1,\ldots,N,$$

so that $(A^{-1})_{k_{\sigma(\alpha(2k-1))}k_{\sigma(\alpha(2k))}}=(A^{-1})_{k_{\sigma(k)}k_{\sigma(N+k)}}$, leads to the desired result.