In a scientific paper, I've seen the following

$$\frac{\delta K^{-1}}{\delta p} = -K^{-1}\frac{\delta K}{\delta p}K^{-1}$$

where $K$ is an $n \times n$ matrix that depends on $p$. In my own calculation I would have written

$$\frac{\delta K^{-1}}{\delta p} = -K^{-2}\frac{\delta K}{\delta p}=-K^{-T}K^{-1}\frac{\delta K}{\delta p}$$

Is my calculation wrong?

Note: I think $K$ is symmetric.

Rodrigo de Azevedo
Sara

4 Answers

The main difficulty in matrix calculus is that matrices do not commute in general, yet one is tempted to reuse formulae from scalar calculus such as $(x(t)^{-1})'=-x(t)^{-2}x'(t)$ with $x$ replaced by the matrix $K$. One has to be more careful here and pay attention to the order of the factors. The easiest way to get the derivative of the inverse is to differentiate the identity $I=KK^{-1}$ while respecting the order: $$ \underbrace{(I)'}_{=0}=(KK^{-1})'=K'K^{-1}+K(K^{-1})'. $$ Solving this equation for $(K^{-1})'$ by multiplying from the left by $K^{-1}$ (again paying attention to the order!) gives $$ K(K^{-1})'=-K'K^{-1}\qquad\Rightarrow\qquad (K^{-1})'=-K^{-1}K'K^{-1}. $$
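The identity above can also be checked numerically. The following is a minimal sketch in plain Python, assuming a hypothetical symmetric example $K(p)=\begin{pmatrix}1&p\\p&2\end{pmatrix}$ that is not taken from the question; the helper names (`matmul`, `inv2`, `central_difference`, and so on) are illustrative only. It compares a central finite difference of $K^{-1}$ with $-K^{-1}K'K^{-1}$ and with the scalar-style guess $-K^{-2}K'$.

```python
# Finite-difference check of (K^{-1})' = -K^{-1} K' K^{-1} for a 2x2 example.
# Assumption: K(p) = [[1, p], [p, 2]] is a made-up symmetric test matrix.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def neg(m):
    """Entrywise negation."""
    return [[-x for x in row] for row in m]

def max_abs_diff(a, b):
    """Largest entrywise absolute difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def K(p):
    return [[1.0, p], [p, 2.0]]

def dK(p):
    # Exact derivative of K with respect to p (constant for this example).
    return [[0.0, 1.0], [1.0, 0.0]]

def central_difference(p, h=1e-6):
    """Approximate d(K^{-1})/dp by a symmetric difference quotient."""
    hi, lo = inv2(K(p + h)), inv2(K(p - h))
    return [[(hi[i][j] - lo[i][j]) / (2 * h) for j in range(2)]
            for i in range(2)]

p = 0.5
Kinv = inv2(K(p))
numeric = central_difference(p)
correct = neg(matmul(matmul(Kinv, dK(p)), Kinv))  # -K^{-1} K' K^{-1}
naive = neg(matmul(matmul(Kinv, Kinv), dK(p)))    # -K^{-2} K'

err_correct = max_abs_diff(numeric, correct)
err_naive = max_abs_diff(numeric, naive)
print("error of -K^-1 K' K^-1:", err_correct)
print("error of -K^-2 K':     ", err_naive)
```

For this example the first error should be at the level of finite-difference noise, while the second stays of order one, because $K$ and $K'$ do not commute even though both are symmetric.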

A.Γ.

Yes, your calculation is wrong. Note that $K$ need not commute with $\frac{\partial K}{\partial p}$, so the chain rule must be applied with care. The derivative of $\def\inv{\mathrm{inv}}\inv \colon \def\G{\mathord{\rm GL}}\G_n \to \G_n$ is not given by $\inv'(A)B = -A^{-2}B$, but by $\inv'(A)B = -A^{-1}BA^{-1}$. To see this, note that for small enough $B$ we have \begin{align*} \inv(A + B) &= (A + B)^{-1}\\ &= (\def\I{\mathord{\rm Id}}\I + A^{-1}B)^{-1}A^{-1}\\ &= \sum_{k=0}^{\infty} (-1)^k (A^{-1}B)^kA^{-1}\\ &= A^{-1} - A^{-1}BA^{-1} + o(\|B\|). \end{align*} Hence $\inv'(A)B= -A^{-1}BA^{-1}$, and therefore, by the chain rule, $$ \partial_p (\inv \circ K) = \inv'(K)\bigl(\partial_p K\bigr) = -K^{-1}(\partial_p K) K^{-1}. $$
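The first-order expansion can be illustrated numerically. The sketch below, in plain Python, assumes hypothetical matrices `A` and `B0` chosen only for illustration; it measures the remainder $(A+tB_0)^{-1} - A^{-1} + tA^{-1}B_0A^{-1}$ for two values of $t$. If the linear term is right, the remainder should shrink roughly like $t^2$, so dividing $t$ by 10 should divide the remainder by about 100.

```python
# Check that (A + B)^{-1} = A^{-1} - A^{-1} B A^{-1} + (higher-order terms)
# by looking at how the remainder scales as B = t * B0 shrinks.
# Assumption: A and B0 below are arbitrary illustrative 2x2 matrices.

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def combine(a, b, alpha=1.0):
    """Entrywise a + alpha * b."""
    return [[a[i][j] + alpha * b[i][j] for j in range(2)] for i in range(2)]

A = [[2.0, 1.0], [0.0, 1.0]]
B0 = [[0.0, 1.0], [1.0, 1.0]]

def remainder_norm(t):
    """Max-entry norm of (A + t*B0)^{-1} minus its first-order approximation."""
    Ainv = inv2(A)
    exact = inv2(combine(A, B0, t))
    linear_term = matmul(matmul(Ainv, B0), Ainv)   # A^{-1} B0 A^{-1}
    first_order = combine(Ainv, linear_term, -t)   # A^{-1} - t A^{-1} B0 A^{-1}
    diff = combine(exact, first_order, -1.0)
    return max(abs(x) for row in diff for x in row)

r_coarse = remainder_norm(1e-2)
r_fine = remainder_norm(1e-3)
ratio = r_coarse / r_fine
print("remainder at t=1e-2:", r_coarse)
print("remainder at t=1e-3:", r_fine)
print("ratio:", ratio)
```

A ratio near 100 is consistent with a remainder of order $\|B\|^2$, which is in particular $o(\|B\|)$.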

martini
  • How does the second line follow from B being small enough? – nbubis Apr 30 '18 at 13:19
  • @nbubis The Neumann series is $(I - A)^{-1} = \sum_{k = 0}^{\infty} A^k$ for $\| A \| < 1$, which is analogous to the geometric series. For small enough $B$ we have $\| A^{-1} B \| < 1$ and thus $(I + A^{-1} B)^{-1} = \sum_{k = 0}^{\infty} (-1)^k (A^{-1} B)^k$. – Ramanujan May 28 '20 at 22:34

Actually, we can compute the derivative of the inverse matrix directly from the definition of the derivative. Writing $\Delta K = K(p+\Delta p) - K(p)$, we have \begin{align} \frac{dK^{-1}}{dp} & =\lim_{\Delta p \to 0} \frac{(K+\Delta K)^{-1} - K^{-1}}{\Delta p} \\ & = \lim_{\Delta p \to 0} \frac{(K+\Delta K)^{-1}KK^{-1} - (K+\Delta K)^{-1}(K+\Delta K)K^{-1}}{\Delta p} \\ & = \lim_{\Delta p \to 0} \frac{(K+\Delta K)^{-1}(-\Delta K) K^{-1}}{\Delta p} \\ & = - K^{-1} \lim_{\Delta p \to 0} \frac{\Delta K}{\Delta p} K^{-1} \\ & = - K^{-1} (\partial_{p} K) K^{-1}, \end{align} where the fourth line uses $(K+\Delta K)^{-1} \to K^{-1}$ as $\Delta p \to 0$, which holds because matrix inversion is continuous.

cyrie wang

Another related method is to use differentials. We have $$ d\mathbf{K}^{-1}= -\mathbf{K}^{-1} (d\mathbf{K}) \mathbf{K}^{-1} $$ and $d\mathbf{K}= \frac{\partial \mathbf{K}}{\partial p} dp$. Thus $$ d\mathbf{K}^{-1}= -\left[\mathbf{K}^{-1} \frac{\partial \mathbf{K}}{\partial p} \mathbf{K}^{-1} \right] dp, $$ from which it follows that $$ \frac{\partial \mathbf{K}^{-1}}{\partial p} = -\mathbf{K}^{-1} \frac{\partial \mathbf{K}}{\partial p} \mathbf{K}^{-1}. $$
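As a concrete check of this formula, here is a small worked example. The matrix is a hypothetical choice, not one from the question; it is symmetric, as the question assumes, yet the two candidate formulas still disagree. Take, for $p^2 \neq 2$, $$ \mathbf{K}(p)=\begin{pmatrix}1&p\\p&2\end{pmatrix},\qquad \frac{\partial \mathbf{K}}{\partial p}=\begin{pmatrix}0&1\\1&0\end{pmatrix},\qquad \mathbf{K}^{-1}=\frac{1}{2-p^2}\begin{pmatrix}2&-p\\-p&1\end{pmatrix}. $$ Differentiating $\mathbf{K}^{-1}$ entry by entry gives $$ \frac{\partial \mathbf{K}^{-1}}{\partial p}=\frac{1}{(2-p^2)^2}\begin{pmatrix}4p&-(2+p^2)\\-(2+p^2)&2p\end{pmatrix}, $$ and multiplying out $-\mathbf{K}^{-1}\frac{\partial \mathbf{K}}{\partial p}\mathbf{K}^{-1}$ gives exactly the same matrix. In contrast, $$ -\mathbf{K}^{-2}\frac{\partial \mathbf{K}}{\partial p}=\frac{1}{(2-p^2)^2}\begin{pmatrix}3p&-(4+p^2)\\-(1+p^2)&3p\end{pmatrix}, $$ which is not even symmetric, although the derivative of a symmetric matrix must be. Symmetry of $\mathbf{K}$ does not help here because $\mathbf{K}$ and $\frac{\partial \mathbf{K}}{\partial p}$ do not commute: $$ \mathbf{K}\frac{\partial \mathbf{K}}{\partial p}=\begin{pmatrix}p&1\\2&p\end{pmatrix}\neq\begin{pmatrix}p&2\\1&p\end{pmatrix}=\frac{\partial \mathbf{K}}{\partial p}\mathbf{K}. $$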

Steph