31

How do I calculate the gradient with respect to $X$ of $$ \log \mathrm{det}\, X^{-1}, $$ where $X$ is a positive definite matrix and $\mathrm{det}$ denotes the determinant of a matrix?

How to calculate this? Or what's the result? Thanks!

Rodrigo de Azevedo
pluskid
  • Note that $\log\det\mathbf X^{-1}=\log\frac1{\det\mathbf X}=-\log\det\mathbf X$... – J. M. ain't a mathematician May 12 '11 at 14:48
  • And note that $\log \det X =\text{tr} \log X$... – Fabian May 12 '11 at 14:50
  • Somehow I wonder if what you actually need is the Gâteaux or the Fréchet derivative... where did you encounter this, and what are you *actually* doing? – J. M. ain't a mathematician May 12 '11 at 15:00
  • I encountered this while deriving a lower bound for D-optimal experimental design using duality theory (an exercise in _Convex Optimization_). I want to find the optimum of a function that involves $\log\mathrm{det}\,(X^{-1})$. – pluskid May 12 '11 at 15:25
  • A closely related question and answer, worth a cross-reference: [How to calculate the derivative of log det matrix?](https://math.stackexchange.com/questions/1151569/how-to-calculate-the-derivative-of-log-det-matrix/1151578) but the question is framed in the context of Matrix Calculus – Sohail Si Oct 30 '17 at 15:14
  • @Fabian why $\log \det X=tr \log X$? $\log \det X=\log \lambda_1+\cdots+\log \lambda_n$, while $tr \log X=\log X_{11}+\cdots+\log X_{nn}$, where $\lambda$ is an eigenvalue of $X$. – Lee Dec 18 '18 at 09:43
  • @J.M.isn'tamathematician relevant? [Prove $\frac{\partial \rm{ln}|X|}{\partial X} = 2X^{-1} - \rm{diag}(X^{-1})$.](https://math.stackexchange.com/questions/1493137). Here I say 'We first note that for the case where the elements of X are independent, a constructive proof involving cofactor expansion and adjoint matrices can be made to show that $\frac{\partial ln|X|}{\partial X} = X^{-T}$ (Harville). This is not always equal to $2X^{-1}-diag(X^{-1})$. The fact alone that X is positive definite is sufficient to conclude that X is symmetric and thus its elements are not independent.' – BCLC Apr 16 '21 at 10:03

4 Answers

42

I assume that you are asking for the derivative with respect to the elements of the matrix. In this case, first notice that

$$\log \det X^{-1} = \log (\det X)^{-1} = -\log \det X$$

and thus

$$\frac{\partial}{\partial X_{ij}} \log \det X^{-1} = -\frac{\partial}{\partial X_{ij}} \log \det X = - \frac{1}{\det X} \frac{\partial \det X}{\partial X_{ij}} = - \frac{1}{\det X} \mathrm{adj}(X)_{ji} = - (X^{-1})_{ji}$$

since $\mathrm{adj}(X) = \det(X) X^{-1}$ for invertible matrices (where $\mathrm{adj}(X)$ is the adjugate of $X$, see http://en.wikipedia.org/wiki/Adjugate).
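
As a rough numerical sanity check of the entrywise formula above, here is a minimal sketch in plain Python. The test matrix `X`, the step size `h`, and the helper names `det2`, `inv2`, `f`, and `numeric_grad` are illustrative assumptions, not part of the original answer. The matrix is deliberately non-symmetric so that the transposed index in $-(X^{-1})_{ji}$ is visible, and all four entries are treated as independent variables.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate: adj(X) / det(X)."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def f(m):
    """f(X) = log det X^{-1} = -log det X."""
    return -math.log(det2(m))

def numeric_grad(m, h=1e-6):
    """Central finite differences, treating all four entries as independent."""
    g = [[0.0, 0.0], [0.0, 0.0]]
    for i in range(2):
        for j in range(2):
            up = [row[:] for row in m]
            dn = [row[:] for row in m]
            up[i][j] += h
            dn[i][j] -= h
            g[i][j] = (f(up) - f(dn)) / (2 * h)
    return g

# Hypothetical test matrix (an assumption): non-symmetric, det X = 5 > 0.
X = [[2.0, 1.0], [-1.0, 2.0]]
Xinv = inv2(X)

# Analytic gradient from the formula above: entry (i, j) is -(X^{-1})_{ji}.
analytic = [[-Xinv[j][i] for j in range(2)] for i in range(2)]
numeric = numeric_grad(X)

for i in range(2):
    for j in range(2):
        print(i, j, analytic[i][j], numeric[i][j])
```

The two printed columns should agree up to finite-difference error; swapping the indices in `analytic` would flip the sign of the off-diagonal entries for this particular matrix.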

Matt W-D
Chris Taylor
  • Thank you very much! This solved my problem! – pluskid May 12 '11 at 15:26
  • The $\frac{\partial \det X}{\partial X_{ij}} = \mathrm{adj}(X)_{ji}$ was very non-obvious to me, but can be worked out using the [Jacobi formula](https://en.wikipedia.org/wiki/Jacobi's_formula). – ntc2 Jan 05 '17 at 03:58
11

Alternatively, see Section A.4.1 of Stephen Boyd and Lieven Vandenberghe, _Convex Optimization_, for an alternative solution in which they compute the gradient without using the adjugate.

darij grinberg
yannis
  • The formula in that book turned out to be wrong (for symmetric matrices); see [this question](https://math.stackexchange.com/questions/3667029/what-is-the-derivative-of-log-det-x-when-x-is-symmetric) and [this question](https://math.stackexchange.com/questions/3689627/taylor-expansion-of-a-function-of-a-symmetric-matrix) – evangelos May 26 '20 at 17:12
10

The simplest approach is probably to observe that $$\begin{aligned} -\log\det (X+tH) &= -\log\det X -\log\det(I+tX^{-1}H) \\ &= -\log\det X - t\, \textrm{Tr}(X^{-1}H) + o(t), \end{aligned}$$

where we used the "obvious" fact that $\det(I+A) = 1+\textrm{Tr}(A)+o(|A|)$ (all the other terms are of degree at least two in the coefficients of $A$).
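
To see this fact in the smallest nontrivial case (a worked illustration, with the $2\times 2$ size chosen only for readability), take $A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}$. Then

$$\det(I+A) = (1+a_{11})(1+a_{22}) - a_{12}a_{21} = 1 + \underbrace{(a_{11}+a_{22})}_{\textrm{Tr}(A)} + \underbrace{(a_{11}a_{22} - a_{12}a_{21})}_{\det A},$$

and $\det A$ is quadratic in the entries of $A$, so it is $o(|A|)$.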

Notice that $\textrm{Tr}(X^{-1}H)=(X^{-T},H)$ in the Frobenius scalar product, hence $\nabla [-\log\det(X)] = -X^{-T}$ in this scalar product. (This gives another proof that $\nabla\det (X) = \operatorname{cof}(X)$.)

Of course if $X$ is symmetric positive definite then $-X^{-1}$ is also a valid expression. Moreover, one has in this case, for $X,Y$ positive definite, $(-X^{-1}+Y^{-1},X-Y)\ge 0$.
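
The first-order expansion can be probed numerically. The following is a minimal sketch in plain Python, where the positive definite matrix `X`, the symmetric direction `H`, the step sizes, and all helper names are illustrative assumptions. It checks that the remainder $-\log\det(X+tH) + \log\det X + t\,\textrm{Tr}(X^{-1}H)$, divided by $t$, shrinks as $t \to 0$, which is what $o(t)$ means.

```python
import math

def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    """Inverse of a 2x2 matrix via the adjugate."""
    d = det2(m)
    return [[m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d, m[0][0] / d]]

def trace_of_product(a, b):
    """Tr(a b) for 2x2 matrices, without forming the full product."""
    return sum(a[i][k] * b[k][i] for i in range(2) for k in range(2))

def g(X, H, t):
    """g(t) = -log det(X + t H)."""
    shifted = [[X[i][j] + t * H[i][j] for j in range(2)] for i in range(2)]
    return -math.log(det2(shifted))

# Hypothetical data (assumptions): X is symmetric positive definite,
# H is an arbitrary symmetric direction.
X = [[2.0, 0.5], [0.5, 1.0]]
H = [[1.0, -2.0], [-2.0, 3.0]]

# Predicted directional derivative at t = 0: -Tr(X^{-1} H).
slope = -trace_of_product(inv2(X), H)

# remainder / t should tend to 0 as t -> 0 if the expansion is right.
ratios = []
for t in (1e-2, 1e-3, 1e-4):
    remainder = g(X, H, t) - (g(X, H, 0.0) + t * slope)
    ratios.append(remainder / t)
    print(t, remainder / t)
```

The printed ratios should decrease roughly in proportion to $t$, consistent with a second-order remainder.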

user519964
  • The last sentence is wrong because if $X$ is symmetric the derivative of $\log \det X$ is $2 X^{-1} - \text{diag}(\text{diag}(X^{-1}))$ and not just $X^{-1}$. See [this question](https://math.stackexchange.com/questions/3667029/what-is-the-derivative-of-log-det-x-when-x-is-symmetric) – evangelos May 26 '20 at 17:16
  • @evangelos, you are wrong; the above answer is correct and is the best one. user519964 shows that $f(X+H)=f(X)-tr(X^{-1}H)+o(H)$; he only forgot to say that $H$ is in the tangent space of $S^{++}$, that is it is symmetric. Your formula (which is also correct) comes from (I believe) the matrix-cookbook which is a catastrophic book (for the students); obviously there is a factor $2$ because, when $i\not= j$, there are $2$ entries that are equal to $x_{i,j}$ but, practically, this form is difficult to use. –  May 26 '20 at 21:59
  • @loupblanc Firstly I didn't see the formula in the matrix cookbook. If you click the link in the question that I refer to you'll see where the formula is from, and if you look at the accepted answer, you'll see why $-X^{-1}$ can't be correct (just look at the $2x2$ case there) . The answers [here](https://math.stackexchange.com/questions/3689627/taylor-expansion-of-a-function-of-a-symmetric-matrix) further explain why first order approximation with trace as inner product can't guarantee a correct derivative when the matrix has the constraint of being symmetric – evangelos May 26 '20 at 22:36
  • @evangelos , you don't understand. The derivative is the linear function $H\mapsto -tr(X^{-1}H)$; you calculate a partial derivative, that here is a very bad idea. I want to help you but I spend my time; do as you want. –  May 26 '20 at 22:47
  • We may have a misunderstanding, I'm happy to be corrected if I'm wrong. I think the last sentence of the answer suggests that the derivative of $\log\det X^{-1}$ is $-X^{-1}$ if $X$ is symmetric. Do you agree that this is the meaning of the sentence? – evangelos May 26 '20 at 23:06
1

Warning!

The answers given so far work only if $X \in \mathbb{R}^{n\times n}$ is not symmetric and has $n^2$ independent variables! If $X$ is symmetric, then it has only $n(n+1)/2$ independent variables and the correct formula is

$$\frac{\partial \log\det X^{-1}}{\partial X} = -\frac{\partial \log\det X}{\partial X} = -(2X^{-1}-\text{diag}(y_{11}, \dots, y_{nn})),$$

where $y_{ii}$ is the $i$-th entry on the diagonal of $X^{-1}$. This question explains why this is the case.
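
A small worked case may make the factor of $2$ concrete (an illustration under the assumption that the symmetric matrix is parametrized by its $n(n+1)/2 = 3$ free entries $a, b, c$). Take

$$X = \begin{pmatrix} a & b \\ b & c \end{pmatrix}, \qquad -\log\det X = -\log(ac - b^2), \qquad X^{-1} = \frac{1}{ac-b^2}\begin{pmatrix} c & -b \\ -b & a \end{pmatrix}.$$

Differentiating with respect to the three free parameters gives

$$\frac{\partial}{\partial a} = -\frac{c}{ac-b^2}, \qquad \frac{\partial}{\partial b} = \frac{2b}{ac-b^2}, \qquad \frac{\partial}{\partial c} = -\frac{a}{ac-b^2},$$

which matches the entries of $-(2X^{-1} - \text{diag}(y_{11}, y_{22}))$: the diagonal entries equal $-(X^{-1})_{ii}$, while the off-diagonal entry is $-2(X^{-1})_{12}$ because the single variable $b$ appears in two positions of $X$.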

evangelos
  • Did you try to compute the Hessian of the logdet for a symmetric matrix? – VanBaffo Jul 20 '20 at 13:59
  • Hessian of the logdet? Not really, I never needed to go beyond the gradient of logdet! But it shouldn't be very difficult since the gradient of logdet is not a very complicated expression. I also recall having read about it (in Harville's book -- Matrix Algebra from a Statistician's Perspective) – evangelos Jul 21 '20 at 01:14
  • I just had a quick look at page 313 of the book. It seems he does not consider the symmetric case for the computation of the Hessian. Since he computes the Hessian for the non-symmetric case, would it be legitimate to also add the derivative of the second term to form the "complete" Hessian? I.e. the derivative of the diagonal of the inverse matrix that arises in the first-order derivative for symmetric matrices – VanBaffo Jul 21 '20 at 10:20
  • What you describe sounds right to me but I'm not a mathematician so don't take my word for it! I think your question sounds like a great question to be posted. If you do so, you can link to my questions (the one that I mention in this answer and this one: https://math.stackexchange.com/questions/3689627/taylor-expansion-of-a-function-of-a-symmetric-matrix), and while posting it you can share it with the users who answered those questions. – evangelos Jul 23 '20 at 23:47