Questions tagged [matrix-calculus]

Matrix calculus is about doing calculus, especially derivatives and infinite series, over spaces of vectors and matrices.

Matrix calculus studies derivatives and differentials of scalar-, vector- and matrix-valued functions with respect to vectors and matrices. It is widely applied in areas such as machine learning, numerical analysis and economics.

There are two basic methods.

  • Direct: Treat vectors and matrices as single objects, as if they were scalars, and compute as in ordinary calculus while keeping the order of products. The Matrix Cookbook collects many of the basic identities.

  • Component-wise: Write everything in index notation and compute entry by entry in the usual way. The Einstein summation convention is frequently used.
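As a small worked illustration of the two methods (a sketch under the assumption that $f(x) = x^T A x$ with a constant matrix $A \in \mathbb{R}^{n \times n}$ and $x \in \mathbb{R}^n$; the example is chosen here for illustration and is not part of the tag description):

Direct: $$ df = (dx)^T A x + x^T A \, dx = x^T (A + A^T) \, dx \quad \Longrightarrow \quad \nabla_x f = (A + A^T) x. $$

Component-wise, summing over repeated indices: $$ f = x_i A_{ij} x_j, \qquad \frac{\partial f}{\partial x_k} = \delta_{ik} A_{ij} x_j + x_i A_{ij} \delta_{jk} = A_{kj} x_j + A_{ik} x_i = \big[ (A + A^T) x \big]_k. $$

Both routes give the same gradient; the direct route is shorter, while the component-wise route makes every index explicit.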

3392 questions
12
votes
2 answers

Calculate the Hessian of a Vector Function

I'm working with optimisation. I am trying to obtain the Hessian of a vector function: $$ \mathbf{F(X) = 0} \quad \text{or} \quad \begin{cases} f_1(x_1,x_2,\dotsc,x_n) = 0,\\ f_2(x_1,x_2,\dotsc,x_n) = 0,\\ \vdots\\ …
12
votes
2 answers

How should I study The Matrix Cookbook?

I use The Matrix Cookbook by Kaare Brandt Petersen and Michael Syskind Pedersen to solve many problems (mostly matrix derivatives). In most cases, I just map the problem to one of the formulas and solve it, but I cannot derive the formula by myself…
12
votes
1 answer

Smoothness of $O(n)$-equivariant maps of positive-definite matrices

$\def\sp{\mathrm{Sym}^+}$Let $\sp \subset GL(n,\mathbb R)$ denote the manifold of positive-definite symmetric $n \times n$ matrices. I am interested in functions $A : \sp \to \sp$ that are equivariant under the natural conjugation action of $O(n)$;…
12
votes
2 answers

Counterexample to the chain-rule

I made the following observation: Let $f(t):=\left(\begin{matrix} 0 &e^{it} \\ e^{-it} & 0 \end{matrix}\right),$ then $f(t)^2= \operatorname{id}$. Thus, we have $\frac{d}{dt}f(t)^2= \frac{d}{dt}\operatorname{id}=0.$ On the other hand yields the…
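For a matrix-valued $f$, the product rule keeps the factors in order. The following is a sketch of that standard computation for the $f$ in this excerpt, not a reproduction of any posted answer: $$ \frac{d}{dt} f(t)^2 = f'(t) f(t) + f(t) f'(t), $$ which equals $2 f(t) f'(t)$ only when $f(t)$ and $f'(t)$ commute. Here $$ f'(t) = \begin{pmatrix} 0 & i e^{it} \\ -i e^{-it} & 0 \end{pmatrix}, \qquad f'(t) f(t) = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}, \qquad f(t) f'(t) = \begin{pmatrix} -i & 0 \\ 0 & i \end{pmatrix}, $$ so the two terms cancel and $\frac{d}{dt} f(t)^2 = 0$, whereas $2 f(t) f'(t) = \begin{pmatrix} -2i & 0 \\ 0 & 2i \end{pmatrix} \neq 0$.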
12
votes
5 answers

Proof for the funky trace derivative: $d (\operatorname{trace} (ABA'C))$?

Given three matrices $A$, $B$ and $C$ such that $ABA^T C$ is a square matrix, the derivative of the trace with respect to $A$ is: $$ \nabla_A \operatorname{trace}( ABA^{T}C ) = CAB + C^T AB^T $$ There is a proof here, page 4 (PDF file). However, I…
Harold
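A formula like the one quoted in this excerpt can be sanity-checked numerically before attempting a proof. The sketch below compares $CAB + C^T A B^T$ with central finite differences of $\operatorname{trace}(ABA^TC)$. It assumes real matrices with $A$ of size $3 \times 2$, $B$ of size $2 \times 2$ and $C$ of size $3 \times 3$; the helper names (`matmul`, `claimed_gradient`, `numerical_gradient`, and so on) are hypothetical and chosen only for this illustration.

```python
import random

def matmul(P, Q):
    """Product of two matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))]
            for i in range(len(P))]

def transpose(P):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*P)]

def trace(P):
    """Sum of the diagonal entries of a square matrix."""
    return sum(P[i][i] for i in range(len(P)))

def f(A, B, C):
    """f(A) = trace(A B A^T C)."""
    return trace(matmul(matmul(matmul(A, B), transpose(A)), C))

def claimed_gradient(A, B, C):
    """C A B + C^T A B^T, the formula quoted in the question."""
    first = matmul(matmul(C, A), B)
    second = matmul(matmul(transpose(C), A), transpose(B))
    return [[u + v for u, v in zip(r1, r2)] for r1, r2 in zip(first, second)]

def numerical_gradient(A, B, C, h=1e-5):
    """Central differences of f with respect to each entry of A."""
    grad = [[0.0] * len(A[0]) for _ in A]
    for i in range(len(A)):
        for j in range(len(A[0])):
            plus = [row[:] for row in A]
            minus = [row[:] for row in A]
            plus[i][j] += h
            minus[i][j] -= h
            grad[i][j] = (f(plus, B, C) - f(minus, B, C)) / (2 * h)
    return grad

def random_matrix(rows, cols, rng):
    """Matrix with entries drawn uniformly from [-1, 1]."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)  # fixed seed so the check is repeatable
m, n = 3, 2             # assumed sizes: A is m x n, B is n x n, C is m x m
A = random_matrix(m, n, rng)
B = random_matrix(n, n, rng)
C = random_matrix(m, m, rng)

exact = claimed_gradient(A, B, C)
approx = numerical_gradient(A, B, C)
max_error = max(abs(e - a)
                for row_e, row_a in zip(exact, approx)
                for e, a in zip(row_e, row_a))
print("largest entrywise discrepancy:", max_error)
```

A small discrepancy supports the formula for this random instance but does not prove it; the check is mainly useful for catching transposition or ordering mistakes.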
11
votes
2 answers

Prove that $e^{t(X+Y)}=e^{tX} e^{tY}$ implies $[X,Y]=0$

I am currently reading about the Baker-Campbell-Hausdorff formula and, in a textbook on Lie algebras, the author shows that if $$[X,[X,Y]] = 0 \quad \text{ and } [Y,[X,Y]] = 0$$ then $$e^{Xt}e^{Yt} = e^{Xt + Yt + \frac{t^{2}}{2}[X,Y]}.$$ where $[X,Y] =…
JessicaK
11
votes
2 answers

Explicit proof of the derivative of a matrix logarithm

Firstly, I'm but a mere physicist, so please be gentle :-) I want to explicitly show that the derivative of the (natural) logarithm of a general $n \times n$ (diagonalizable) matrix $X(x)$ w.r.t. $x$…
Wouter
11
votes
4 answers

If $A^2=2A$, then $A$ is diagonalizable.

I think I should use a double linear transformation but can't find any proper solution. Let $\mathbb F$ be a field, $\mathscr M_n (\mathbb F)$, the set of $n\times n$ matrices with elements in $\mathbb F$, and $A\in \mathscr M_n (\mathbb F)$…
11
votes
3 answers

What is the derivative of $\log \det X$ when $X$ is symmetric?

According to Appendix A.4.1 of Boyd & Vandenberghe's Convex Optimization, the gradient of $f(X):=\log \det X$ is $$\nabla f(X) = X^{-1}$$ The domain of the $f$ here is the set of symmetric matrices $\mathbf S^n$. However, according to the book…
11
votes
2 answers

Derivative of quadratic matrix form with respect to the matrix

Suppose we have the following quadratic form $$ f(M)=x^TMx $$ where $f: \mathbb{R}^{n \times m} \rightarrow \mathbb{R}$, and $x \in \mathbb{R}^n$. As is obvious, this function is linear in $M$. What is the derivative of $f(M)$ with respect to…
Saeed
11
votes
1 answer

Prove that $\nabla_{\mathrm X} \mbox{tr} (\mathrm A \mathrm X^{-1} \mathrm B) = - \mathrm X^{-\top} \mathrm A^\top \mathrm B^\top \mathrm X^{-\top}$

Prove that $$\nabla_{\mathrm X} \mbox{tr} (\mathrm A \mathrm X^{-1} \mathrm B) = - \mathrm X^{-\top} \mathrm A^\top \mathrm B^\top \mathrm X^{-\top}$$ My proof is below. I am interested in other proofs. My proof Let $$f (\mathrm X) := \mbox{tr}…
11
votes
2 answers

Generic rule for matrix differentiation (Hadamard product, element-wise)

I struggle with taking the derivative of the Hadamard product. Let us consider $f(x)=x^TAx=x^T(Ax)$. We know $$\frac{\partial}{\partial x} x^TAx = (A+A^T)x.$$ The Matrix Cookbook states that $d(XY)=d(X)Y+Xd(Y)$ and $$\frac{\partial}{\partial x} x^Ta =…
10
votes
4 answers

Derivative of spectral norm of symmetric matrix

I want to calculate the derivative of the spectral norm of a symmetric square matrix $W$: $$ \frac{\partial}{\partial w_{ij}} \|W\|_2 $$ How should I go about this?
10
votes
5 answers

Prove that if $\operatorname{rank}A=n$, then $\operatorname{rank}AB=\operatorname{rank}B$

Let $A \in M_{m\times n}(\mathbb{R})$ and $B \in M_{n\times p}(\mathbb{R})$. Prove that if $\operatorname{rank}(A)=n$ then $\operatorname{rank}(AB)=\operatorname{rank}(B)$. I tried to start with definitions finding that $n \le m$, but didn't know…
Galc127
10
votes
4 answers

Derivative of matrix involving trace and log

I'm stuck on this problem. Let $X\in\mathbb{R}^{n\times n}$, compute the following matrix derivatives $$\frac{\partial}{\partial X}\mathrm{tr}(\log(XA)\log(XA)^\top),$$ $$\frac{\partial}{\partial X}\mathrm{tr}(B\log(XA)), $$ where $\log(\cdot)$ is…
Ludwig