**The problem:**

Let $S \in \mathbb{C}^{N\times M}$ with $N > M$ and $S^{H}S=\mathbb{I}$, let $\rho \in \mathbb{C}^{M\times M}$ and $\sigma \in \mathbb{C}^{N\times N}$ be Hermitian matrices of trace $1$, and define the function $D: \mathbb{C}^{N\times M} \rightarrow \mathbb{R}$ as:

$$D(S) = \text{tr}(|S\rho S^{H} - \sigma|),$$

with $|X| = (X^{H}X)^{1/2}$ and $^H$ denoting the Hermitian transpose, i.e., $D$ is the trace distance (up to the conventional factor of $\tfrac{1}{2}$). My goal is to compute $\nabla_S D(S)$, the gradient of $D$ with respect to $S$.
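To make the setup concrete, here is a minimal NumPy sketch that evaluates $D(S)$ on random test data. It reads $|X|$ as $(X^{H}X)^{1/2}$, so that $\text{tr}|X|$ is the sum of the singular values of $X$. The helper names (`random_density`, `random_isometry`, `trace_norm_distance`) and the choice of positive semidefinite test matrices are my own assumptions for illustration, not part of the problem statement.

```python
import numpy as np

def random_density(n, rng):
    # Assumption: use a random positive semidefinite Hermitian matrix of trace 1.
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = g @ g.conj().T
    return h / np.trace(h).real

def random_isometry(n, m, rng):
    # Reduced QR of a random n x m matrix gives Q with Q^H Q = I_m.
    x = rng.standard_normal((n, m)) + 1j * rng.standard_normal((n, m))
    q, _ = np.linalg.qr(x)
    return q

def trace_norm_distance(S, rho, sigma):
    # D(S) = tr |S rho S^H - sigma| = sum of singular values of the difference.
    A = S @ rho @ S.conj().T - sigma
    return np.linalg.svd(A, compute_uv=False).sum()

rng = np.random.default_rng(0)
N, M = 5, 3
S = random_isometry(N, M, rng)
rho = random_density(M, rng)
sigma = random_density(N, rng)
D_value = trace_norm_distance(S, rho, sigma)
print(D_value)
```

Since $S\rho S^{H} - \sigma$ is Hermitian here, the same value could also be obtained as the sum of the absolute values of its eigenvalues.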

**My approach:**

I defined the following variables:

$$A = S\rho S^{H} - \sigma, \qquad B = A^{H}A.$$

$D$ then becomes:

$$D = \text{tr}(B^{1/2}).$$

The goal is now to take the differential of $D$ and rearrange terms to eventually arrive at something like:

$$dD = \text{tr} (K dS),$$

with the transpose of $K$, $K^T$, being the gradient we're looking for.

My progress so far:

$$dD = d(\text{tr}(B^{1/2})) = \text{tr}(d(B^{1/2}))$$ $$dD = \frac{1}{2}\text{tr}(B^{-1/2}\, dB) \quad \text{(assuming $B$ is invertible)}$$

We have:

$$dB = (dA)^HA + A^HdA$$

And:

$$dA = dS\rho S^H + S\rho (dS)^H$$

I will now get terms with $dS$ and terms with $(dS)^H$ and I'm not sure how to manipulate them to get to an expression from which I can read out the gradient. Is this even the (or a) right approach?
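Before rearranging anything, the differentials above can be sanity-checked numerically. The sketch below is only a check under stated assumptions, not a derivation of the gradient: it uses random positive semidefinite test matrices, treats $dS$ as an arbitrary small perturbation (ignoring the constraint $S^{H}S=\mathbb{I}$), assumes $B$ is invertible, and evaluates $dD = \frac{1}{2}\text{tr}(B^{-1/2}\,dB)$ with $dB$ and $dA$ as given above, comparing it with a central finite difference of $D$. All variable and function names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 5, 3

def random_density(n):
    # Assumption: random positive semidefinite Hermitian matrix of trace 1.
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = g @ g.conj().T
    return h / np.trace(h).real

def D(S, rho, sigma):
    # tr (A^H A)^{1/2} equals the sum of the singular values of A.
    A = S @ rho @ S.conj().T - sigma
    return np.linalg.svd(A, compute_uv=False).sum()

rho, sigma = random_density(M), random_density(N)
S, _ = np.linalg.qr(rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M)))

# Arbitrary perturbation direction, normalized to unit Frobenius norm.
dS = rng.standard_normal((N, M)) + 1j * rng.standard_normal((N, M))
dS /= np.linalg.norm(dS)

# Differentials from the text.
A = S @ rho @ S.conj().T - sigma
B = A.conj().T @ A
dA = dS @ rho @ S.conj().T + S @ rho @ dS.conj().T
dB = dA.conj().T @ A + A.conj().T @ dA

# B^{-1/2} from the eigendecomposition of the Hermitian matrix B.
w, V = np.linalg.eigh(B)
B_inv_sqrt = (V / np.sqrt(w)) @ V.conj().T
dD_formula = 0.5 * np.trace(B_inv_sqrt @ dB).real

# Central finite difference of D along dS.
eps = 1e-7
dD_fd = (D(S + eps * dS, rho, sigma) - D(S - eps * dS, rho, sigma)) / (2 * eps)
print(dD_formula, dD_fd)
```

If the two printed numbers agree to several digits, the expressions for $dA$, $dB$ and $dD$ are at least consistent, and what remains is purely the algebra of collecting the $dS$ and $(dS)^H$ terms.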