2

In a lecture on PCA the lecturer claimed that (and I've also found it in this answer but it had 17 comments and I didn't want to make them 17 more)

instead of calculating the eigenvalue decomposition of $S := X X^T$, we compute the singular value decomposition of $X$, as this is computationally more stable, i.e. the Läuchli matrix $$X := \begin{pmatrix} 1 & 1 & 1 \\ \varepsilon & 0 & 0 \\ 0 & \varepsilon & 0 \\ 0 & 0 & \varepsilon \end{pmatrix}^{\top} = \begin{pmatrix} 1 & \varepsilon & 0 & 0 \\ 1 & 0 & \varepsilon & 0 \\ 1 & 0 & 0 & \varepsilon \end{pmatrix}.$$

Now I used WolframAlpha (I don't have access to Mathematica) to calculate the SVD of $X$: $$ \begin{pmatrix} -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \\ 0 & \sqrt{\frac{2}{3}} & \frac{1}{\sqrt{3}} \\ \frac{1}{\sqrt{2}} & - \frac{1}{\sqrt{6}} & \frac{1}{\sqrt{3}} \end{pmatrix} \begin{pmatrix} \varepsilon & 0 & 0 & 0\\ 0 & \varepsilon & 0 & 0\\ 0 & 0 & \sqrt{\varepsilon^2 + 3} & 0 \end{pmatrix} \begin{pmatrix} 0 & 0 & \frac{3}{\sqrt{3 \varepsilon^2 + 9}} & -\frac{\varepsilon}{\sqrt{\varepsilon^2 + 3}} \\ -\frac{\varepsilon}{\sqrt{2 \varepsilon}} & -\frac{\varepsilon}{\sqrt{6 \varepsilon}} & \frac{\varepsilon}{\sqrt{3 \varepsilon^2 + 9}} & \frac{1}{\sqrt{\varepsilon^2 + 3}} \\ 0 & \sqrt{\frac{2}{3}}& \frac{\varepsilon}{\sqrt{3 \varepsilon^2 + 9}} & \frac{1}{\sqrt{\varepsilon^2 + 3}} \\ \frac{\varepsilon}{\sqrt{2 \varepsilon}} & -\frac{\varepsilon}{\sqrt{6 \varepsilon}} & \frac{\varepsilon}{\sqrt{3 \varepsilon^2 + 9}} & \frac{1}{\sqrt{\varepsilon^2 + 3}} \end{pmatrix}^{\top} $$ and the eigendecomposition of $$S = \begin{pmatrix} \varepsilon^2 + 1 & 1 & 1 \\ 1 & \varepsilon^2 + 1 & 1 \\ 1 & 1 & \varepsilon^2 + 1 \end{pmatrix} = \begin{pmatrix} -1 & -1 & 1 \\ 0 & 1 & 1 \\ 1 & 0 & 1 \end{pmatrix} \begin{pmatrix} \varepsilon^2 & 0 & 0 \\ 0 & \varepsilon^2 & 0 \\ 0 & 0 & \varepsilon^2 + 3 \end{pmatrix} \cdot \frac{1}{3}\begin{pmatrix} -1 & -1 & 1 \\ -1 & 1 & -1 \\ 1 & 1 & 1 \end{pmatrix}. $$ Now if $\varepsilon > 0$ is small, terms of the form $\frac{3}{\sqrt{3 \varepsilon^2 + 9}}$ probably cause trouble, whereas the diagonalization (aka eigendecomposition) seems to be very easily computable. Therefore I expect the operation that yield those decompositions to be computationally more stable but I can't figure out how. What did I miss?

Ramanujan
  • 4,236
  • 5
  • 17
  • 43
  • @J. M.isapoormathematician as this is more a less a comment on your answer, I would like to to draw your attention to this post. – Ramanujan Feb 26 '20 at 18:22
  • in general when working numerically you never want to multiply a matrix by its transpose -- whether that's $XX^T$ or $X^T X$. E.g. you may have learned to do this with the normal equations and it is fine to think of them that way but in terms of actual numerical implementation, you want to run a routine that tacitly does this, not explicitly. Exact numerical reasons are complicated. It's not an apples to apples comparison here -- you didn't even select orthonormal eigenvectors for $S$, and in general numerical issues won't show up in symbolic computations. – user8675309 Feb 26 '20 at 19:32

0 Answers0