Given a matrix $M^{n \times n}$, I would like to decompose it into two smaller matrices $A^{n \times m}$ and $B^{m \times n}$, with $m < n$, so that their product $AB = M'$ approximates $M$ as well as possible.

Is there an algorithm that does this? Is it computationally efficient?

As additional information, the values in $M$ are sparse and all positive.

  • The decomposition works only for matrices $M$ with $rank(M) \leq m$. I guess that there is an algorithm, but I can't help you with that. Eventually you could transform the problem into a linear system of equations. – Beni Bogosel Jul 11 '11 at 17:55
  • ...well, decomposition is perhaps the wrong name ...i just don't know how I should have called it else – dagnelies Jul 11 '11 at 18:04

2 Answers


You can easily replace a matrix $M$ by two smaller matrices $A$ and $B$ such that $M'=AB$ is as close to $M$ as possible in the Frobenius norm. This can be achieved in a numerically efficient and stable way by calculating the singular value decomposition, $$M=U \Sigma V^\dagger.$$ You get the required decomposition by keeping only the $m$ largest singular values in $\Sigma$, and setting $$A=U', \qquad\text{and} \qquad B= \Sigma' V^\dagger;$$ here, $\Sigma'$ is the same matrix as $\Sigma$ except that it contains only the $m$ largest singular values, and $U'$ is the same as $U$ but retaining only the columns corresponding to the largest singular values. More information can be found here.
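A minimal NumPy sketch of this truncation (not part of the original answer; the $6 \times 6$ matrix and $m = 2$ are made up for illustration). By the Eckart–Young theorem, the resulting Frobenius error equals the root of the sum of the squared discarded singular values:

```python
import numpy as np

def best_rank_m(M, m):
    """Best rank-m approximation M' = AB in the Frobenius norm via SVD."""
    # numpy returns singular values in descending order
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    A = U[:, :m]                    # n x m: columns for the m largest singular values
    B = np.diag(s[:m]) @ Vt[:m, :]  # m x n
    return A, B

rng = np.random.default_rng(0)
M = np.abs(rng.standard_normal((6, 6)))  # positive entries, as in the question
A, B = best_rank_m(M, 2)
err = np.linalg.norm(M - A @ B)  # Frobenius error of the best rank-2 approximation
```

Any other way of splitting $\Sigma'$ between the two factors (e.g. $A = U'\Sigma'$, $B = V'^\dagger$) gives the same product $AB$.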

  • thanks, this seems to be exactly what I was looking for ...i'll just check it in details tomorrow, just to be sure – dagnelies Jul 11 '11 at 19:24

At first I deleted this answer when I saw Fabian's because obviously he knew what he was talking about and I wasn't :-). But I'm undeleting it because I realized that my method might be computationally more efficient for $m\ll n$, since you don't need to calculate the entire SVD of $M$, which takes $O(n^3)$ time; you only need to perform multiplications that take $O(mn^2)$ and inversions that take $O(m^3)$. You only need to calculate $M^TM$ once, and that doesn't take $O(n^3)$ since $M$ is sparse. If $M$ is sufficiently sparse, you can also treat $M^TM$ as sparse; then the time for the multiplications becomes linear in $n$.

P.S.: The Wikipedia article on the SVD has a section "Truncated SVD" which says that the part of the SVD corresponding to the largest singular values can be calculated much more efficiently; so presumably the same savings apply there, too. I'm leaving the answer undeleted now, but there's probably nothing of practical value to be gained from it if you know how to efficiently calculate the truncated SVD. (The Wikipedia article doesn't indicate how to do that, but there are lots of Google hits for "truncated SVD".)
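As an illustration of that savings (my addition, not from either answer): SciPy's `scipy.sparse.linalg.svds` computes only the $k$ largest singular triplets of a sparse matrix, so the full SVD is never formed. The matrix sizes and density below are made up for the example:

```python
import numpy as np
from scipy.sparse import random as sparse_random
from scipy.sparse.linalg import svds

rng = np.random.default_rng(1)
# Sparse matrix with positive entries, as in the question (2% nonzeros)
M = sparse_random(200, 200, density=0.02, random_state=rng, data_rvs=rng.random)

m = 5
U, s, Vt = svds(M, k=m)  # only the m largest singular triplets are computed
A = U                    # 200 x 5
B = np.diag(s) @ Vt      # 5 x 200
err = np.linalg.norm(M.toarray() - A @ B)
```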

If you mean "approximates $M$ as best as possible" in a least-squares sense, this might lead to a feasible algorithm:

If you hold one of $A$ and $B$ fixed, this is a linear least-squares fit:

$$f(A) = \sum_{ik}\left(\sum_ja_{ij}b_{jk}-m_{ik}\right)^2\to\min$$ $$\frac{\partial f(A)}{\partial a_{ln}}=2\sum_k \left(\sum_ja_{lj}b_{jk}-m_{lk}\right)b_{nk}=0$$ $$A=MB^T(BB^T)^{-1}\;,$$

and analogously, holding $A$ fixed,

$$B=(A^TA)^{-1}A^TM\;.$$
So you might be able to iterate between these two updates to approach the solution, but I have no idea how good the convergence would be. You can also substitute one of these into the other to get, e.g.,

$$ \begin{eqnarray} B &=& \left((BB^T)^{-1}BM^TMB^T(BB^T)^{-1}\right)^{-1}(BB^T)^{-1}BM^TM \\ &=& BB^T(BM^TMB^T)^{-1}BM^TM\;, \end{eqnarray} $$

which you can either regard as an iteration prescription (which cuts the number of matrix inversions in half) or try to solve directly for the fixed point (good luck).

By the way, $A$ and $B$ are only determined up to an invertible matrix, since $(AS^{-1})(SB)=AB$. You can see how that plays out in the above equations: If you start off with $SB$, you get $AS^{-1}$ and vice versa, and $S$ cancels out in the fixed-point equation for $B$.

Here's another way of thinking about this that might help: The columns of $M'$ are in the column space of $A$. So you're looking for $m$ column vectors to make up $A$ such that the sum of the squared distances of the columns of $M$ from the space spanned by these vectors is minimal. (In this view, optimizing $B$ determines the coefficients in the linear combinations of the column vectors of $A$ such that the nearest vector in the column space to each column vector of $M$ is used.) So what you're really looking for isn't the matrix $A$, but its column space, and this should be as close as possible to the column vectors of $M$. The matrix $S$ above just mixes the column vectors of $A$ but doesn't change the column space they span.
