181

I have two square matrices: $A$ and $B$. $A^{-1}$ is known and I want to calculate $(A+B)^{-1}$. Are there theorems that help with calculating the inverse of the sum of matrices? In general case $B^{-1}$ is not known, but if it is necessary then it can be assumed that $B^{-1}$ is also known.

Michael Hardy
Tomek Tarczynski

12 Answers

167

In general, $A+B$ need not be invertible, even when $A$ and $B$ are. But one might ask whether you can have a formula under the additional assumption that $A+B$ is invertible.

As noted by Adrián Barquero, there is a paper by Ken Miller published in the Mathematics Magazine in 1981 that addresses this.

He proves the following:

Lemma. If $A$ and $A+B$ are invertible, and $B$ has rank $1$, then let $g=\operatorname{trace}(BA^{-1})$. Then $g\neq -1$ and $$(A+B)^{-1} = A^{-1} - \frac{1}{1+g}A^{-1}BA^{-1}.$$
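As a quick numerical sanity check of the lemma, here is a minimal sketch in Python with NumPy. The matrices and the helper name `rank_one_update_inverse` are illustrative assumptions, not part of Miller's paper.

```python
import numpy as np

def rank_one_update_inverse(A_inv, B):
    """Hypothetical helper: apply the lemma, assuming B has rank 1 and A + B is invertible."""
    g = np.trace(B @ A_inv)                      # g = trace(B A^{-1})
    return A_inv - (A_inv @ B @ A_inv) / (1.0 + g), g

# Illustrative data (assumed): A is invertible, B = u v^T has rank 1.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
u = np.array([[1.0], [2.0]])
v = np.array([[3.0], [1.0]])
B = u @ v.T

A_inv = np.linalg.inv(A)
rank_one_inverse, g = rank_one_update_inverse(A_inv, B)

# Compare with inverting A + B directly.
print(np.allclose(rank_one_inverse, np.linalg.inv(A + B)))
```

Only $A^{-1}$ and matrix products are needed here; no new matrix is inverted.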

From this lemma, we can take a general $A+B$ that is invertible and write it as $A+B = A + B_1+B_2+\cdots+B_r$, where $B_i$ each have rank $1$ and such that each $A+B_1+\cdots+B_k$ is invertible (such a decomposition always exists if $A+B$ is invertible and $\mathrm{rank}(B)=r$). Then you get:

Theorem. Let $A$ and $A+B$ be nonsingular matrices, and let $B$ have rank $r\gt 0$. Let $B=B_1+\cdots+B_r$, where each $B_i$ has rank $1$, and each $C_{k+1} = A+B_1+\cdots+B_k$ is nonsingular. Setting $C_1 = A$, then $$C_{k+1}^{-1} = C_{k}^{-1} - g_kC_k^{-1}B_kC_k^{-1}$$ where $g_k = \frac{1}{1 + \operatorname{trace}(C_k^{-1}B_k)}$. In particular, $$(A+B)^{-1} = C_r^{-1} - g_rC_r^{-1}B_rC_r^{-1}.$$

(If the rank of $B$ is $0$, then $B=0$, so $(A+B)^{-1}=A^{-1}$).
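The recursion in the theorem can be sketched in Python with NumPy as follows. This is only an illustration under stated assumptions: the function name `inverse_of_sum` is hypothetical, and $B$ is split column by column (each piece has rank at most $1$), which is one convenient choice and not necessarily the decomposition into exactly $r$ terms used in the paper. The sketch raises an error if an intermediate sum turns out to be singular.

```python
import numpy as np

def inverse_of_sum(A_inv, B, tol=1e-12):
    """Sketch: build (A+B)^{-1} from A^{-1} by repeated rank-one updates."""
    n = B.shape[0]
    C_inv = A_inv.copy()                     # C_1^{-1} = A^{-1}
    for k in range(n):
        e_k = np.zeros(n)
        e_k[k] = 1.0
        B_k = np.outer(B[:, k], e_k)         # k-th column of B, zeros elsewhere
        if not B_k.any():
            continue                         # zero column: nothing to add
        denom = 1.0 + np.trace(C_inv @ B_k)
        if abs(denom) < tol:
            # The lemma needs every partial sum to be invertible.
            raise np.linalg.LinAlgError("intermediate sum is singular")
        C_inv = C_inv - (C_inv @ B_k @ C_inv) / denom
    return C_inv

# Illustrative matrices (assumed); every partial sum here is invertible.
A = np.array([[4.0, 1.0, 0.0], [1.0, 3.0, 1.0], [0.0, 1.0, 2.0]])
B = np.array([[1.0, 2.0, 0.0], [0.0, 1.0, 1.0], [2.0, 0.0, 1.0]])
result = inverse_of_sum(np.linalg.inv(A), B)
print(np.allclose(result, np.linalg.inv(A + B)))
```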

Michael Hardy
Arturo Magidin
  • 3
    Thanks, I was looking for something like this. – Tomek Tarczynski Jan 17 '11 at 15:18
  • 26
    The lemma is the [Sherman-Morrison formula](https://en.wikipedia.org/wiki/Sherman%E2%80%93Morrison_formula), isn't it? –  Apr 30 '14 at 08:15
  • 2
    Can this theorem be used in finding the inverse of $${\large[}g_{\mu\nu}+\chi \frac{k_\mu k_\nu}{k^2}{\large]}$$ where $g$ is the Minkowski metric tensor and the $k$'s are four-vectors? Please see this question in Physics.SE: http://physics.stackexchange.com/q/141613/31965 Thanks. – Physics_maths Oct 16 '14 at 15:59
  • 8
    What about the case of $ \left( A + \lambda I \right)^{-1} $? Let's assume $ A $ is PSD. – Royi Aug 22 '17 at 07:44
  • 2
    I am also interested in the case $(\mathbf{A}+\mathbf{I})^{-1}$. Please see https://math.stackexchange.com/questions/2680914/inverse-of-symmetric-matrix-plus-identity-matrix?noredirect=1#comment5539327_2680914 – TheDon Mar 09 '18 at 16:00
  • Related : https://math.stackexchange.com/q/2977195/2987 – Rajesh D Oct 30 '18 at 11:19
  • It may be useful to note that the rank of the outer product, $\mathbf{u}\mathbf{v^T}$, of nonzero vectors $\mathbf{u}$ and $\mathbf{v}$, is 1. So the lemma can be used in cases where $B=\mathbf{u}\mathbf{v^T}$, which may come up in linear regression settings when $B = \mathbf{x_i}\mathbf{x_i^T}$ – bob Apr 10 '20 at 16:44
  • That's a nice lemma. Used it to solve "inverse of a variance in random effects ANOVA" in 3 lines! – Ufos Apr 25 '21 at 14:45
  • 1
    @bob, every rank one matrix has that form (i.e. is outer product of two vectors). – Peter Morfe Jun 24 '21 at 16:43
62

It is shown in On Deriving the Inverse of a Sum of Matrices that

$(A+B)^{-1}=A^{-1}-A^{-1}B(A+B)^{-1}$.

This equation cannot be used directly to compute $(A+B)^{-1}$, since that inverse appears on both sides, but it is useful for perturbation analysis where $B$ is a perturbation of $A$. There are several other variations of the above form (see equations (22)-(26) in this paper).

This result is useful because it only requires $A$ and $A+B$ to be nonsingular. By comparison, the SMW identity and Ken Miller's paper (as mentioned in the other answers) require some nonsingularity or rank conditions on $B$.
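To illustrate the perturbation use, here is a small NumPy sketch. The matrices are made up for the example, and the first-order estimate (replacing $(A+B)^{-1}$ on the right-hand side by $A^{-1}$) is a standard approximation that I am assuming fits what is meant by perturbation analysis here.

```python
import numpy as np

# Illustrative data (assumed): B is a small perturbation of A.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
B = 1e-3 * np.array([[1.0, -2.0], [0.5, 1.0]])

A_inv = np.linalg.inv(A)
S_inv = np.linalg.inv(A + B)

# The identity itself: (A+B)^{-1} = A^{-1} - A^{-1} B (A+B)^{-1}
identity_holds = np.allclose(S_inv, A_inv - A_inv @ B @ S_inv)

# First-order estimate: substitute A^{-1} for (A+B)^{-1} on the right.
first_order = A_inv - A_inv @ B @ A_inv
error = np.linalg.norm(S_inv - first_order)

print(identity_holds, error)
```

The remaining error is second order in $B$, so it shrinks quadratically as the perturbation gets smaller.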

Shiyu
39

I found this by accident.

Suppose we are given $A$ and $B$, where $A$ and $A+B$ are invertible. We want an expression for $(A+B)^{-1}$ that does not require inverting $A+B$ directly. The intuition is as follows. Suppose we can write $(A+B)^{-1} = A^{-1} + X$; the following straightforward steps then give $X$:
\begin{equation} (A+B)^{-1} = A^{-1} + X \end{equation}
\begin{equation} (A^{-1} + X) (A + B) = I \end{equation}
\begin{equation} A^{-1} A + X A + A^{-1} B + X B = I \end{equation}
\begin{equation} X(A + B) = - A^{-1} B \end{equation}
\begin{equation} X = - A^{-1} B ( A + B)^{-1} \end{equation}
\begin{equation} X = - A^{-1} B (A^{-1} + X) \end{equation}
\begin{equation} (I + A^{-1}B) X = - A^{-1} B A^{-1} \end{equation}
\begin{equation} X = - (I + A^{-1}B)^{-1} A^{-1} B A^{-1} \end{equation}
The last step needs $I + A^{-1}B$ to be invertible, which holds because $I + A^{-1}B = A^{-1}(A+B)$.

This lemma is a simplification of the lemma presented by Ken Miller (1981).

Muhammad Fuady
  • 3
    Where did you find this? Can you give a citation? – Daniel Renshaw Jun 11 '13 at 10:55
  • 2
    How is this a simplification of the lemma shown in Ken Miller 1981? Are we talking about "On the Inverse of the Sum of Matrices" or any other work? (In any case, I find this property quite useful, just need to cite it properly). – Rufo Apr 10 '14 at 15:15
  • 2
    Interesting to notice that line 3 is a Sylvester equation. – ati Dec 11 '14 at 16:26
  • 6
    To conclude the last line, we must have $I+A^{-1}B$ invertible. How can we be sure of that? It might be easy, but I am not seeing it. Can you please explain, @Muhammad Fuady? – Sry Feb 16 '15 at 05:43
  • 2
    @Sry: I'm not certain how this formula helps. For example, the deduction $(I+A^{-1}B)^{-1} = (A+B)^{-1} A$ is direct, so the above formula is basically just the statement $(A+B)^{-1}(A+B)=I$. Among other things, $I+A^{-1}B$ is invertible if and only if $A+B$ is invertible, i.e. you have to check invertibility of two equivalent matrices. – Ryan Budney Oct 29 '19 at 18:58
38

$(A+B)^{-1} = A^{-1} - A^{-1}BA^{-1} + A^{-1}BA^{-1}BA^{-1} - A^{-1}BA^{-1}BA^{-1}BA^{-1} + \cdots$

provided $\|A^{-1}B\|<1$ or $\|BA^{-1}\| < 1$ (here $\|\cdot\|$ means norm). This is just the Taylor expansion of the inversion function together with basic information on convergence.
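A truncated version of this series is easy to try numerically. The sketch below uses NumPy; the function name `neumann_inverse`, the number of terms, and the test matrices are assumptions chosen so that $\|A^{-1}B\|<1$ holds.

```python
import numpy as np

def neumann_inverse(A_inv, B, terms=50):
    """Sketch: sum the first `terms` + 1 terms of A^{-1} - A^{-1}BA^{-1} + ..."""
    M = -A_inv @ B
    term = A_inv.copy()          # k-th term is (-A^{-1}B)^k A^{-1}
    total = A_inv.copy()
    for _ in range(terms):
        term = M @ term
        total = total + term
    return total

# Illustrative matrices (assumed) with ||A^{-1}B|| < 1 in the spectral norm.
A = np.array([[4.0, 1.0], [2.0, 5.0]])
B = np.array([[1.0, 0.5], [-0.5, 1.0]])
A_inv = np.linalg.inv(A)

norm_AinvB = np.linalg.norm(A_inv @ B, 2)
approx = neumann_inverse(A_inv, B)
print(norm_AinvB, np.allclose(approx, np.linalg.inv(A + B)))
```

The smaller the norm, the fewer terms are needed; when the norm is close to $1$ the series converges slowly.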

(posted essentially at the same time as mjqxxxx)

J. M. ain't a mathematician
Ryan Budney
32

I'm surprised that no one has pointed out that this is a special case of the well-known matrix inversion lemma, or Woodbury matrix identity, which says

$ \left(A+UCV \right)^{-1} = A^{-1} - A^{-1}U \left(C^{-1}+VA^{-1}U \right)^{-1} VA^{-1}$ .

Setting $U=V=I$ (and assuming $C$ is invertible) immediately gives

$ \left(A+C \right)^{-1} = A^{-1} - A^{-1} \left(C^{-1}+A^{-1} \right)^{-1} A^{-1}$ .
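The identity pays off mostly when $U$ and $V$ are thin, because then only a small matrix has to be inverted. A minimal NumPy sketch, with sizes and matrices chosen purely for illustration (a diagonal $A$ and a rank-2 symmetric update are my assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2

# Illustrative setup (assumed): A is diagonal, so A^{-1} is trivial to form.
d = np.arange(1.0, n + 1)
A = np.diag(d)
A_inv = np.diag(1.0 / d)

U = rng.standard_normal((n, k))
C = np.eye(k)
V = U.T                                   # update U C V is symmetric with rank <= k

# Woodbury: only the k-by-k matrix C^{-1} + V A^{-1} U is inverted.
small = np.linalg.inv(C) + V @ A_inv @ U
woodbury = A_inv - A_inv @ U @ np.linalg.inv(small) @ V @ A_inv

print(np.allclose(woodbury, np.linalg.inv(A + U @ C @ V)))
```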

wayne
21

A formal power series expansion is possible: $$ \begin{eqnarray} (A + \epsilon B)^{-1} &=& \left(A \left(I + \epsilon A^{-1}B\right)\right)^{-1} \\ &=& \left(I + \epsilon A^{-1}B\right)^{-1} A^{-1} \\ &=& \left(I - \epsilon A^{-1}B + \epsilon^2 A^{-1}BA^{-1}B - ...\right) A^{-1} \\ &=& A^{-1} - \epsilon A^{-1} B A^{-1} + \epsilon^2 A^{-1} B A^{-1} B A^{-1} - ... \end{eqnarray} $$ Under appropriate conditions on the eigenvalues of $A$ and $B$ (such that $A$ is sufficiently "large" compared to $B$), this will converge to the correct result at $\epsilon=1$.

mjqxxxx
  • 2
    The point about eigenvalues is apt, because this works even if $\|A^{-1}B\|\geq1$ and $\|BA^{-1}\|\geq1$ as long as the spectral radius of $A^{-1}B$ or $BA^{-1}$ is less than $1$. – Jonas Meyer Jan 17 '11 at 02:20
  • What about the case of $ \left( A + \lambda I \right)^{-1} $? Let's assume $ A $ is PSD. – Royi Aug 22 '17 at 07:45
  • Royi check Neumann series https://en.wikipedia.org/wiki/Neumann_series – fr_andres May 29 '21 at 05:37
13

Assuming everything is nicely invertible, you are probably looking for the Sherman-Morrison-Woodbury (SMW) identity (which, I think, can also be generalized to pseudoinverses if needed).

Please see caveat in the comments below; in general if $B$ is low-rank, then you'd be happy using SMW.

  • It also requires $(A^{-1} + B^{-1})^{-1}$ to be known, doesn't it? – mjqxxxx Jan 16 '11 at 21:06
  • The Sherman-Morrison "update" formula is most efficient if $B$ is of low rank. So the usual application (rank one or two if symmetry is to be preserved) doesn't require $B^{-1}$ to exist. – hardmath Jan 16 '11 at 21:07
  • @mjqxxxx: yes, actually SMW does require that inverse, which renders this answer useless unless one is looking for inverses where $B$ is low-rank and is written as $B=UCV^T$. –  Jan 16 '11 at 22:01
6

It is possible to come up with pretty simple examples where $A$, $A^{-1}$, $B$, and $B^{-1}$ are all very nice, but applying $(A+B)^{-1}$ is considered very difficult.

The canonical example is where $A = \Delta$ is a finite difference implementation of the Laplacian on a regular grid (with, for example, Dirichlet boundary conditions), and $B=k^2I$ is a multiple of the identity. The finite difference Laplacian and its inverse are very nice and easy to deal with, as is the identity matrix. However, the combination $$\Delta + k^2 I$$ is the Helmholtz operator, which is widely known to be extremely difficult to solve for large $k$.

Nick Alger
4

If $A$ and $B$ were numbers, there would be no simpler way to write $\frac{1}{A+B}$ in terms of $\frac{1}{A}$ and $B$, so I don't know why you would expect there to be one for matrices. It is even possible to have matrices $A$ and $B$ such that neither $A^{-1}$ nor $B^{-1}$ exists but $(A+B)^{-1}$ does or, conversely, such that both $A^{-1}$ and $B^{-1}$ exist but $(A+B)^{-1}$ doesn't.
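Both situations are easy to exhibit with $2\times 2$ matrices. The following NumPy sketch uses examples of my own choosing (the names `P` and `Q` are just labels for the second pair):

```python
import numpy as np

# Neither A nor B is invertible, yet A + B = I is.
A = np.array([[1.0, 0.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [0.0, 1.0]])
ranks_first = [int(np.linalg.matrix_rank(M)) for M in (A, B, A + B)]

# Both P and Q are invertible, yet P + Q = 0 is not.
P = np.eye(2)
Q = -np.eye(2)
ranks_second = [int(np.linalg.matrix_rank(M)) for M in (P, Q, P + Q)]

# A 2x2 matrix is invertible exactly when its rank is 2.
print(ranks_first, ranks_second)
```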

syockit
leaveswater02
3

Starting from @Shiyu's answer about perturbations, we can subtract $(A+B)^{-1}$ from both sides and factor to arrive directly at

$$0=A^{-1}-(A^{-1}B+I)(A+B)^{-1}$$ followed by $$(A+B)^{-1}=(A^{-1}B+I)^{-1}A^{-1}$$

And by symmetry, when $B$ is invertible,

$$(A+B)^{-1}=(B^{-1}A+I)^{-1}B^{-1}$$

Now remember that $(I+X)^{-1}$ can be expanded as the geometric series $I-X+X^2-X^3+\cdots$, which converges when the spectral radius of $X$ is less than $1$.

So if $X=B^{-1}A$ or $X=A^{-1}B$, and multiplication by $A$, $B$, and either $A^{-1}$ or $B^{-1}$ is cheap, then this could work better than some other methods of finding the inverse.

mathreadler
2

Extending Muhammad Fuady's approach, we have
\begin{equation} (A+B)^{-1} = A^{-1} + X \end{equation}
\begin{equation} X = - (I + A^{-1}B)^{-1} A^{-1} B A^{-1} \end{equation}
so
\begin{equation} (A+B)^{-1} = A^{-1} - (I + A^{-1}B)^{-1} A^{-1} B A^{-1} \tag{1}\label{eq1} \end{equation}
This rearranges to
\begin{equation} (A+B)^{-1} = (I - (I + A^{-1}B)^{-1} A^{-1} B )A^{-1} \tag{2}\label{eq2} \end{equation}
Consider the factor
\begin{equation} (I + A^{-1}B)^{-1} \end{equation}
This is itself the inverse of a sum of two matrices, so we can apply \eqref{eq2} with $I$ in place of $A$ and $A^{-1}B$ in place of $B$, which gives
\begin{equation} (I + A^{-1}B)^{-1} = (I - (I + A^{-1}B)^{-1}A^{-1}B ) \end{equation}
The right-hand side is exactly the bracketed factor in \eqref{eq2}, so we can replace that factor with the left-hand side, giving
\begin{equation} (A+B)^{-1} = (I + A^{-1}B)^{-1}A^{-1} \tag{3}\label{eq3} \end{equation}
which is simpler than \eqref{eq1} and closely resembles the scalar identity
\begin{equation} \frac{1}{a+b}=\frac{1}{\left(1+\frac{b}{a}\right)a} \tag{4}\label{eq4} \end{equation}

The technique is useful in computation: if the entries of $A$ and $B$ differ greatly in magnitude, computing $(A+B)^{-1}$ according to \eqref{eq3} can give a more accurate floating-point result than summing the two matrices first.

Dan
2

I know the question has been answered multiple times with great answers, but with my answer you don't need to memorize any lemmas or formulas.

Suppose $(A+B)x=y$, then $x=(A+B)^{-1}y$. This is all we need to get. The steps are:

(1) Start with $(A+B)x=y$.

(2) Then $Ax=y-Bx$, so $x=A^{-1}y -A^{-1}Bx$.

(3) Multiply $x$ in step (2) by $B$ to get $$Bx=BA^{-1}y -BA^{-1}Bx$$ which is equivalent to $$(I+BA^{-1})Bx=BA^{-1}y $$ or, $$Bx=(I+BA^{-1})^{-1}BA^{-1}y $$

(4) Substitute this $Bx$ into the expression for $x$ in step (2) to get $$x=A^{-1}y -A^{-1}(I+BA^{-1})^{-1}BA^{-1}y $$

(5) Factoring out $y$ gives the required result. $$x=(A^{-1} -A^{-1}(I+BA^{-1})^{-1}BA^{-1})y $$

(6) The assumptions we have used are that $A$ and $I+BA^{-1}$ are nonsingular.

(7) We can factor out $A^{-1}$ to get: $$(A+B)^{-1}=A^{-1}(I -(I+BA^{-1})^{-1}BA^{-1})$$
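The final formula can be checked numerically. This is a minimal NumPy sketch with made-up $A$, $B$ and $y$; it uses `np.linalg.solve` to apply $(I+BA^{-1})^{-1}$ without forming that inverse explicitly, which is my choice and not part of the derivation.

```python
import numpy as np

# Illustrative data (assumed): A and A + B are both invertible.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
B = np.array([[1.0, 0.0], [2.0, 1.0]])
y = np.array([1.0, -1.0])

A_inv = np.linalg.inv(A)
I = np.eye(2)
BA_inv = B @ A_inv

# (A+B)^{-1} = A^{-1} (I - (I + B A^{-1})^{-1} B A^{-1})
inv_formula = A_inv @ (I - np.linalg.solve(I + BA_inv, BA_inv))
x = inv_formula @ y

# x should solve the original system (A + B) x = y.
print(np.allclose((A + B) @ x, y))
```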

Asad Mehasi