Recently, I answered this question about matrix invertibility using a solution technique I called a "miracle method." The question and answer are reproduced below:

Problem: Let $A$ be a matrix satisfying $A^3 = 2I$. Show that $B = A^2 - 2A + 2I$ is invertible.

Solution: Suspend your disbelief for a moment and suppose $A$ and $B$ were scalars, not matrices. Then, by power series expansion, we would simply be looking for $$ \frac{1}{B} = \frac{1}{A^2 - 2A + 2} = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots$$ where the coefficient of $A^n$ is $$ c_n = \frac{1+i}{2^{n+2}} \left((1-i)^n-i (1+i)^n\right). $$ But we know that $A^3 = 2$, so $$ \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A^4}{8}-\frac{A^5}{8} + \cdots = \frac{1}{2}+\frac{A}{2}+\frac{A^2}{4}-\frac{A}{4}-\frac{A^2}{4} + \cdots $$ and by summing the resulting coefficients on $1$, $A$, and $A^2$, we find that $$ \frac{1}{B} = \frac{2}{5} + \frac{3}{10}A + \frac{1}{10}A^2. $$ Now, what we've just done should be total nonsense if $A$ and $B$ are really matrices, not scalars. But try setting $B^{-1} = \frac{2}{5}I + \frac{3}{10}A + \frac{1}{10}A^2$, compute the product $BB^{-1}$, and you'll find that, miraculously, this answer works!
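The reduction step above can be checked numerically. The following is a minimal sketch and not part of the original post: it generates the Taylor coefficients from the recurrence $2c_n - 2c_{n-1} + c_{n-2} = 0$ (obtained by multiplying the series by $A^2 - 2A + 2$), replaces $A^{3k+r}$ by $2^k A^r$, and accumulates the three running totals. The function name `reduced_series_coefficients` and the cutoff of 300 terms are illustrative choices.

```python
from fractions import Fraction

def reduced_series_coefficients(num_terms=300):
    """Partial sums of the series for 1/(x^2 - 2x + 2), reduced using x^3 = 2.

    Returns the accumulated coefficients of 1, x and x^2 as floats.
    """
    # Taylor coefficients: c_0 = c_1 = 1/2, then c_n = c_{n-1} - c_{n-2}/2.
    coeffs = [Fraction(1, 2), Fraction(1, 2)]
    while len(coeffs) < num_terms:
        coeffs.append(coeffs[-1] - coeffs[-2] / 2)

    totals = [Fraction(0)] * 3
    for n, c_n in enumerate(coeffs):
        # x^n = x^(3k + r) = 2^k * x^r when x^3 = 2.
        k, r = divmod(n, 3)
        totals[r] += c_n * 2**k
    return [float(t) for t in totals]

print(reduced_series_coefficients())  # approaches 2/5, 3/10, 1/10
```

The partial sums converge because the reduced terms shrink roughly like $2^{-n/6}$: the coefficients decay like $2^{-n/2}$ while the substitution contributes a factor of about $2^{n/3}$.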

I discovered this solution technique some time ago while exploring a similar problem in Wolfram Mathematica. However, I have no idea why any of these manipulations should produce a meaningful answer when scalar and matrix inversion are such different operations. Why does this method work? Is there something deeper going on here than a serendipitous coincidence in series expansion coefficients?

David Zhang
  • @ASCIIAdvocate You're absolutely right. Fixed. – David Zhang Aug 16 '15 at 23:40
  • @DavidZhang: Did you try actually computing the series expansion of $B^{-1} I$? I haven't tried it but I suspect if you do, you would get what is on the right hand side without any miracles along the way. – user541686 Aug 16 '15 at 23:50
  • @Mehrdad I'm not sure I understand what you mean by "the series expansion of $B^{-1}I$." Is this something different than the power series I computed above? – David Zhang Aug 17 '15 at 00:06
  • @DavidZhang: The end result should be the same; what I'm saying is if you carry out the procedure assuming B is a matrix then I don't think anywhere during the procedure you will need a miracle in order to obtain the same result. – user541686 Aug 17 '15 at 00:10
  • @Mehrdad I must be missing something then, as I'm not sure what it means to expand a function mapping matrices to matrices in a power series. In my classes this procedure has only been defined for (and applied to) scalar functions $\mathbb{R} \to \mathbb{R}$ or $\mathbb{C} \to \mathbb{C}$, and the miraculous (to me) part is that it extends to matrices with (seemingly) no modification. – David Zhang Aug 17 '15 at 00:25
  • If the power series converges absolutely for |x| < R and all eigenvalues of the matrix are in the interior of the circle of convergence, then the same power series converges for the matrix, and the usual calculations with power series will work provided that this requirement is respected. As long as all power series use one matrix, or several mutually commuting matrices, there is no difficulty. – ASCII Advocate Aug 17 '15 at 05:57
  • The commutative ring $\mathbb{R}[x] / (x^3 - 2)$ may be relevant here. –  Aug 17 '15 at 06:08
  • When I looked at this question the first thing I thought was "Cayley-Hamilton", but nobody mentioned it. I haven't done matrix algebra in a year, but is there a connection here? – bright-star Aug 17 '15 at 08:12
  • Surely, we all understand that the field is real or complex in the context of the question, but as it stands, the problem statement is wrong if no further assumptions are imposed. E.g. with $A=3$ over $GF(5)$, we have $A^3=2$ but $B = A^2 - 2A + 2=0$. – user1551 Aug 17 '15 at 08:22
  • @user1551 You're absolutely right. Would it suffice to specify that the entries of $A$ live in a field of characteristic $0$? – David Zhang Aug 17 '15 at 16:35
  • Yes, it's true if the field has characteristic 0, but in this case I'm not so sure if the answers already given below apply. – user1551 Aug 18 '15 at 07:34
  • Closely related: [Matrix inverse identity](http://math.stackexchange.com/questions/356406/matrix-inverse-identity) (see Math Gems' answer) and [How would you solve this tantalizing Halmos problem?](http://mathoverflow.net/questions/31595/how-would-you-solve-this-tantalizing-halmos-problem) on MO (see the comments by Richard Stanley). – user1551 Aug 18 '15 at 07:46

8 Answers


The real answer is that the set of $n\times n$ matrices forms a Banach algebra, that is, a Banach space with a multiplication that distributes the right way and is compatible with the norm. In the reals, multiplication is the same as scaling, so the distinction doesn't matter and we don't think about it. But with matrices, scaling and multiplying are different operations. The point is that there is no miracle. Rather, the argument you gave only uses tools from Banach algebras (notably, you didn't use commutativity). So it generalizes nicely.

This kind of trick is used all the time to great effect. One classic example is proving that when $\|A\|<1$ there is an inverse of $1-A$. One takes the argument about geometric series from real analysis, checks that everything works in a Banach algebra, and then you're done.

Zach Stone

Think about how you derive the finite version of the geometric series formula for scalars. You write:

$$x \sum_{n=0}^N x^n = \sum_{n=1}^{N+1} x^n = \sum_{n=0}^N x^n + x^{N+1} - 1.$$

This can be written as $xS=S+x^{N+1}-1$. So you move the $S$ over, and you get $(x-1)S=x^{N+1}-1$. Thus $S=(x-1)^{-1}(x^{N+1}-1)$.

There is only one point in this calculation where you needed to be careful about commutativity of multiplication, and that is in the step where you multiply both sides by $(x-1)^{-1}$. In the above I was careful to write this on the left, because $xS$ originally multiplied $x$ and $S$ with $x$ on the left. Thus, provided we do this one multiplication step on the left, everything we did works when $x$ is a member of any ring with identity such that $x-1$ has a multiplicative inverse.

As a result, if $A-I$ is invertible, then

$$\sum_{n=0}^N A^n = (A-I)^{-1}(A^{N+1}-I).$$

Moreover, if $\| A \| < 1$ (in any operator norm), then the $A^{N+1}$ term decays as $N \to \infty$. As a result, the partial sums are Cauchy, and so if the ring in question is also complete with respect to this norm, you obtain

$$\sum_{n=0}^\infty A^n = (I-A)^{-1}.$$

In particular, in this situation we recover a sufficient condition for invertibility: if $\| A \| < 1$ then $I-A$ is invertible, with inverse given by the series.
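To see the limiting statement in action, here is a small sketch in plain Python. The $2\times 2$ matrix `A` is an arbitrary example chosen so that its maximum absolute row sum is below $1$, and the helper names are made up for illustration. It sums the first 200 powers of $A$ and multiplies the result by $I - A$.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(X, Y):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def neumann_partial_sum(A, num_terms):
    """Return I + A + A^2 + ... + A^(num_terms - 1)."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)  # current power A^k, starting at A^0 = I
    for _ in range(num_terms):
        total = mat_add(total, power)
        power = mat_mul(power, A)
    return total

# Example matrix (an arbitrary choice): its largest absolute row sum is 0.7,
# so ||A|| < 1 in the operator norm induced by the max norm.
A = [[0.2, 0.5], [-0.3, 0.1]]
S = neumann_partial_sum(A, 200)
I2 = identity(2)
I_minus_A = [[I2[i][j] - A[i][j] for j in range(2)] for i in range(2)]
product = mat_mul(I_minus_A, S)  # should be very close to the identity
print(product)
```

By the finite formula, the product equals $I - A^{200}$ exactly, and that correction is far below floating-point precision here.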

  • All $\|A\| < 1$ ensures is that the series above is Cauchy. You also need your ring to be complete in the (norm) induced metric. Otherwise you don't know the series converges. – Zach Stone Aug 17 '15 at 00:58
  • @ZachStone Fair point, let me fix that. (I was really focusing on the algebra rather than the analysis here.) – Ian Aug 17 '15 at 01:23

The matrices commute. The rest is "functional calculus" (also called operator calculus) applied to A.

Think, for example, of how the calculation would look in a simultaneous eigenbasis for $A$ and $B$. Here $B$ is a polynomial in $A$, so any basis that diagonalizes $A$ (or puts it in upper triangular form) does the same for $B$. Then your operations are valid if they are valid when applied to each eigenvalue considered as a number.
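For the problem in the question this eigenvalue-by-eigenvalue check is easy to run. The sketch below is an illustration, not part of the original answer: it takes the three complex cube roots of $2$, which are the only possible eigenvalues when $A^3 = 2I$, and compares the scalar inverse with the proposed quadratic at each one. The function names are hypothetical.

```python
import cmath

# The three complex cube roots of 2: the only possible eigenvalues of A
# when A^3 = 2I.
roots = [2 ** (1 / 3) * cmath.exp(2j * cmath.pi * k / 3) for k in range(3)]

def scalar_inverse(lam):
    """1/B evaluated at the eigenvalue lam, treating lam as a number."""
    return 1 / (lam**2 - 2 * lam + 2)

def candidate(lam):
    """The proposed inverse 2/5 + (3/10) lam + (1/10) lam^2."""
    return 2 / 5 + 3 / 10 * lam + 1 / 10 * lam**2

for lam in roots:
    # The printed difference should be at the level of rounding error.
    print(lam, abs(scalar_inverse(lam) - candidate(lam)))
```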

ASCII Advocate

Many (but not all!) things about scalar functions work with matrices through power series. Everything is easier when $A^n=A$ (I don't remember the name of this property) or the matrix is nilpotent ($A^n=0$). For the more numerically oriented people, I suggest "Computing matrix functions": http://dx.doi.org/10.1017/S0962492910000036

  • Isn't A^n = A idempotency? – bright-star Aug 17 '15 at 08:10
  • @Trevor Alexander That is what I thought, but it is defined as $A^2=A$ in https://en.wikipedia.org/wiki/Idempotent_matrix – Miguel Aug 17 '15 at 11:21
  • @MiguelAtencia $A^n = A \iff A^2 = A$ by a simple induction proof. – wchargin Aug 17 '15 at 16:11
  • @WChargin. Not so. The implication only works one way. I don't know how to post on this site, but just take A to be a 3-by-3 matrix whose minimal polynomial is x cubed minus x. (Rational canonical form will find it easily.) Then its cube will be A, but not its square. (Otherwise, it would satisfy a polynomial of degree 2.) – saulspatz Aug 17 '15 at 17:25
  • @saulspatz: right, of course! Thanks. $A^2 = A \implies A^n = A$ but not the other way around. – wchargin Aug 17 '15 at 17:45

The ordinary rules of arithmetic on $\mathbb R$ often fail to apply to matrix algebra. Think about the non-commutativity of products, the possibility for the product of two non-zero matrices to be the zero matrix, etc. In general, one should be careful and resist the temptation to take tricks like the one you just used too seriously.

However, there are many similarities between the underlying algebraic structures of matrix arithmetic and ordinary arithmetic (indeed, the latter can be conceived of as a special case of the former when all the matrices involved have only one row and column each). Therefore, the usual arithmetic tricks sometimes work, just as in your case. It may thus be fruitful to experiment with abusing the rules of matrix arithmetic using your intuition and then check rigorously afterwards whether the tentative results actually make sense.


It seems that he is doing this the hard way. If you know that $A^3 = 2I$, then any infinite sum of powers of $A$, if it converges, should reduce to a sum of the form $xI + yA + zA^2.$ So it seems reasonable to at least try to solve $$ (xI + yA + zA^2)(A^2 - 2A + 2I) = I$$ for $x, y,$ and $z$.

\begin{align} I &= (xI + yA + zA^2)(2I - 2A + A^2)\\ &= 2xI +(-2x + 2y)A + (x - 2y + 2z)A^2 + (y - 2z)A^3 + zA^4\\ &= 2xI +(-2x + 2y)A + (x - 2y + 2z)A^2 + (2y - 4z)I + 2zA\\ &= (2x + 2y - 4z)I + (-2x + 2y + 2z)A + (x - 2y + 2z)A^2 \end{align}

Solve \begin{align} 2x + 2y - 4z &= 1 \\ -2x + 2y + 2z &= 0 \\ x - 2y + 2z &= 0 \end{align}

and you get $x = \frac 25,\, y = \frac 3{10},\, z = \frac 1{10}$
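The system can also be solved and the result verified exactly with rational arithmetic. This is only a sketch: the test matrix `A` is my own choice (the companion matrix of $t^3 - 2$, which satisfies $A^3 = 2I$), and the helper names are made up for illustration.

```python
from fractions import Fraction

def solve_linear_system(M, rhs):
    """Solve M v = rhs exactly by Gauss-Jordan elimination (M assumed invertible)."""
    n = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(M, rhs)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

def mat_mul(X, Y):
    """Product of two square matrices given as nested lists."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combo(coeffs, mats):
    """Return sum_i coeffs[i] * mats[i] for same-sized square matrices."""
    n = len(mats[0])
    return [[sum(c * M[i][j] for c, M in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

# Coefficient rows of the three equations in x, y, z.
x, y, z = solve_linear_system([[2, 2, -4], [-2, 2, 2], [1, -2, 2]], [1, 0, 0])

# Test matrix (an assumption for illustration): the companion matrix of
# t^3 - 2, which satisfies A^3 = 2I.
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[0, 0, 2], [1, 0, 0], [0, 1, 0]]
A2 = mat_mul(A, A)
B = combo([2, -2, 1], [I3, A, A2])
B_inv = combo([x, y, z], [I3, A, A2])
print(x, y, z)
print(mat_mul(B, B_inv) == I3)
```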

Steven Alexis Gregory

The basic rule of thumb for such problems is that as long as all of the components of your equations are invertible and any multiplications used will commute, things will work just fine (this is the naive way of making the Banach Algebra comment). Since all along we knew we were going to find out that $B$ was invertible (and since powers of $A$ are clearly invertible), you're safe to go ahead.

However, these sorts of problems often have easier solutions. Hint: $B = A^2 - 2A + 2I = A^3 + A^2 - 2A = A(A^2 + A - 2I) = A(A+2I)(A-I)$. Now finish with applications of the Neumann Lemma.
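One way to carry out the hint without any infinite series (a sketch that substitutes the sum and difference of cubes identities for the Neumann lemma) is to invert each factor explicitly using $A^3 = 2I$:

\begin{align} (A - I)(A^2 + A + I) &= A^3 - I = I, \\ (A + 2I)(A^2 - 2A + 4I) &= A^3 + 8I = 10I, \\ A \cdot A^2 &= 2I. \end{align}

All factors are polynomials in $A$ and therefore commute, so $$ B^{-1} = \tfrac{1}{20}(A^2 + A + I)(A^2 - 2A + 4I)A^2 = \tfrac{2}{5}I + \tfrac{3}{10}A + \tfrac{1}{10}A^2, $$ where the last step expands the product and reduces with $A^3 = 2I$. This agrees with the formula in the question.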

When these types of problems also give you the information that $A$ is, say, symmetric or normal, or has such-and-such eigenvalues, you should look to abuse the spectral theorem to get very quick proofs: prove the claim for diagonal matrices first, then deduce the general case.

Exercise: If $A$ is real normal s.t. $A^3 = A^2 +A - I$, prove that $A$ is invertible - in fact, $A^2 = I$.

John Samples

If $A$ is any element in a $\mathbb{Q}$-algebra $\mathfrak{A}$ satisfying the equation $x^3-2\cdot1_{\mathfrak{A}}=0$, then $\mathbb{Q}[A]\cong\mathbb{Q}[\sqrt[3]{2}]$. Now you can use a standard technique for finding the inverse of an element in the field $\mathbb{Q}[\sqrt[3]{2}]$ in order to get the inverse of $B$ (it is clear that $\gcd(x^3-2,x^2-2x+2)=1$, so that there exist $a(x),b(x)$ such that $1=a(x)(x^3-2)+b(x)(x^2-2x+2)$. Then $1=b(A)(A^2-2A+2I)$, so $b(A)$ is the inverse of $B$).
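The polynomial $b(x)$ can be found with the extended Euclidean algorithm over $\mathbb{Q}[x]$. Below is a minimal sketch using exact fractions. Polynomials are stored as coefficient lists with the constant term first, and all function names are my own rather than anything from a library.

```python
from fractions import Fraction

def trim(p):
    """Drop trailing zero coefficients (lists are constant-term first)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q, sign=1):
    """Return p + sign * q."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return trim([a + sign * b for a, b in zip(p, q)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def poly_divmod(p, q):
    """Return (quotient, remainder) of p divided by a nonzero polynomial q."""
    quotient, remainder = [], trim(p)
    while len(remainder) >= len(q):
        shift = len(remainder) - len(q)
        # Monomial that cancels the leading term of the remainder.
        term = [Fraction(0)] * shift + [remainder[-1] / q[-1]]
        quotient = poly_add(quotient, term)
        remainder = poly_add(remainder, poly_mul(term, q), sign=-1)
    return quotient, remainder

def extended_gcd(f, g):
    """Return (d, a, b) with a*f + b*g = d, where d is a gcd of f and g."""
    r0 = trim([Fraction(c) for c in f])
    r1 = trim([Fraction(c) for c in g])
    a0, a1 = [Fraction(1)], []
    b0, b1 = [], [Fraction(1)]
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        a0, a1 = a1, poly_add(a0, poly_mul(q, a1), sign=-1)
        b0, b1 = b1, poly_add(b0, poly_mul(q, b1), sign=-1)
    return r0, a0, b0

f = [-2, 0, 0, 1]  # x^3 - 2
g = [2, -2, 1]     # x^2 - 2x + 2
d, a, b = extended_gcd(f, g)
# d is a nonzero constant because the gcd is 1; rescale so that a*f + b*g = 1.
b_normalized = [c / d[0] for c in b]
print(b_normalized)
```

The rescaled `b_normalized` holds the coefficients of $b(x)$, constant term first, so $b(A)$ can be read off directly as a polynomial in $A$.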
