If $A$ and $B$ are square matrices such that $AB = I$, where $I$ is the identity matrix, show that $BA = I$.

I do not understand anything more than the following.

  1. Elementary row operations.
  2. Linear dependence.
  3. Row reduced forms and their relations with the original matrix.

If the entries of the matrix are not from a mathematical structure which supports commutativity, what can we say about this problem?

P.S.: Please avoid using the transpose and/or inverse of a matrix.

Rodrigo de Azevedo
  • See also [Dedekind finite rings.](http://math.stackexchange.com/q/3852/242) – Bill Dubuque Apr 29 '12 at 20:41
  • Suppose $U$ and $V$ are sets and $T:V \to W$ is a function. Then $T$ has a left inverse $\iff$ $T$ is one-to-one, and $T$ has a right inverse $\iff$ $T$ is onto. Suppose now that $V$ is a finite dimensional vector space, and $T:V \to V$ is a linear transformation. Then $T$ has a left inverse iff $T$ is one-to-one iff $T$ is onto iff $T$ has a right inverse. From here, it's easy to show that any left inverse must also be a right inverse. (If $LT = I$ and $TR = I$, then $LTR = L \implies R = L$.) – littleO Jun 19 '14 at 23:24
  • @littleO: Is your $U$ meant $W$? – linear_combinatori_probabi Oct 25 '18 at 14:27
  • @BillDubuque what's up with the link? It's linking back to this question. – SOFe Feb 27 '19 at 02:49
  • @SOFe Alas, after 7 years, I don't recall what link was actually intended there. But hopefully the name will aid in keyword searches. – Bill Dubuque Feb 27 '19 at 03:02
  • A comment to [Why are Dedekind-finite rings called so?](https://math.stackexchange.com/questions/33491/why-are-dedekind-finite-rings-called-so) has a link back to this question. Perhaps that was the link intended in the first comment here. – robjohn Jun 24 '20 at 06:47

33 Answers


Dilawar says in 2. that he knows linear dependence! So I will give a proof, similar to that of TheMachineCharmer, which uses linear independence.

Suppose each matrix is $n$ by $n$. We consider our matrices to all be acting on some $n$-dimensional vector space with a chosen basis (hence isomorphism between linear transformations and $n$ by $n$ matrices).

Then $AB$ has range equal to the full space, since $AB=I$. Thus the range of $B$ must also have dimension $n$. For if it did not, then a set of $n-1$ vectors would span the range of $B$, so the range of $AB$, which is the image under $A$ of the range of $B$, would also be spanned by a set of $n-1$ vectors, hence would have dimension less than $n$.

Now note that $B=BI=B(AB)=(BA)B$. By the distributive law, $(I-BA)B=0$. Thus, since $B$ has full range, the matrix $I-BA$ gives $0$ on all vectors. But this means that it must be the $0$ matrix, so $I=BA$.
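As a numerical sanity check (not a proof), the identity chain above is easy to verify in code. The matrix below is a hypothetical example; we compute a left inverse $A$ of $B$ by solving $AB = I$ (without calling a matrix-inverse routine) and confirm that $BA = I$ and $(I-BA)B = 0$:

```python
import numpy as np

B = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
n = B.shape[0]

# Solve A B = I for A: transposing gives B^T A^T = I, a standard linear solve.
A = np.linalg.solve(B.T, np.eye(n)).T

print(np.allclose(A @ B, np.eye(n)))                            # True: AB = I
print(np.allclose(B @ A, np.eye(n)))                            # True: BA = I follows
print(np.allclose((np.eye(n) - B @ A) @ B, np.zeros((n, n))))   # True: (I-BA)B = 0
```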

  • You use the fact that an $n$-dimensional subspace of an $n$-dimensional vector space coincides with the vector space. This is not clear at all from the definitions, is false for infinite cardinals $n$, and again uses some chain argument or similar arguments. – Martin Brandenburg Sep 02 '10 at 08:26
  • @MB: I think this argument is okay. It is very standard in intro linear algebra classes to prove that in a finite-dimensional vector space every spanning set contains a basis and every LI set can be enlarged to a basis. Moreover the proofs are algorithmic and the algorithm is always the same: row reduction. – Pete L. Clark Sep 02 '10 at 08:38
  • @Martin: I'm assuming $n$ is finite - maybe I should have been more explicit. I was assuming that the fact about $n$ dimensional subspaces coinciding with the original space was elementary enough to fit under his requirements. – Davidac897 Sep 02 '10 at 08:45
  • Besides, Martin's proof also uses this fact (how else can you show that the chain is stationary in a finite-dimensional vector space?). – Davidac897 Sep 02 '10 at 09:16
  • Yes I also use this or related statements and we have to use something like that; I just wanted to make precise for the readers what you are using. – Martin Brandenburg Sep 02 '10 at 11:14
  • Thank you David. This is exactly what I've been looking for. @MB : I really appreciate the way you have formulated the solutions. – Dilawar Sep 02 '10 at 11:56
  • Why does $AB$ having full rank imply that $B$ has full rank? I would say that if $AB=I$ then $B$ is one-to-one and $A$ has full rank (if $f$ is invertible on the right, then $f$ is surjective, and if $f$ is invertible on the left, then $f$ is injective). @Davidac897 – Peter Jul 26 '18 at 21:30
  • @Peter The subsequent sentence explains how the implication works. – Davidac897 Aug 05 '18 at 14:53

So you want to find a proof of this well-known fact, which avoids the usual "indirect" proofs? I also pondered this some time ago.

We have the following general assertion:

Let $M$ be a finite-dimensional $K$-algebra, and let $a,b \in M$ be such that $ab=1$. Then $ba=1$. [For example, $M$ could be the algebra of $n \times n$ matrices.]

Proof: The descending sequence of subspaces $\cdots \subseteq b^{k+1} M \subseteq b^k M \subseteq \cdots \subseteq M$ must become stationary, since $M$ is finite-dimensional. Thus there are some $k$ and some $c \in M$ such that $b^k = b^{k+1} c$. Now multiply by $a^k$ on the left to get $1=bc$. Then $ba = ba \cdot 1 = ba \cdot bc = b \cdot 1 \cdot c = bc = 1$. QED

No commutativity condition is needed. The proof shows, more generally, that the claim holds in every left- or right-Artinian ring $M$.

Remark that we needed, in an essential way, some finiteness condition. There is no purely algebraic manipulation with $a,b$ that shows $ab = 1 \Rightarrow ba=1$ (and shift operators provide a concrete counterexample in infinite dimension). Every proof uses some finiteness argument of the type above. For example, when you want to argue with linear maps, you have to know that every subspace of a finite-dimensional(!) vector space of the same dimension is actually the whole vector space, for which there is also no "direct" proof. I doubt that there is one.

PS. See here for a proof of $AB=1 \Rightarrow BA=1$ for square matrices over a commutative ring.

Martin Brandenburg
  • Thanks Martin. Though I had to get accustomed to some of the words you used, this certainly improved my understanding, and I have one more approach in my list. – Dilawar Sep 02 '10 at 07:41
  • @MB: This is a very nice answer. A couple of tips: (i) it took me several readings to gather that "quadratic matrix" = "square matrix". This is included in the question, so it seems best to just assume it. (ii) I think you have some $A$'s which should be $a$'s (or vice versa). (iii) Since the OP asked for an elementary answer, please consider rewriting it to first treat exactly the case s/he asked for with no "fancy language" (ideals, Artinian algebras, etc.). Afterwards you can comment on how general your argument can be made. – Pete L. Clark Sep 02 '10 at 08:23
  • "The claim is also true for non-square matrices". No it is not. – Laurent Lessard Sep 02 '10 at 08:41
  • Thanks for the suggestions. I've edited my answer. – Martin Brandenburg Sep 02 '10 at 12:51
  • Compare to my proof here. – Bill Dubuque Sep 02 '10 at 14:11
  • @Pete: these are folklore results, e.g. see the references in my post to papers of Vasconcelos et al from the seventies (but the results were surely known before then). – Bill Dubuque Sep 02 '10 at 14:18
  • But you are still implicitly using some nontrivial statements from linear algebra, such as: a dimension-$n$ subspace of a dimension-$n$ space is the full space. Otherwise, how do you justify finiteness of the chain? – Peter Franek Aug 23 '20 at 15:18
  • I use that the dimension is well-defined and that a proper subspace of a vector space has a smaller dimension. You call this non-trivial? – Martin Brandenburg Jul 02 '21 at 21:13

If $\rm\,B\,$ is a linear map on a finite dimensional vector space $\rm\, V\,$ over field $\rm\,K,\,$ then easily by finite dimensionality (cf. Note below) there is a polynomial $\rm\,0\ne p(x)\in K[x]\;$ with $\rm\,p(B) = 0.\,$ W.l.o.g.$\rm\,p(0) \ne 0\,$ by canceling any factors of $\rm\,B\;$ from $\rm\;p(B)\;$ by left-multiplying by $\rm A,\,$ using $\rm\, AB = 1.$

Notice $\rm\ AB=1 \, \Rightarrow\, (BA-1)\, B^n =\, 0\;$ for $\,\rm\;n>0\;$

so by linearity $\rm\, 0 \,=\, (BA-1)\ p(B)\, =\, (BA-1)\ p(0) \;\Rightarrow\; BA=1 \quad\quad$ QED

This is essentially a special case of computing inverses by the Euclidean algorithm - see my Apr 13 1999 sci.math post on Google or mathforum.

Note $ $ The existence of $\rm\;p(x)\;$ follows simply from the fact that $\rm\,V\;$ finite-dimensional implies the same for the vector space $\rm\,L(V)\,$ of linear maps on $\rm\,V\,$ (indeed if $\rm\,V\;$ has dimension $\rm n$ then a linear map is uniquely determined by its matrix of $\,\rm n^2\,$ coefficients). So $\rm\, 1,\, B,\, B^2,\, B^3,\,\cdots\;$ are $\rm\,K$-linearly dependent in $\rm\, L(V)$ which yields the sought nonzero polynomial annihilating $\rm\,B.$
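The Note's linear-dependence claim is easy to see numerically: flattening the $n^2+1$ matrices $1, B, B^2, \ldots, B^{n^2}$ into vectors of length $n^2$, they cannot be linearly independent. A sketch with a hypothetical $2\times 2$ matrix:

```python
import numpy as np

B = np.array([[0., 1.],
              [2., 3.]])
n = B.shape[0]

# Flatten I, B, B^2, ..., B^{n^2} into vectors of length n^2 = 4.
powers = np.stack([np.linalg.matrix_power(B, k).ravel() for k in range(n * n + 1)])

# Five vectors in a 4-dimensional space must be dependent; here the rank is
# already 2, since B satisfies its degree-2 characteristic polynomial.
print(np.linalg.matrix_rank(powers))   # 2
```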

Bill Dubuque
  • Nice proof, but why do you write a proof of BA=I⇒AB=I when the question asks a proof of AB=I⇒BA=I? (Of course this difference is inessential.) – Tsuyoshi Ito Sep 03 '10 at 10:21
  • I think it's the best answer so far. It only uses the fact that a system of linear equations with more unknowns than equations has a nontrivial solution. Alternative wording: There is a polynomial q such that q(B) = 1, q(0) = 0. Then BA-1 = (BA-1) q(B) = 0. – Pierre-Yves Gaillard Sep 03 '10 at 14:18
  • Thanks for the comments. I swapped A,B to match the OP's notation. My posts here are excerpts of my linked sci.math posts (which had A,B swapped). They're part of a handful of posts that I composed to provide different perspectives on this FAQ. – Bill Dubuque Sep 03 '10 at 14:56
  • The Google Groups link no longer works, it seems; here's a [replacement](https://groups.google.com/d/msg/sci.math/sETwIb1Ub7Y/q0X6yugG7voJ). –  Feb 17 '14 at 17:58
  • The proof shows: If $A$ is a $K$-algebra containing two elements $a,b$ such that $ab=1$ and $b$ is algebraic over $K$, then $ba=1$. – Martin Brandenburg Aug 05 '14 at 20:44
  • @Martin Right. In an attempt to make the answer as accessible as possible, I purposely avoided any further abstraction. – Bill Dubuque Aug 05 '14 at 21:07
  • I know this an old post, but maybe somebody can answer the following. We can only say that $(BA-1)\ p(0) = 0 \;\Rightarrow\; BA=1$ if we know that $p(0)$ has full row rank (i.e. $p(0)$ is not a zero-divisor from the right). However, all we know about $p(0)$ per the setup is that $p(0) \neq 0$. Is this a flaw in the proof? If so, is there a way to fix the proof? – Ben Grossmann Mar 25 '20 at 18:19
  • @BenGrossmann $p\in K[x]$ so $p(0_K)\in K$ and $p(0_{\operatorname{End}(V)})=p(0_K)\cdot \operatorname{id}_V$. – Christoph Nov 17 '20 at 09:55
  • @Christoph Not sure how I had missed that, thanks. – Ben Grossmann Nov 17 '20 at 14:04
  • Dear @BillDubuque, it seems that we really only need the lowest degree coefficient of the polynomial to be cancellable. If so, does the assertion hold for matrices over any (commutative) domain (by taking the characteristic polynomial and using Cayley-Hamilton)? – Arrow Oct 08 '21 at 22:38

Let $x_1, x_2, \dots, x_n$ be a basis of the space. First we prove that $Bx_1, Bx_2, \dots, Bx_n$ is also a basis. To do this we need to prove that these vectors are linearly independent. Suppose this is not true. Then there exist numbers $c_1, c_2, \dots, c_n$, not all equal to zero, such that $$c_1 Bx_1 + c_2 Bx_2 + \cdots + c_n B x_n = 0.$$ Multiplying by $A$ from the left, we get $$c_1 ABx_1 + c_2 ABx_2 + \cdots + c_n ABx_n = 0,$$ hence $$c_1 x_1 + c_2 x_2 + \cdots + c_n x_n = 0,$$ and so the vectors $x_1, x_2, \dots, x_n$ are linearly dependent. This contradicts the assumption that the vectors $x_i$ form a basis.

Since $Bx_1, Bx_2, \dots, Bx_n$ is a basis, every vector $y$ can be represented as a linear combination of those vectors. This means that for any vector $y$ there exists some vector $x$ such that $Bx = y$.

Now we want to prove that $BA = I$. It is the same as to prove that for any vector $y$ we have $BAy = y$. Now given any vector $y$ we can find $x$ such that $Bx = y$. Hence $$BAy = BABx = Bx = y$$ by associativity of matrix multiplication.

  • Very nice!!! I just suggest that you replace "from the leftwe get" with "from the left we get", and "and so vectors" with "and so the vectors". – Pierre-Yves Gaillard Sep 02 '10 at 16:37
  • @Pierre-Yves Gaillard: Corrected, thank you. – falagar Sep 02 '10 at 16:39
  • Thanks falagar! I almost got it in the first read itself. :-) I think you missed something to point out in the first paragraph after the last line. I think you need to add one more line stating 'A contradiction.'. – Dilawar Sep 02 '10 at 17:10
  • That's essentially the proof that I gave - translated into the language of coordinates (bases). However, adding such extraneous information as bases only serves to obfuscate the simple essence of the manner, namely: **injective maps cannot decrease heights** (here = dimension = length of max subspace chain). As I stress in my post, the proof is a very intuitive **one-line pigeonhole squeeze** when viewed this way. – Bill Dubuque Sep 02 '10 at 18:23
  • @Bill, the thing is, «the essence of the matter» is not always, and is not for everybody, the best way to see something; obfuscation is in the eye of the beholder. I enjoyed reading through each of your proofs, but I find it quite natural that they be classified in the may-be-by-the-end-of-this-semester-I'll-be-able-to-understand-it category! – Mariano Suárez-Álvarez Sep 08 '10 at 15:03
  • @Mariano. But it is important to point out the essence of the matter since rarely do textbook authors do so. Even if a student is not able to completely grasp the essence now, the seed will have been planted to make the "Aha!" moment germinate when the time is ripe. See my latest post here where I have stressed more clearly the innate essence, namely that 1-1 not onto maps lead to infinite descending subspace chains, i.e. **Dedekind infinite => infinite dimension**. This can easily be comprehended by a novice if explained very carefully. – Bill Dubuque Sep 08 '10 at 15:29
  • I'm presenting this in class soon, and this is basically what I plan to say. But the first part can be phrased just in terms of row reduction (which makes it fit earlier in a course). Consider the equation Bx=0. Multiplying by A, we have ABx=0, i.e. x = 0. So B has linearly independent columns, and its reduced echelon form is the identity matrix. That means the equation Bx=z always has a solution, and now you can continue as above. – Dan Ramras Sep 22 '10 at 08:13

Since there seem to be some lingering beliefs that an approach which does not make explicit use of the finite-dimensionality could be valid, here is a familiar counterexample in the infinite dimensional case.

Let $V = \mathbb{R}[t]$ be the vector space of real polynomial functions. Let $B: V \rightarrow V$ be differentiation: $p \mapsto p'$, and let $A: V \rightarrow V$ be anti-differentiation with constant term zero: $\sum_{n=0}^{\infty} a_n t^n \mapsto \sum_{n=0}^{\infty} \frac{a_n}{n+1} t^{n+1}$.

These are both $\mathbb{R}$-linear maps and $B \circ A$ is the identity, but $A \circ B$ is not (the constant term is lost).
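This counterexample is easy to play with in code. Representing a polynomial by its list of coefficients (index $k$ holding the coefficient of $t^k$), a small sketch of the two maps:

```python
def B(p):
    """Differentiation: the coefficient of t^k in p' is (k+1) * p[k+1]."""
    return [k * p[k] for k in range(1, len(p))] or [0]

def A(p):
    """Anti-differentiation with constant term zero."""
    return [0] + [p[k] / (k + 1) for k in range(len(p))]

p = [5, 0, 3]              # the polynomial 5 + 3t^2
print(B(A(p)))             # [5.0, 0.0, 3.0] -- B∘A recovers p
print(A(B(p)))             # [0, 0.0, 3.0]  -- A∘B loses the constant term
```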

Pete L. Clark
  • 1+. Then I also add the following example, because I think it is easier (but very similar): Consider the vector space of infinite sequences (of rational numbers, say), and the linear maps $A : (a_0,a_1,...) \to (0,a_0,a_1,...), B : (a_0,a_1,...) \to (a_1,a_2,...)$. Then $BA=1$, but not $AB=1$ since $AB$ maps $(a,0,...) \mapsto 0$. You may also restrict to finite sequences (without fixed length) and then cook up infinite matrices satisfying the properties. – Martin Brandenburg Sep 02 '10 at 08:19
  • @MB: Right, it's similar -- and indeed, also has a polynomial interpretation, as I'm sure you know. – Pete L. Clark Sep 02 '10 at 08:24
  • see my post here which makes obvious the crucial role of finite dimensionality. See esp. the linked sci.math post. – Bill Dubuque Sep 02 '10 at 14:24
  • @MB: Polynomials give a nice concrete model: Q[x] where the right shift is R = x = multiplication by x, and the left shift is L = (f(x)-f(0))/x. Then LR = I but RL f(x) = f(x)-f(0), so RL != I. For more see my old post http://bit.ly/Shift1-1notOnto – Bill Dubuque Sep 02 '10 at 19:24
  • The prior link is now stale. Here are working versions: [mathforum](http://mathforum.org/kb/thread.jspa?forumID=13&threadID=2113688&messageID=7172675) or [Google groups](https://groups.google.com/d/msg/sci.math/XoyYJWrpXew/KaZcp4FqZ18J) – Bill Dubuque Aug 01 '17 at 13:22

It follows by the pigeonhole principle. Here's an excerpt from my Dec 11 2007 sci.math post:

Recall (proof below) $\rm\; AB \:\:=\:\:\: I \:\;\Rightarrow\; BA \:\:=\:\: I\;\;\:$ easily reduces to:

THEOREM $\;$ $\rm\;\;B\;$ injective $\rm\;\Rightarrow\:\: B\;$ surjective, $\:$ for linear $\rm\:B\:$ on a finite dim vector space $\rm\:V$

Proof $\rm\ \ \ B\;$ injective $\rm\;\Rightarrow\ B\;$ preserves injections: $\rm\;R < S \;\Rightarrow\; BR < BS\;$
Hence for $\rm\;\;\; R \;\: < \;\; S < \cdots < \; V\;\;$ a chain of maximum length (= dim $\rm V\:$)
its image $\rm\;BR < BS < \cdots < BV \le V\;\;\:\;$ would have length greater
if $\rm\ BV < V\:,\: $ hence, instead $\rm\:\:\:\ \ BV = V\:,\;\:$ i.e. $\rm\; B \;$ is surjective. $\;$ QED

Notice how this form of proof dramatically emphasizes the essence of the matter, namely that
injective maps cannot decrease heights (here = dimension = length of max subspace chain).

Below is said standard reduction to xxxjective form. See said sci.math post for much more,
including references to folklore generalizations, e.g. work of Armendariz and Vasconcelos in the seventies.

First, notice that $\rm\;\;\ AB = I \;\Rightarrow\: B\:$ injective, since $\rm\;A\;$ times $\rm\;B\:x = B\:y\;$ yields $\rm\;x = y\:,\:$ and

$\rm\ B\ $ surjective $\rm\ \Rightarrow\ BA = I \;\;$ since for all $\rm\;x\;$ exists $\rm\; y : \;\; x = B\:y = B(AB)\:y = BA \: x$

Combining them: $\rm\: AB = I \;\Rightarrow\: B\:$ injective $\rm\;\Rightarrow\; B\:$ surjective $\rm\;\Rightarrow\: BA = I$

Bill Dubuque
  • May be by the end of this semester, I'll be able to understand it. I have added it into 'things to understand' list. Thanks! – Dilawar Sep 02 '10 at 14:49
  • @Dilawar: I've reformulated this proof to make it clearer. Please see my other post here which shows that it essentially reduces to Hilbert's infinite hotel on subspaces, viz. http://math.stackexchange.com/questions/4252 – Bill Dubuque Sep 08 '10 at 15:35
  • This potentially works over any commutative ring, not just a field, right? Explicitly, I think the following holds: Suppose $R$ is a commutative ring and that $V$ is an $R$-module that has a finite chain of subspaces of maximum length among all chains of subspaces. Then left-injective elements of $\mathrm{End}(V)$ should also be left-surjective. – goblin GONE Sep 22 '15 at 06:59

Motivation $\ $ If vector space $\rm V\:$ has a $1$-$1$ map $\rm\,B\,$ that's not onto, i.e. $\rm V > BV\:,\:$ then repeatedly applying $\rm B\:$ yields an infinite descending chain of subspaces $\rm V > \: BV > \;\cdots\: > B^i V > \cdots$.

Theorem $\rm\:\ AB = 1 \;\Rightarrow\; BA=1\:$ for linear maps $\rm\:A,B\:$ on a finite dimensional vector space $\rm\: V $

Proof $\rm\;\; V > BV, \;$ times $\rm\; B^i\:\Rightarrow\: B^i V > B^{i+1} V \;$ (else $\rm\; B^i V = B^{i+1} V, \;$ times $\rm\; A^i \Rightarrow V = BV)$

$\rm\ \ \ \ \Rightarrow\rm\;\;\; V > BV > B^2 V > \cdots \:$ is an $\infty$ descending chain $\rm\; \Rightarrow\; dim\: V = \infty\,\:$ contra hypothesis.

Hence $\rm\ \ \ V = BV \;\Rightarrow\; (BA\!-\!1)V = (BA\!-\!1)BV = B(AB\!-\!1)V = 0 \quad$ QED

Remark $\;\;$ Hence vector space $\rm\:V\;$ has infinite dimension $\rm\;\iff V\:$ is Dedekind infinite, i.e. $\rm\:V\:$ has an isomorphic proper subspace $\rm BV,\:$ viz. the theorem proves $(\Leftarrow)$ and the converse follows by taking $\rm B\:$ to be a basis shift map $\rm\; (v_1,v_2,v_3,\cdots\:)\:\to\: (0,v_1,v_2,v_3,\cdots\:)\:,\;$ i.e. said simply, a vector space is infinite iff it has a subspace chain model of Hilbert's infinite hotel. $\:$ That is the simple essence of the matter.
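The basis shift map in the Remark can be sketched on tuples (finite stand-ins for the infinite sequences); the point is that the right shift is $1$-$1$ but not onto:

```python
def R(v):
    """Basis (right) shift: (v1, v2, ...) -> (0, v1, v2, ...)."""
    return (0,) + v

def L(v):
    """Left shift: (v1, v2, ...) -> (v2, v3, ...)."""
    return v[1:]

v = (1, 2, 3)
print(L(R(v)))   # (1, 2, 3): L∘R is the identity
print(R(L(v)))   # (0, 2, 3): R∘L is not -- R is 1-1 but not onto
```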

Bill Dubuque
  • @Dilawar / readers: This is a reformulation of my prior pigeonhole proof intended to make clearer the essence of the matter. Please feel free to ask questions if anything is not clear. – Bill Dubuque Sep 08 '10 at 15:39

I prefer to think in terms of linear operators rather than matrices. A function has a right inverse iff it is surjective and it has a left inverse iff it is injective. For a linear operator, this means that having a right inverse is equivalent to having range equal to the entire space and having a left inverse is equivalent to having trivial kernel. For a linear operator on a finite-dimensional space, the dimension of its kernel + the dimension of its range = the dimension of the entire space, so this does the job.


For a more elementary treatment ...

Fact. If the rows of $A$ are linearly dependent, then the rows of $A B$ are linearly dependent.

Proof of fact. Consider the $3 \times 3$ case, where the linearly-dependent rows of $A$ are $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3 = h \mathbf{a}_1 + k \mathbf{a}_2$ (for some scalars $h$ and $k$):

$$A = \begin{bmatrix}\mathbf{a}_1 \\ \mathbf{a}_2 \\ h\mathbf{a}_1 + k\mathbf{a}_2\end{bmatrix} = \begin{bmatrix}p & q & r \\ s & t & u \\ hp + ks & hq + kt & hr + ku \end{bmatrix}$$

Writing $\mathbf{b}_1$, $\mathbf{b}_2$, and $\mathbf{b}_3$ for the rows of $B$, we have

$$A B = \begin{bmatrix}p & q & r \\ s & t & u \\ hp + ks & hq + kt & hr + ku \end{bmatrix} \begin{bmatrix}\mathbf{b}_1 \\ \mathbf{b}_2 \\ \mathbf{b}_3\end{bmatrix} = \begin{bmatrix}p \mathbf{b}_1+ q\mathbf{b}_2 + r\mathbf{b}_3 \\ s\mathbf{b}_1 + t\mathbf{b}_2 + u\mathbf{b}_3 \\ (hp + ks)\mathbf{b}_1 + (hq + kt)\mathbf{b}_2 + (hr + ku)\mathbf{b}_3 \end{bmatrix}$$

$$= \begin{bmatrix}p \mathbf{b}_1+ q\mathbf{b}_2 + r\mathbf{b}_3 \\ s\mathbf{b}_1 + t\mathbf{b}_2 + u\mathbf{b}_3 \\ h(p\mathbf{b_1}+q\mathbf{b}_2+r\mathbf{b}_3) + k(s\mathbf{b}_1 + t\mathbf{b}_2 + u\mathbf{b}_3) \end{bmatrix}$$

Generally, the linear dependence of the rows of $A$ carries over to the rows of the product, proving our Fact. (This reasoning actually shows the more-precise fact that $\operatorname{rank}(AB)\le \operatorname{rank}(A)$.)

We can restate the Fact this way:

Re-Fact. If the rows of $AB$ are linearly independent, then the rows of $A$ are linearly independent.

To your question: If $A B = I$, then (by the Re-Fact) the rows of $A$ must be linearly independent. This implies that $A$ can be row-reduced to a diagonal matrix with no zero entries on that diagonal: the row-reduced form of $A$ must be the Identity matrix.

Note that row-reduction is actually an application of matrix multiplication. (You can see this in the equations above, where (left-)multiplying $B$ by $A$ combined the rows of $B$ according to the entries in the rows of $A$.) This means that, if $R$ is the result of some row combinations of $A$, then there exists a matrix $C$ that "performed" the combinations:

$$C A = R$$

If (as in the case of your problem) we have determined that $A$ can be row-reduced all the way down to the Identity matrix, then the corresponding $C$ matrix must be a (the) left-inverse of $A$:

$$C A = I$$

It's then straightforward to show that left and right inverses of $A$ must match. This has been shown in other answers, but for completeness ...

$$A B = I \;\; \to \;\; C (A B) = C \;\; \to \;\; (C A) B = C \;\; \to \;\; I B = C \;\; \to \;\; B = C$$

Once you start thinking (ahem) "outside the box (of numbers)" to interpret matrices as linear transformations of vectors and such, you can interpret this result in terms of mapping kernels and injectivity-vs-surjectivity and all the kinds of sophisticated things other answers are suggesting. Nevertheless, it's worth noting that this problem is solvable within the realm of matrix multiplication, plain and simple.
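The row-reduction mechanism above can be sketched in plain Python: row-reduce the augmented matrix $[A \mid I]$, and when $A$ reduces to the identity, the right half records the matrix $C$ that "performed" the combinations, so $CA = I$. This is a minimal sketch under the assumption that $A$ does row-reduce to $I$; `left_inverse` is a hypothetical helper name, and `Fraction` keeps the arithmetic exact:

```python
from fractions import Fraction

def left_inverse(A):
    """Row-reduce [A | I]; when A reduces to I, the right half is C with C A = I."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]                       # row switching
        M[col] = [x / M[col][col] for x in M[col]]                # row multiplication
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]  # row addition
    return [row[n:] for row in M]

A = [[2, 1], [5, 3]]
C = left_inverse(A)
CA = [[sum(C[i][k] * A[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
print(CA == [[1, 0], [0, 1]])   # True: C A = I
```

Each loop body is literally one of the three elementary row operations from the proof, applied to the augmented matrix.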

  • +1, and to add: Row reduction/Gaussian elimination/LU decomposition is just a left multiplication of a matrix by a sequence of so-called "Gauss transforms", which are low-rank corrections to the identity matrix. Nothing sledgehammer-y about it! – J. M. ain't a mathematician Sep 03 '10 at 02:12

Coincidentally, a totally unrelated MathSciNet search turned up this article, which gives a result along the lines of (but slightly stronger than) the one in Martin Brandenburg's answer.

In particular:

Theorem (Hill, 1967): Let $R$ be a ring satisfying the ascending chain condition on right ideals. Then:
a) For any $n \in \mathbb{Z}^+$, the matrix ring $M_n(R)$ also satisfies ACC on right ideals.
b) If $a,b \in R$ are such that $ab = 1$, then also $ba = 1$.

Pete L. Clark

Disclaimer: The following proof was written by me, but the main idea was given to me by a friend in some discussion, and he probably proved it too.

The main idea is that this claim is true when $A$ is an elementary row matrix, and we can induct on the number of row operations needed to be done on $A$ in order to reduce it.

  • Some background:

    An elementary row-operation matrix $E$ corresponds to one of three operations: row addition, row multiplication and row switching. Every $E$ has a matrix $E'$ which is another row-operation matrix of the same type, such that $EE'=E'E=I$ (you don't need the term "inverse of a matrix" to believe this - this "inverse" $E'$ is described here and you can verify it yourself).

    By performing elementary row operations on a matrix $A$, one gets the following equality: $E_{k} E_{k-1} \cdots E_1 A = C$, where $C$ is in row reduced form and $E_i$ are elementary row matrices.

Claim 1: $A$ can be written as $E_{k} \cdots E_{1}C$, where $E_i$ are some elementary row matrices (not necessarily the same matrices from before). Proof: Multiply by $E'_1 E'_2 \cdots E'_k$ from the left.

Claim 2: If $AB=I$, then this $C$ is the identity. Proof: Using determinants this is fairly easy. The condition $AB=I$ ensures $\det(C) \neq 0$ by multiplicativity of the determinant, and since $C$ is upper triangular with every leading coefficient being 1 (the only nonzero entry in its column), the determinant is non-zero iff $C=I$. This can also be proved without determinants, but it is not the main focus of the proof (it is an important claim nonetheless).

So we showed that $A=E_{k} E_{k-1} \cdots E_{1}$.

Claim 3: If $k=0$ or $1$ ($k$ is the number of row operations required to reduce $A$), then $AB=I \implies BA=I$. Proof: If $k=0$, $A=I \implies B=I \implies BA=I$. If $k=1$, $A$ is an elementary row matrix: $A=E$, so $AB=EB=I \implies B = E'EB = E' \implies BA=E'E=I$.

Claim 4: $AB = I \implies BA = I$. I will induct on $k$. Let's assume this is true while $k \le n$, $n \ge 1$. I am going to prove this for $k=n+1$. Let $A=E_{n+1}E_{n} \cdots E_{1}$.

$$ AB = E_{n+1} \cdots E_{1} B = I \implies E_{n} \cdots E_{1} B = E'_{n+1} \implies $$ $$ (E_{n} \cdots E_{1}) (B E_{n+1})= I \implies (B E_{n+1})(E_{n} \cdots E_{1}) = I \implies $$ $$ BA= I$$

  • This is nice! However, it appears you can avoid Claims 3 and 4 (and, in particular, the induction): Since $A$ is a product of elementary matrices, $A$ is invertible, so you can just conjugate $AB = I$ by $A$, to get $BA = A^{-1}(AB)A = I$. :) – Andrew D. Hwang Feb 12 '15 at 22:04

You might have a look at the following note "Right inverse implies left inverse and vice versa": http://www.lehigh.edu/~gi02/m242/08m242lrinv.pdf

  • That paper invokes LU-factorization, which is quite a sledgehammer for a result that is really nothing but a pigeonhole squeezing argument (see my post). – Bill Dubuque Sep 02 '10 at 14:55
  • Bill: LU decomposition is merely Gaussian elimination, rearranged, and I would suppose equivalent to row reduction, which the OP says he can use. – J. M. ain't a mathematician Sep 02 '10 at 17:11
  • @J.M. That doesn't stop it from being a sledgehammer compared to some other proofs given here. Not only are they simpler but more conceptual. I think the OP could easily understand at least one of those proofs. – Bill Dubuque Sep 03 '10 at 18:13

First some notation: given an $n\times n$ matrix $M$, write $\mathbf m_j$ for the $j$-th column of $M$. In what follows $E$ is the $n\times n$ identity matrix, so it's clear what $\mathbf e_1,\ldots,\mathbf e_n$ are.

Assume $AB = E$. Consider the map $\mathbf x\mapsto B\mathbf x$. If $\mathbf x$ and $\mathbf x'$ are $n\times 1$ matrices, then $B\mathbf x = B\mathbf x'$ implies \begin{align*} AB \mathbf x &= AB \mathbf x' \\ E \mathbf x &= E \mathbf x' \\ \mathbf x &= \mathbf x', \end{align*} that is, left multiplication by $B$ is an injective map. By the rank-nullity theorem, this map is also onto, so for each $n\times 1$ matrix $\mathbf{y}$, the system $$B\mathbf x = \mathbf y$$ has exactly one solution, namely $$\mathbf x = A\mathbf y.$$

Particularly, for each $j\in \{1,\ldots,n\}$ the system $$B\mathbf x = \mathbf e_j$$ has exactly one solution, namely $$\mathbf x = A\mathbf e_j = \mathbf a_j.$$ Thus $$B\mathbf a_j = \mathbf e_j$$ for each $j\in \{1,\ldots,n\}$.

Write $C = BA$. Since $$\mathbf c_j = B\mathbf a_j = \mathbf e_j$$ for each $j\in \{1,\ldots,n\}$, $C$ and the identity matrix both have the same columns, so $$C = E.$$


Given a matrix $A$, apply row operations to transform it to reduced form $A'$. Each row operation is left multiplication by a (two-sided) invertible matrix (as noted in Ofir's answer), so there is an invertible matrix $C$ such that $CA=A'$.

Since $A$ is a square matrix, either $A'=I$, or the last row of $A'$ is completely zero. In the first case, $CA=I$, and since $C$ is invertible, $A$ is also. In the second case, $A'$ is not right invertible, since the last row of $A'D$ is completely zero for any $D$. Therefore $A$ cannot be right invertible, since if $AB=I$, then $A'(BC^{-1})=CABC^{-1}=CC^{-1}=I$.

Conclusion: if $A$ has a right inverse $B$, we must be in the first case, so $A$ is (two-sided) invertible.

P.S. I wouldn't be thinking about the proof of this fact if I were not teaching Finite Math...

Junyan Xu
  • Beautiful. I think this is probably the most elementary proof possible, which uses only row operations and matrices. – ashpool Mar 04 '20 at 07:29

Here is a try. We will use $AB = I$ to show that the columns of $B$ are linearly independent, and then use that to show that $BA$ is the identity on the range of $B$, which is all of the space by the linear independence of the columns of $B$; this implies that $BA$ is the identity. Linear independence of the columns of $B$ follows if we can show that $Bx = 0$ implies $x = 0$: assume $Bx = 0$; then $x = Ix = ABx = A0 = 0$. Now, to show that $BA$ is the identity on the range of $B$, we have $(BA)Bx = B(AB)x = Bx$, and we are done.


An alternative.

Let $\textbf{A}$ be an $n \times n$ matrix and let $\lambda_k$ be its eigenvalues. Then, by the Cayley–Hamilton theorem, we can write

$$ \prod_{k=1}^n \Big( \textbf{A} - \lambda_k \textbf{I} \Big) = 0 $$

So we obtain

$$ \textbf{A}^n = a_0 \textbf{I} + \sum_{k=1}^{n-1} a_k \textbf{A}^k $$

Let us write

$$ \textbf{B} = b_0 \textbf{I} + \sum_{k=1}^{n-1} b_k \textbf{A}^k$$

It is clear that

$$ \textbf{A} \textbf{B} = \textbf{B} \textbf{A} $$

We also find that

$$ \begin{eqnarray} \textbf{A} \textbf{B} &=& b_0 \textbf{A} + \sum_{k=1}^{n-1} b_k \textbf{A}^{k+1}\\ &=& b_0 \textbf{A} + \sum_{k=1}^{n-2} b_k \textbf{A}^{k+1} + b_{n-1} \textbf{A}^{n}\\ &=& b_0 \textbf{A} + \sum_{k=1}^{n-2} b_k \textbf{A}^{k+1} + b_{n-1} a_0 \textbf{I} + b_{n-1} \sum_{k=1}^{n-1} a_k \textbf{A}^k\\ &=& b_{n-1} a_0 \textbf{I} + \Big( b_0 + b_{n-1} a_1 \Big) \textbf{A} + \sum_{k=2}^{n-1} \Big( b_{k-1} + b_{n-1} a_k \Big) \textbf{A}^k\\ \end{eqnarray} $$

When we set (for $a_0 \ne 0$)

$$ \begin{eqnarray} b_{n-1} a_0 &=& 1\\ b_{k-1} + b_{n-1} a_k &=& 0 \end{eqnarray} $$

we obtain

$$ \textbf{A} \textbf{B} = \textbf{I} $$

So we can write

$$ \textbf{B} = -a_0^{-1} a_1 \textbf{I} - \sum_{k=1}^{n-2} a_0^{-1} a_{k+1} \textbf{A}^k + a_0^{-1} \textbf{A}^{n-1} $$


$$ \textbf{A}^{-1} = -a_0^{-1} \left( a_1 \textbf{I} + \sum_{k=1}^{n-2} a_{k+1} \textbf{A}^k - \textbf{A}^{n-1} \right) $$

Thus the inverse of $\textbf{A}$ can be expressed as a linear combination of the powers $\textbf{A}^k$.
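This construction can be sketched numerically: `np.poly` returns the characteristic polynomial coefficients of a square matrix, and Horner's scheme evaluates the polynomial expression for the inverse (an illustration under the answer's assumption $a_0 \ne 0$, i.e. an invertible $\textbf{A}$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.standard_normal((n, n))  # almost surely invertible

c = np.poly(A)   # characteristic polynomial coefficients [1, c_{n-1}, ..., c_0]
# Cayley-Hamilton: A^n + c_{n-1} A^{n-1} + ... + c_0 I = 0, so
# A^{-1} = -(1/c_0) (A^{n-1} + c_{n-1} A^{n-2} + ... + c_1 I).
# Evaluate that polynomial in A by Horner's scheme:
B = np.zeros_like(A)
for coeff in c[:-1]:             # 1, c_{n-1}, ..., c_1
    B = B @ A + coeff * np.eye(n)
B = -B / c[-1]

assert np.allclose(A @ B, np.eye(n))
assert np.allclose(B @ A, np.eye(n))   # AB = I and BA = I come together
```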

  • 6,147
  • 17
  • 20

The crucial fact is to show: $T\text{ injective} \iff T\text{ surjective}$
(That is hidden by saying: $A B=\mathbb{1}\iff B A=\mathbb{1}$)

But it holds in general: $\dim\mathcal{D}(T)=\dim\mathcal{N}(T)+\dim\mathcal{R}(T)$
So the above assertion holds for finite dimensional spaces.

To get a grasp of why the assertions are actually the same, think of the following:
As you might know, $f\circ g=id\Rightarrow f\text{ surjective }(g\text{ injective})$,
and similarly $g\circ f=id\Rightarrow f\text{ injective }(g\text{ surjective})$.
Conversely, $f\text{ injective}\Rightarrow g_0\circ f=id$ ...for some $g_0$,
and similarly $f\text{ surjective}\Rightarrow f\circ g_0=id$ ...for some $g_0$.
However, these constructions will be more tedious...

  • 15,239
  • 4
  • 33
  • 104

$AB=I \implies$ the row space of $B$ has dimension $n$, so the rows of $B$ are linearly independent. Now $B$ can be written as $B=BI=B(AB)$, i.e. $(I-BA)B=O \implies I-BA =O$. ....(1)

In (1) I am using the fact that all the rows of $B$ are linearly independent, and that the rows of the matrix $(I-BA)B$ are linear combinations of the rows of $B$, with the coefficients coming from the rows of $(I-BA)$. So, as $(I-BA)B=O$, the coefficients must all be $0$, i.e. all rows of $(I-BA)$ are $0$.

  • 159
  • 2
  • 6

The following proof appears in the article A Short and Elementary Proof of the Two-sidedness of the Matrix-Inverse [Paparella, P., College Math. J. 48 (2017), no. 5, 366-367; MR3723713] and only uses the criteria stipulated by the OP:

enter image description here

Pietro Paparella
  • 3,066
  • 1
  • 15
  • 27

A simple solution: $\operatorname{Rank}(AB) \leq \operatorname{Rank}(B)$, since each row of $AB$ can be written as a linear combination of the rows of $B$. Since $AB=I_n$, this implies $\operatorname{Rank}(B)=n$, thus $B$ is invertible. Now, multiplying both sides of $AB=I_n$ by $B^{-1}$ (from the right), you get $A=B^{-1}$, which implies $BA=I_{n}$.

Red shoes
  • 6,433
  • 1
  • 13
  • 27

Here is an alternative approach. Assume $AB = I$ and let $x \in Im(B)$. Then there exists $y$ such that $By = x$, so it holds that $BAx = B(ABy) = By = x$. If the image of $B$ coincides with the domain of $A$, then the previous reasoning implies that $BAx = x$ for every $x$, i.e., that $B$ is a left inverse of $A$.

The advantage of this approach is that it directly shows what can go wrong in infinite dimensional spaces. If $A$ and $B$ are the left and right shift operators on $\ell^2$ (where the first element is set to 0 for the right shift), for example, then clearly $AB = I$ but $BA x \neq x$ for any $x$ that is not in $Im(B)$ i.e. for any $x$ whose first element is nonzero.
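A finite-dimensional truncation makes the contrast visible: the truncated shift matrices fail $AB = I$ in exactly one diagonal entry, as the theorem forces (a small illustrative sketch):

```python
import numpy as np

n = 5
A = np.diag(np.ones(n - 1), k=1)    # truncated left shift
B = np.diag(np.ones(n - 1), k=-1)   # truncated right shift

# In finite dimensions the theorem forces AB != I as well:
print(np.diag(A @ B))   # diagonal of AB: ones, then a 0 in the last entry
print(np.diag(B @ A))   # diagonal of BA: a 0 in the first entry, then ones
```

On $\ell^2$ itself the defect never "wraps around", which is why $AB = I$ can hold while $BA \neq I$.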

Roberto Rastapopoulos
  • 1,816
  • 10
  • 26

Suppose $AB=I$.

$\left\{A^j\right\}_{j=0}^{n^2}$ can not be linearly independent as vectors in $\mathbb{R}^{n\times n}$ (or $\mathbb{C}^{n\times n}$), so there is a non-trivial linear combination of the $A^j$ that vanishes. That is, $$ \sum_{j=k}^mc_jA^j=0\tag1 $$ where $0\le k\le m\le n^2$ and $c_k\ne0$. Right multiply $(1)$ by $B^{k+1}$ to get $$ \sum_{j=k+1}^mc_jA^{j-k-1}+c_kB=0\tag2 $$ Left multiply $(2)$ by $A$ to get $$ \sum_{j=k+1}^mc_jA^{j-k}+c_kI=0\tag3 $$ Right multiply $(2)$ by $A$ to get $$ \sum_{j=k+1}^mc_jA^{j-k}+c_kBA=0\tag4 $$ Comparing $(3)$ and $(4)$, we get $$ BA=I\tag5 $$

  • 326,069
  • 34
  • 421
  • 800

It's easy to see that, for square matrices $A$ and $B$, $AB = I$ is equivalent to $ABs_i = s_i$ for every element of a basis $\{s_i\}$ of $R^n$.

Define $g_i = Bs_i$. We observe the following:

  1. $\{g_i\}$ is a basis for $R^n$, since any $s_i$ is a linear combination of $\{g_i\}$ (i.e., $s_i = ABs_i = Ag_i$)
  2. $BAg_i = BABs_i = Bs_i=g_i$

Conclusion: Since $\{g_i\}$ is a basis, $BAg_i = g_i$ for all $i$ is equivalent to $BA = I$.

  • 419
  • 5
  • 8

$AB=I$, hence $A$ has full rank, and we can perform elementary column transformations to transform $A$ into $I$.

Likewise, we can perform elementary row transformations to transform $A$ into $I$. Hence there exists a full-rank matrix $C$ s.t. $CA=I$.


Similar to @Blue's answer, but the argument for $C$'s existence is simplified.

  • 451
  • 3
  • 9

Since inverse/transpose are not allowed we start by writing $$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & \dots & a_{1n} \\ a_{21} & a_{22} & a_{23} & \dots & a_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ a_{n1} & a_{n2} & a_{n3} & \dots & a_{nn} \end{bmatrix}$$ and similarly

$$B = \begin{bmatrix} b_{11} & b_{12} & b_{13} & \dots & b_{1n} \\ b_{21} & b_{22} & b_{23} & \dots & b_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ b_{n1} & b_{n2} & b_{n3} & \dots & b_{nn} \end{bmatrix}$$ Since $AB = I$, using matrix multiplication definition we can write the elements of AB as: $$ \sum_{i,i'} \sum_{j,j'} a_{i,i'}b_{j,j'} = I_{i,j'}= \begin{cases} 1, & \text{if}\ i=j'\space and \space j=i' \\ 0, & \text{otherwise} \end{cases} $$ such that when $i = j'$ and $j = i'$ we get the diagonal elements and off diagonal otherwise. But note that since $a_{i,i'}b_{j,j'}$ are scalar and commute we can write. $$ \sum_{i,i'} \sum_{j,j'} b_{j,j'}a_{i,i'} = \sum_{i,i'} \sum_{j,j'} a_{i,i'}b_{j,j'} = \begin{cases} 1, & \text{if}\ i=j'\space and \space j=i' \\ 0, & \text{otherwise} \end{cases} $$

Now we observe that the elements of $BA$ can be written as... $$ \sum_{i,i'} \sum_{j,j'} b_{j,j'}a_{i,i'} $$ such that when $i = j'$ and $j = i'$ we get the diagonal elements and off diagonal otherwise. But we showed that $$ \sum_{i,i'} \sum_{j,j'} b_{j,j'}a_{i,i'} = \begin{cases} 1, & \text{if}\ i=j'\space and \space j=i' \\ 0, & \text{otherwise} \end{cases} $$ thus $BA = I$

Sason Torosean
  • 304
  • 2
  • 5

It is possible to give a very elementary short proof by combining the arguments of Davidac897, Bill Dubuque and Mohammed, but without using linear functions, images, injective maps or relatively advanced basis notions.

Take the canonical basis $\{e_1,e_2,\dots,e_n\}$ where $e_i=[0~0~\dots 1~0~0\dots 0]^\top$ with the value one appearing at position $i$, for all $i\in[1..n]$. Assume $Be_1,~Be_2,\dots Be_n$ are linearly dependent, i.e., there exists a non-zero vector $\alpha\in R^n$ such that $\sum_{i=1}^n \alpha_i Be_i=0$. Multiplying this by $A$, we obtain $A\left(\sum_{i=1}^n \alpha_i Be_i\right)=0$, which is equivalent to $\sum_{i=1}^n\alpha_i ABe_i=0$ or $\sum_{i=1}^n \alpha_ie_i=0$, which is a contradiction for non-zero $\alpha$. The assumption that $Be_1,~Be_2,\dots Be_n$ are linearly dependent is therefore false. These vectors must be linearly independent, and so their linear combinations cover a space of dimension $n$, i.e., $R^n$. This means that for any $x\in R^n$ there exists a linear combination $x=\sum_{i=1}^n \beta_i Be_i= B\left(\sum_{i=1}^n \beta_ie_i\right)=By$.

From $x=By$ we derive $x=BABy=BAx$. Since this holds for any $x\in R^n$, we can extend it to a matrix version: $X=BAX$ for all $X\in R^{n\times n}$, and so, $BA$ needs to be the identity matrix $I_n$.

Daniel Porumbel
  • 349
  • 3
  • 4

$\newcommand{\mat}[1]{\left(\begin{matrix}#1\end{matrix}\right)}$ Here is a proof using only calculations and induction over the size $n$ of the matrices. Observe that no commutativity is needed; we can work in any division ring.
If $n=1$ then two scalars $a,b$ with $ab=1$ are given. Then $b=1/a$ and we also have $ba=1$.
Now suppose that the statement is true for all matrices of size $n-1$, $n\geq2$. Given two $n$ by $n$ matrices with $AB=I$, we can assume without loss of generality that the upper left element of $A$ is nonzero. Otherwise, since the first row of $A$ cannot be all zero, we can achieve this by permuting two columns of $A$ and the corresponding rows of $B$. We can also assume that this upper left element $\alpha$ equals 1, otherwise we multiply the first row of $A$ by the inverse of this element from the left and the first column of $B$ by $\alpha$ from the right.
Now we write $A,B$ in block matrix form: $$A=\mat{1&a_2\\a_3&a_4}, B=\mat{b_1&b_2\\b_3&b_4},$$ where $b_1$ is a scalar, $a_2,b_2$ are matrices of size 1 by $n-1$, $a_3,b_3$ have size $n-1$ by 1 and $a_4,b_4$ are $n-1$ by $n-1$. $AB=I$ means that $$b_1+a_2b_3=1,\ b_2+a_2b_4=0,\ a_3b_1+a_4 b_3=0\mbox{ and }a_3b_2+a_4b_4=I.$$ (Here $I$ has size $n-1$ only, and "0" abbreviates any matrix of zeros.)
First, we calculate $(a_4-a_3a_2)b_4=a_4b_4+a_3b_2=I$, using $a_2b_4=-b_2$. Since both matrices have size $n-1$, we can conclude from the induction hypothesis that $b_4(a_4-a_3a_2)=I$.
Next, we calculate $$(a_4-a_3a_2)(b_3+b_4 a_3)=a_4 b_3-a_3a_2b_3+a_3= (a_4 b_3+a_3b_1)+a_3(-b_1-a_2b_3+1)=0$$ and, multiplying by $b_4$ from the left, we obtain $b_3+b_4 a_3=0$.
Then we obtain that $a_2b_3=-a_2b_4a_3=b_2a_3$ and therefore also $b_1+b_2a_3=1$.
Finally we have $a_2+b_2(a_4-a_3a_2)=(a_2b_4+b_2)(a_4-a_3a_2)=0$ and thus $b_2a_4=(b_2a_3-1)a_2=-b_1a_2$. Altogether we obtain $BA=\mat{b_1&b_2\\b_3&b_4}\mat{1&a_2\\a_3&a_4}=I$ which completes the proof.
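The block identities used in the induction step can be spot-checked numerically (a sketch; `np.linalg.solve` merely manufactures a $B$ with $AB=I$, and the pivot is normalized to $1$ as in the proof):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
A[0, 0] = 1.0                      # normalized pivot, as assumed in the proof
B = np.linalg.solve(A, np.eye(n))  # some B with A @ B = I

a2, a3, a4 = A[:1, 1:], A[1:, :1], A[1:, 1:]
b2, b3, b4 = B[:1, 1:], B[1:, :1], B[1:, 1:]

S = a4 - a3 @ a2                             # the matrix a4 - a3*a2 from the proof
assert np.allclose(S @ b4, np.eye(n - 1))    # (a4 - a3*a2) b4 = I
assert np.allclose(b4 @ S, np.eye(n - 1))    # b4 (a4 - a3*a2) = I
assert np.allclose(b3 + b4 @ a3, 0)          # b3 + b4*a3 = 0
```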

  • 4,554
  • 1
  • 9
  • 30

Let us give a more general result:

Result: Let $T:V\rightarrow V$ be a linear operator on a finite-dimensional vector space $V$ such that there is another linear operator $U$ on $V$ satisfying $TU=I$. Then $UT=I$.

Proof: Since $TU=I$, $T$ is onto. Now $V$ is finite-dimensional, so by the Rank-Nullity theorem $\operatorname{Nullity}(T)=0$, which implies $\operatorname{Ker}(T)=\{0\}$.

Now for any $v\in V$ we have, $T(UT-I)v=(TU-I)Tv=0$. So $(UT-I)v\in \operatorname{Ker}(T)$ for all $v\in V$, which gives $(UT-I)v=0$ for all $v\in V$. Thus, $UT-I=0$.

Note: Finiteness of $\operatorname{dim}(V)$ is necessary. Otherwise, "$T$ is onto $\Rightarrow \operatorname{Ker}(T)=\{0\}$" does not hold in an infinite dimensional vector space.

Now a square matrix $A$ of size $n$ over a field $F$ can be treated as a linear operator on $F^n$ via the following linear map $L_A: F^n\rightarrow F^n$, defined by $L_A(v)=Av$ for all column vectors $v\in F^n$.

For two $n\times n$ matrices $A$, $B$, we clearly have $L_AL_B=L_{AB}$. Now put $T=L_A$ and $U=L_B$; then the required result follows.



If $A$ and $B$ are two square matrices such that $A B=I$, then $B A=I$.


By applying the properties of determinants of square matrices, we get that

$\det A\cdot\det B=\det(A B)=\det I=1\ne0$.

So it results that $\;\det A\ne0\;$ and $\;\det B\ne0\;$.

Now we consider the matrix

$C=\frac{1}{\det A}\cdot\text{adj}(A)$

where $\;\text{adj}(A)\;$ is the adjugate matrix of $A$ which is the transpose of the cofactor matrix of $A$.

$\text{adj}(A)= \begin{pmatrix} A_{1,1} & A_{2,1} & \cdots & A_{n,1} \\ A_{1,2} & A_{2,2} & \cdots & A_{n,2} \\ \vdots & \vdots & \ddots & \vdots \\ A_{1,n} & A_{2,n} & \cdots & A_{n,n} \end{pmatrix}$

where $\;A_{i,j}\;$ is the cofactor of the element $a_{i,j}$ of the matrix $A$.

So $\;A_{i,j}=(-1)^{i+j}\det M_{i,j}$

where $\;M_{i,j}\;$ is the submatrix of $A$ formed by deleting the $i^{th}$ row and the $j^{th}$ column.

We are going to use the Laplace expansions which are the following equalities:

$a_{i,1}A_{j,1}+a_{i,2}A_{j,2}+\ldots+a_{i,n}A_{j,n}= \begin{cases} \det A\;,\quad\text{ if } i=j\\ 0\;,\quad\quad\;\,\text{ if } i\ne j \end{cases}$

$A_{1,i}a_{1,j}+A_{2,i}a_{2,j} +\ldots+A_{n,i}a_{n,j}= \begin{cases} \det A\;,\quad\text{ if } i=j\\ 0\;,\quad\quad\;\,\text{ if } i\ne j \end{cases}$

By applying Laplace expansions, we get that

$A C = C A = I$.

Since $\;A B =I\;$ by hypothesis, we have $\;C(A B)= C I\;,\;$ hence $\;(C A)B=C,\;$ therefore $\;I B = C\;$ and so $\;B=C$.


Hence $B A = C A = I\;,$

so we have proved that

$B A = I$.
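The adjugate construction can be sketched in code; the helper `adjugate` below is illustrative, building $\text{adj}(A)$ entry by entry from cofactors exactly as defined above:

```python
import numpy as np

def adjugate(A):
    # adj(A)[j, i] = A_{i,j} = (-1)**(i+j) * det(M_{i,j}),
    # i.e. the transpose of the cofactor matrix.
    n = A.shape[0]
    adj = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            M = np.delete(np.delete(A, i, axis=0), j, axis=1)  # drop row i, col j
            adj[j, i] = (-1) ** (i + j) * np.linalg.det(M)
    return adj

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
C = adjugate(A) / np.linalg.det(A)

assert np.allclose(A @ C, np.eye(4))   # the Laplace expansions in action
assert np.allclose(C @ A, np.eye(4))
```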

  • 1
  • 3
  • 4
  • 21

Let $EA$ be the RREF of $A,$ where $E$ is a suitable product of elementary matrices. If $A$ is not invertible, then $EA$ has a zero row. Then $EAB$ also has a zero row. However, $EAB=E$ does not have a zero row. Thus $A$ is invertible. Consequently, $B=A^{-1}$ is also invertible, and $BA=I.$

  • 349
  • 2
  • 6

The OP from 10 years ago said he understood

  1. Elementary row operations.
  2. Linear dependence.
  3. Row reduced forms and their relations with the original matrix.

I'll use only a bit more than (1) and (3). Each elementary row operation corresponds to left multiplication by an elementary matrix and every elementary matrix is invertible. I won't use (2) but I will use

(4) If the rref of a square matrix has a zero row, then it does not have a left inverse.

Proof. Let square $H$ have rref $R$ whose last row is zero. Since $R$ has a zero row, the system $RX={\bf 0}$ has a free parameter; setting it equal to $1$ and using $R,$ there is a nonzero column vector $X$ such that $HX={\bf 0}.$ Suppose $KH=I.$ Then $$ X=IX=KHX=K(HX)=K{\bf{0}}={\bf 0},\ \text{a contradiction.}$$

Let $AB=I.$ By (4), $I$ is the rref of $B.$ Therefore, $CB=I,$ where $C$ is a product of elementary matrices. By taking, in reverse order, the product of the inverses of these elementary matrices, it follows that $C$ is invertible. $$C^{-1}=C^{-1}(CB)=IB=B$$ so that $$BC=I.\tag 1$$ $$A=AC^{-1}C=(AB)C=IC=C\tag 2$$ Combining (1) and (2), we conclude that $BA=I.$

(I just noticed the P.S. of the OP. Sorry. Also, there's probably overlap with earlier proofs. I just wanted to give a short, simple proof that did not use any concepts from vector spaces.)
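The step "$CB=I,$ where $C$ is a product of elementary matrices" can be sketched as code: Gauss-Jordan elimination on $B$, mirroring every row operation on an identity matrix so that $C$ is exactly the accumulated product of elementary matrices (`np.linalg.solve` is used only to manufacture a $B$ with $AB=I$):

```python
import numpy as np

def row_reduce_record(B):
    # Gauss-Jordan elimination on B, recording the same row operations
    # in C, so that C @ B equals the rref of B.
    n = B.shape[0]
    R = B.astype(float).copy()
    C = np.eye(n)
    for col in range(n):
        p = col + np.argmax(np.abs(R[col:, col]))   # partial pivoting
        R[[col, p]] = R[[p, col]]                   # swap rows (elementary op)
        C[[col, p]] = C[[p, col]]
        C[col] /= R[col, col]                       # scale pivot row to 1
        R[col] /= R[col, col]
        for r in range(n):
            if r != col:                            # eliminate the other rows
                C[r] -= R[r, col] * C[col]
                R[r] -= R[r, col] * R[col]
    return C, R

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))
B = np.linalg.solve(A, np.eye(4))   # a B with A @ B = I

C, R = row_reduce_record(B)
assert np.allclose(R, np.eye(4))        # the rref of B is I
assert np.allclose(C @ B, np.eye(4))    # so C B = I ...
assert np.allclose(B @ C, np.eye(4))    # ... and then B C = I as well
```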

  • 29
  • 1

I wanted a simpler and clearer proof, which is as follows:

Prop. Let $A$ and $B$ be real square matrices of size $n$. If $AB = I$, then $BA=I$.

Notation. Let $L_A : \mathbb{R}^n \to \mathbb{R}^n$ be the linear transformation $L_A(x) = Ax$. We identify $L_A$ with $A$ for simplicity. $e_i$ is the $i$-th standard basis element. The span of (column) vectors $b_1,\dots,b_n$ is denoted by $\text{span}(b_i) = \{\sum_{i=1}^n c_i b_i : c_i \in \mathbb{R}\}$.

Proof. Denote $B=[b_1 \cdots b_n]$ with columns $b_i$. Then the $b_i$ are independent, since if $$ 0 =\sum_i c_ib_i $$ then $$ 0 = \sum_i c_i A b_i = \sum_i c_i e_i $$ gives $c_i=0$, as the $e_i$ are independent and $AB=I$ means $e_i = Ab_i$ for all $i$. Thus, the $b_i$ span $\mathbb{R}^n$, i.e., $\mathbb{R}^n = \text{span}(b_i)$. Hence, the range of $A$ is equal to $$ A(\mathbb{R}^n) = \{ \sum_{i=1}^n c_iAb_i : c_i \in \mathbb{R} \} = \{ \sum_{i=1}^n c_ie_i : c_i \in \mathbb{R} \} = \mathbb{R}^n. $$ Hence, by the Rank-Nullity Theorem, $\ker A = \{0\}$.

Now, rearranging $$ A = IA = (AB)A = A(BA) $$ gives $0 = A(I - BA)$. Denote $I-BA = [z_1 \cdots z_n]$ with columns $z_i$. Since $\ker A = \{0\}$, we have $Az_i = 0$ for all $i$, proving $I - BA= 0$; i.e., $BA = I$. This completes the proof.

Note. This answer is a clearer version of the selected one.

Note. The Rank-Nullity Theorem is proved purely from linear dependence and/or row operations/reduced echelon form. It states that $$ \dim \ker A + \dim A(\mathbb{R}^n) = n. $$

  • 2,876
  • 2
  • 17
  • 32

Since matrices represent linear maps between finite-dimensional vector spaces, the question can be recast as:

Let $K$ be a field. Prove that if $f:K^n\to K^n$ and $g:K^n\to K^n$ are linear maps, and $f\circ g=\DeclareMathOperator{\id}{id}\id$, then $g\circ f=\id$.

Since $f\circ g=\id$, the function $g$ has a left-inverse, and so it is injective. Now, the rank-nullity theorem asserts that $$ \dim(\operatorname{im} g)+\dim(\operatorname{ker} g)=n \, . $$ As $g$ is injective, its kernel is $\{\mathbf{0}\}$, which has dimension $0$. Hence, $\dim(\operatorname{im} g)=n$. If a linear subspace $U$ of $V$ has the same dimension as $V$, then $U=V$. Since the image of $g$ is a linear subspace of $K^n$, it follows that the image of $g$ equals $K^n$, and so $g$ is surjective. Hence, $g$ is bijective, with $f$ being its unique two-sided inverse.

  • 14,185
  • 2
  • 28
  • 65