Given two square matrices $A$ and $B$, how do you show that $$\det(AB) = \det(A)\det(B)$$ where $\det(\cdot)$ is the determinant of the matrix?

The proof is not given in your textbook? – Quixotic Aug 28 '11 at 06:29

Hint: Show that the formula holds if $A$ is an elementary matrix. By induction show that it holds if $A$ is a product of elementary matrices. Which cases are not covered by this? Study them separately. – Jyrki Lahtonen Aug 28 '11 at 06:34

@Learner: How do you define the determinant of a matrix? The definition affects what properties we may assume in the proof. – Zhen Lin Aug 28 '11 at 07:00

Well, searching on Google gives me this link: http://www.math.osu.edu/~husen/teaching/571/2_2.pdf – Aug 28 '11 at 06:32
11 Answers
Let's consider the function $B\mapsto \det(AB)$ as a function of the columns of $B=\left(v_1\cdots v_i \cdots  v_n\right)$. It is straightforward to verify that this map is multilinear, in the sense that $$\det\left(A\left(v_1\cdots v_i+av_i' \cdots  v_n\right)\right)=\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)+a\det\left(A\left(v_1\cdots v_i' \cdots  v_n\right)\right).$$ It is also alternating, in the sense that if you swap two columns of $B$, you multiply your overall result by $-1$. These properties both follow directly from the corresponding properties of the function $A\mapsto \det(A)$.
The determinant is completely characterized by these two properties, and the fact that $\det(I)=1$. Moreover, any function that satisfies these two properties must be a multiple of the determinant. If you have not seen this fact, you should try to prove it. I don't know of a reference online, but I know it is contained in Bretscher's linear algebra book.
In any case, because of this fact, we must have that $\det(AB)=c\det(B)$ for some constant $c$, and setting $B=I$, we see that $c=\det(A)$.
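This argument can be sanity-checked numerically. Below is a small sketch in pure Python with exact integer arithmetic; the `sign`, `det`, and `matmul` helpers are my own illustrations (a Leibniz-expansion determinant), not part of the answer:

```python
from itertools import permutations
from math import prod

def sign(p):
    """Sign of a permutation tuple p, computed from its inversion count."""
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det(M):
    """Leibniz expansion: sum over permutations of signed diagonal products."""
    n = len(M)
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[2, 1, 0], [1, 3, 1], [0, 1, 4]]
B = [[1, 2, 0], [0, 1, 1], [1, 0, 1]]
# det(AB) = c * det(B) with c = det(A); setting B = I recovers c.
assert det(matmul(A, B)) == det(A) * det(B)
```

Of course a numeric check on sample matrices is not a proof, but it is a quick way to catch sign errors when reproducing the argument.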
For completeness, here is a proof of the necessary lemma that any multilinear, alternating function is a multiple of the determinant.
We will let $f:(\mathbb F^n)^n\to \mathbb F$ be a multilinear, alternating function, where, to allow this proof to work in characteristic 2, we will say that a multilinear function is alternating if it is zero when two of its inputs are equal (this is equivalent to picking up a sign when you swap two inputs, except in characteristic 2). Let $e_1, \ldots, e_n$ be the standard basis vectors. Then $f(e_{i_1},e_{i_2}, \ldots, e_{i_n})=0$ if any index occurs twice, and otherwise, if $\sigma\in S_n$ is a permutation, then $f(e_{\sigma(1)}, e_{\sigma(2)},\ldots, e_{\sigma(n)})=(-1)^\sigma$, the sign of the permutation $\sigma$.
Using multilinearity, one can expand out evaluating $f$ on a collection of vectors written in terms of the basis:
$$f\left(\sum_{j_1=1}^n a_{1j_1}e_{j_1}, \sum_{j_2=1}^n a_{2j_2}e_{j_2},\ldots, \sum_{j_n=1}^n a_{nj_n}e_{j_n}\right) = \sum_{j_1=1}^n\sum_{j_2=1}^n\cdots \sum_{j_n=1}^n \left(\prod_{k=1}^n a_{kj_k}\right)f(e_{j_1},e_{j_2},\ldots, e_{j_n}).$$
All the terms with $j_{\ell}=j_{\ell'}$ for some $\ell\neq \ell'$ will vanish because the $f$ factor is zero, and the other terms can be written in terms of permutations. If $j_{\ell}\neq j_{\ell'}$ for all $\ell\neq \ell'$, then there is a unique permutation $\sigma$ with $j_k=\sigma(k)$ for every $k$. This yields:
$$\begin{align}\sum_{j_1=1}^n\sum_{j_2=1}^n\cdots \sum_{j_n=1}^n \left(\prod_{k=1}^n a_{kj_k}\right)f(e_{j_1},e_{j_2},\ldots, e_{j_n}) &= \sum_{\sigma\in S_n} \left(\prod_{k=1}^n a_{k\sigma(k)}\right)f(e_{\sigma(1)},e_{\sigma(2)},\ldots, e_{\sigma(n)}) \\ &= \sum_{\sigma\in S_n} (-1)^{\sigma}\left(\prod_{k=1}^n a_{k\sigma(k)}\right)f(e_{1},e_{2},\ldots, e_{n}) \\ &= f(e_{1},e_{2},\ldots, e_{n}) \sum_{\sigma\in S_n} (-1)^{\sigma}\left(\prod_{k=1}^n a_{k\sigma(k)}\right). \end{align} $$
In the last line, the thing still in the sum is the determinant, although one does not need to realize this fact, as we have shown that $f$ is completely determined by $f(e_1,\ldots, e_n)$, and we simply define $\det$ to be such a function with $\det(e_1,\ldots, e_n)=1$.
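The two defining properties (multilinearity and the "zero on equal inputs" form of alternating) can be checked on small examples. A sketch in pure Python; `det_rows` is a hypothetical Leibniz-style helper of my own, viewing the determinant as a function of rows:

```python
from itertools import permutations
from math import prod

def sign(p):
    # sign of a permutation via its inversion count
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def det_rows(rows):
    """Determinant as a function of the n row vectors."""
    n = len(rows)
    return sum(sign(p) * prod(rows[i][p[i]] for i in range(n)) for p in permutations(range(n)))

u, v, w, w2 = [1, 2, 0], [0, 1, 3], [2, 0, 1], [1, 1, 1]
a = 5
# multilinear in each argument: f(u, v, w + a*w2) = f(u, v, w) + a*f(u, v, w2)
lhs = det_rows([u, v, [w[j] + a * w2[j] for j in range(3)]])
assert lhs == det_rows([u, v, w]) + a * det_rows([u, v, w2])
# alternating: zero when two arguments coincide
assert det_rows([u, v, u]) == 0
```

These are exactly the two properties the lemma uses to pin $f$ down, up to the scalar $f(e_1,\ldots,e_n)$.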

+1. Nice answer! I think the definition of "alternating" which makes your argument work (and which is the classic one) is "$f(B)$ is alternating in $B$ if $f(B)=0$ whenever two columns of $B$ coincide". Of course if $2$ is invertible in the ground ring, this is equivalent to your formulation. – Pierre-Yves Gaillard Aug 28 '11 at 14:08

I am used to seeing multilinear as $$\det\left(A\left(v_1\cdots av_i+a'v_i' \cdots  v_n\right)\right)=a\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)+a'\det\left(A\left(v_1\cdots v_i' \cdots  v_n\right)\right)$$ However, that version can be derived from the version above by noting that $$\det\left(A\left(v_1\cdots v_i+(a-1)v_i \cdots  v_n\right)\right)=\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)+(a-1)\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)$$ – robjohn Aug 28 '11 at 16:16

@Pierre-Yves Gaillard: Yes, in characteristic 2, your definition of alternating is the right one, but everywhere else, this equivalent formulation is easier to work with, at least if you want to work algorithmically/constructively. I made the tacit assumption that if you want to know the answer to this question, you probably are working in characteristic zero. If you have algebraic background, a cleaner proof is to take the top wedge of a linear map, recognize $\det$ as the corresponding linear transform between one dimensional spaces, and use functoriality of wedge products. – Aaron Aug 28 '11 at 20:19

@robjohn I usually think of (multi)linearity as two conditions, compatibility with sums and compatibility with scalar multiplication. Here, setting $v_i=0$ gives scalar multiplication, and $a=0$ gives sums. Your version is more symmetric, mine is just slightly terser (and superficially, but not actually simpler). – Aaron Aug 28 '11 at 20:23

@Aaron: setting $v_i=0$ doesn't give scalar multiplication without first having that a $0$ column implies a $0$ determinant. That's why I had to do the trick with $v_i+(a-1)v_i$ to get $av_i$ on the left and $\det(A)+(a-1)\det(A)=a\det(A)$ on the right. – robjohn Aug 29 '11 at 13:07

@robjohn you can get $0$ by setting $a=0$ first. It's all there. I swear. No tricks are required. – Aaron Aug 29 '11 at 14:55

@Aaron: setting $a=0$ only gives you $$\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)=\det\left(A\left(v_1\cdots v_i \cdots  v_n\right)\right)$$ which is trivial, but does not show that a $0$ column produces a $0$ determinant. – robjohn Aug 29 '11 at 18:13


@Aaron: Okay, setting $v_i'=0$ and $a=1$, shows that a $0$ column produces a $0$ determinant, then you can set $v_i=0$ to show that multiplying a column by a constant multiplies the determinant by that same constant. However, I was saying that setting $v_i'=v_i$ and $a=k-1$ shows that multiplying a column by $k$ multiplies the determinant by $k$ all in one step. – robjohn Aug 31 '11 at 20:40

I couldn't find anything related to your statement "any function that satisfies these two properties must be a multiple of the determinant" in Bretscher's linear algebra book. Also just for clarification, this statement is claiming something different than the fact that "any multilinear, alternating function with f(I) = 1 has to be the determinant", right? Your statement claims that a function which takes matrix B as input, and has the multilinearity and alternating property, has to be a multiple of det(B). Please confirm this and also guide me to a proof of the same, I would be very grateful! – Rishabh Gupta Nov 17 '21 at 10:31

@RishabhGupta Any function that is multilinear and alternating must satisfy $f(A)=f(I)\det(A)$. This is because multilinearity lets you express things in terms of $f$ evaluated on matrices whose columns are standard basis vectors, and alternating lets you sort the columns. Unless they changed things between editions, this lemma is what Bretscher used to prove the multiplicative property. But if you have a copy of the book, you can just look at his presentation of the subject, which is a fleshed out version of what I have. – Aaron Nov 17 '21 at 11:47
The proof using elementary matrices can be found e.g. on proofwiki. It's basically the same proof as given in Jyrki Lahtonen's comment and Chandrasekhar's link.
There is also a proof using block matrices, I googled a bit and I was only able to find it in this book and this paper.
I like the approach which I learned from Sheldon Axler's Linear Algebra Done Right, Theorem 10.31. Let me try to reproduce the proof here.
We will use several results in the proof, one of them being (as far as I can say) a little less known. It is the theorem which says that if I have two matrices $A$ and $B$ which differ only in the $k$th row while the other rows are the same, and the matrix $C$ has as its $k$th row the sum of the $k$th rows of $A$ and $B$ and the other rows the same as in $A$ and $B$, then $\det C=\det A+\det B$.
Geometrically, this corresponds to adding two parallelepipeds with the same base.
Proof. Let us denote the rows of $A$ by $\vec\alpha_1,\ldots,\vec\alpha_n$. Thus $$A= \begin{pmatrix} a_{11} & a_{12}& \ldots & a_{1n}\\ a_{21} & a_{22}& \ldots & a_{2n}\\ \vdots & \vdots& \ddots & \vdots \\ a_{n1} & a_{n2}& \ldots & a_{nn} \end{pmatrix}= \begin{pmatrix} \vec\alpha_1 \\ \vec\alpha_2 \\ \vdots \\ \vec\alpha_n \end{pmatrix}$$
Directly from the definition of matrix product we can see that the rows of $A\cdot B$ are of the form $\vec\alpha_kB$, i.e., $$A\cdot B=\begin{pmatrix} \vec\alpha_1B \\ \vec\alpha_2B \\ \vdots \\ \vec\alpha_nB \end{pmatrix}$$ Since $\vec\alpha_k=\sum_{i=1}^n a_{ki}\vec e_i$, we can rewrite this equality as $$A\cdot B=\begin{pmatrix} \sum_{i_1=1}^n a_{1i_1}\vec e_{i_1} B\\ \vdots\\ \sum_{i_n=1}^n a_{ni_n}\vec e_{i_n} B \end{pmatrix}$$ Using the theorem on the sum of determinants multiple times we get $$ \det(A\cdot B)= \sum_{i_1=1}^n a_{1i_1} \begin{vmatrix} \vec e_{i_1}B\\ \sum_{i_2=1}^n a_{2i_2}\vec e_{i_2} B\\ \vdots\\ \sum_{i_n=1}^n a_{ni_n}\vec e_{i_n} B \end{vmatrix}= \ldots = \sum_{i_1=1}^n \ldots \sum_{i_n=1}^n a_{1i_1} a_{2i_2} \dots a_{ni_n} \begin{vmatrix} \vec e_{i_1} B \\ \vec e_{i_2} B \\ \vdots \\ \vec e_{i_n} B \end{vmatrix} $$
Now notice that if $i_j=i_k$ for some $j\ne k$, then the corresponding determinant in the above sum is zero (it has two identical rows). Thus the only nonzero summands are those for which the $n$-tuple $(i_1,i_2,\dots,i_n)$ represents a permutation of the numbers $1,\ldots,n$. Thus we get $$\det(A\cdot B)=\sum_{\varphi\in S_n} a_{1\varphi(1)} a_{2\varphi(2)} \dots a_{n\varphi(n)} \begin{vmatrix} \vec e_{\varphi(1)} B \\ \vec e_{\varphi(2)} B \\ \vdots \\ \vec e_{\varphi(n)} B \end{vmatrix}$$ (Here $S_n$ denotes the set of all permutations of $\{1,2,\dots,n\}$.) The matrix on the RHS of the above equality is the matrix $B$ with permuted rows. Using several transpositions of rows we can get the matrix $B$. We will show that this can be done using $i(\varphi)$ transpositions, where $i(\varphi)$ denotes the number of inversions of $\varphi$. Using this fact we get $$\det(A\cdot B)=\sum_{\varphi\in S_n} a_{1\varphi(1)} a_{2\varphi(2)} \dots a_{n\varphi(n)} (-1)^{i(\varphi)} \det B =\det A\cdot \det B.$$
It remains to show that we need $i(\varphi)$ transpositions. We can transform the "permuted matrix" to matrix $B$ as follows: we first move the first row of $B$ on the first place by exchanging it with the preceding row until it is on the correct position. (If it already is in the first position, we make no exchanges at all.) The number of transpositions we have used is exactly the number of inversions of $\varphi$ that contains the number 1. Now we can move the second row to the second place in the same way. We will use the same number of transposition as the number of inversions of $\varphi$ containing 2 but not containing 1. (Since the first row is already in place.) We continue in the same way. We see that by using this procedure we obtain the matrix $B$ after $i(\varphi)$ row transpositions.
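The claim that sorting the permuted rows back into place uses exactly $i(\varphi)$ adjacent transpositions can be checked mechanically; here is a sketch (the helper names `inversions` and `adjacent_swaps_to_sort` are my own):

```python
from itertools import permutations

def inversions(p):
    """Number of pairs (i, j) with i < j but p[i] > p[j]."""
    return sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def adjacent_swaps_to_sort(p):
    """Bubble-sort the sequence using adjacent transpositions, counting swaps."""
    p, count = list(p), 0
    for _ in range(len(p)):
        for j in range(len(p) - 1):
            if p[j] > p[j + 1]:
                p[j], p[j + 1] = p[j + 1], p[j]
                count += 1
    return count

# exhaustively check all permutations of {0,...,4}
for p in permutations(range(5)):
    assert adjacent_swaps_to_sort(p) == inversions(p)
```

Each adjacent swap removes exactly one inversion, which is why the count comes out to $i(\varphi)$ and why the sign $(-1)^{i(\varphi)}$ appears.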

Hi Martin, perhaps I'm missing something, but what you call “a little less known theorem” seems to me just the multilinearity of the determinant as a function of rows, isn't it? – pppqqq Jan 14 '14 at 12:17

@pppqqq Yes, that is correct. My experience is that people doing a basic course in linear algebra are often not familiar with it. (I have seen some questions where it would be very natural to use this result in the solution; but people chose some different approach instead.) Also, someone who is learning determinants for the first time might not even know what the word *multilinearity* means. – Martin Sleziak Jan 14 '14 at 12:19
Let $K$ be the ground ring. The statement holds
(a) when $B$ is diagonal,
(b) when $B$ is strictly triangular,
(c) when $B$ is triangular (by (a) and (b)),
(d) when $A$ and $B$ have rational entries and $K$ is an extension of $\mathbb Q$ containing the eigenvalues of $B$ (by (c)),
(e) when $K=\mathbb Q$ (by (d)),
(f) when $K=\mathbb Z[a_{11},\dots,a_{nn},b_{11},\dots,b_{nn}]$, where the $a_{ij}$ and $b_{ij}$ are respectively the entries of $A$ and $B$, and are indeterminate (by (e)),
(g) always (by (f)).
The reader who knows what the discriminant of a polynomial in $\mathbb Q[X]$ is, can skip (b) and (c).
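Step (c), the triangular case that anchors the chain, is easy to check with exact integer arithmetic. A sketch; the `det` and `matmul` helpers below are my own Leibniz-expansion illustrations, not part of the answer:

```python
from itertools import permutations
from math import prod

def det(M):
    n = len(M)
    def sign(p):
        return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[1, 4, 2], [3, 1, 5], [0, 2, 2]]
B = [[2, 7, 1], [0, 3, 5], [0, 0, 4]]   # upper triangular, as in case (c)
assert det(B) == 2 * 3 * 4              # triangular determinant: product of the diagonal
assert det(matmul(A, B)) == det(A) * det(B)
```

The point of the argument above is precisely that once this holds over $\mathbb Z$-like rings for triangular $B$, the general identity follows by specialization.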
Reference: this MathOverflow answer of Bill Dubuque.
EDIT 1. The principle underlying the above argument has various names. Bill Dubuque calls it "universality" principle. Michael Artin calls it "The Principle of Permanence of Identities". The section of Algebra with this title can be viewed here. I strongly suggest reading this section to those who are not familiar with this. It is an interesting coincidence that the illustration chosen by Artin is precisely the multiplicativity of determinants.
Another highly important application is the proof of the Cayley-Hamilton Theorem. I will not give it here, but I will digress on another point. That is, I will try to explain why
(*) it suffices to prove Cayley-Hamilton or the multiplicativity of determinants in the diagonal case.
Suppose we have a polynomial map $f:M_n(\mathbb Z)\to\mathbb Z$. Then $f$ is given by a unique element, again denoted $f$, of $\mathbb Z[a_{11},\dots,a_{nn}]$, where the $a_{ij}$ are indeterminates (because $\mathbb Z$ is an infinite domain). As a result, given any $A$ in $M_n(K)$ for any commutative ring $K$, we can define $f_K(A)$ by mapping the indeterminate $a_{ij}$ to the corresponding entry of $A$. That is the Principle of Permanence of Identities. The key to prove (*) will be:
LEMMA 1. Let $f:M_n(\mathbb Z)\to\mathbb Z$ be a polynomial map vanishing on the diagonalizable matrices. Then $f$ vanishes on all matrices.
There are at least two ways to prove this. The reader will perhaps prefer the first one but, (IMHO) the second one is better.
First way: It suffices to prove that the polynomial map $f_{\mathbb C}:M_n(\mathbb C)\to\mathbb C$ is zero. Thus it suffices to prove that the diagonalizable matrices are dense in $M_n(\mathbb C)$. But this is clear since any $A\in M_n(\mathbb C)$ is similar to a triangular matrix $T$, and the diagonal entries of $T$ (which are the eigenvalues of $A$) can be made all distinct by adding an arbitrarily small diagonal matrix.
Second way. Consider again the ring $R:=\mathbb Z[a_{11},\dots,a_{nn}]$, where the $a_{ij}$ are indeterminates. Let $A$ in $M_n(R)$ be the matrix whose $(i,j)$ entry is $a_{ij}$. Let $\chi\in R[X]$ be the characteristic polynomial of $A$, and let $u_1,\dots,u_n$ be the roots of $\chi$ (in some extension of the fraction field of $R$).
LEMMA 2. The expression $$\prod_{i < j}\ (u_i-u_j)^2$$ defines a unique nonzero element $d\in R$, called the discriminant of $\chi$.
Lemma 2 implies Lemma 1 because $R$ is a domain and because we have $fd=0$ since $f$ vanishes on the diagonalizable matrices, whereas $d$ vanishes on the nondiagonalizable matrices.
Lemma 2 is a particular case of a theorem which says that, given any monic polynomial $g$ in one indeterminate and coefficients in a field, any polynomial in the roots of $g$ which is invariant under permutation is a polynomial in the coefficients of $g$. More precisely:
Let $A$ be a commutative ring, let $X_1,\dots,X_n,T$ be indeterminates, and let $s_i$ be the degree $i$ elementary symmetric polynomial in $X_1,\dots,X_n$. Recall that the $s_i$ are defined by $$ f(T):=(T-X_1)\cdots(T-X_n)=T^n+\sum_{i=1}^n\ (-1)^i\ s_i\ T^{n-i}. $$ We abbreviate $X_1,\dots,X_n$ by $X_\bullet$, and $s_1,\dots,s_n$ by $s_\bullet$. Let $G$ be the group of permutations of the $X_i$, and $A[X_\bullet]^G\subset A[X_\bullet]$ the fixed ring. For $\alpha\in\mathbb N^n$ put $$ X^\alpha:=X_1^{\alpha_1}\cdots X_n^{\alpha_n},\quad s^\alpha:=s_1^{\alpha_1}\cdots s_n^{\alpha_n}. $$ Write $\Gamma$ for the set of those $\alpha\in\mathbb N^n$ which satisfy $\alpha_i<i$ for all $i$, and put $$ X^\Gamma:=\{X^\alpha\mid\alpha\in\Gamma\}. $$
FUNDAMENTAL THEOREM OF SYMMETRIC POLYNOMIALS. The $s_i$ generate the $A$algebra $A[X_\bullet]^G$.
PROOF. Observe that the map $u:\mathbb N^n\to\mathbb N^n$ defined by $$ u(\alpha)_i:=\alpha_i+\cdots+\alpha_n $$ is injective. Order $\mathbb N^n$ lexicographically, note that the leading term of $s^\alpha$ is $X^{u(\alpha)}$, and argue by induction on the lexicographical ordering of $\mathbb N^n$.
EDIT 2.
Polynomial Identities
Michael Artin writes:
It is possible to formalize the above discussion and to prove a precise theorem concerning the validity of identities in an arbitrary ring. However, even mathematicians occasionally feel that it isn't worthwhile making a precise formulation: that it is easier to consider each case as it comes along. This is one of those occasions.
I'll disobey and make a precise formulation (taken from Bourbaki). If $A$ is a commutative ring and $T_1,\dots,T_k$ are indeterminates, let us denote the obvious morphism from $\mathbb Z[T_1,\dots,T_k]$ to $A[T_1,\dots,T_k]$ by $f\mapsto\overline f$.
Let $X_1,\dots,X_m,Y_1,\dots,Y_n$ be indeterminates.
Let $f_1,\dots,f_n$ be in $\mathbb Z[X_1,\dots,X_m]$.
Let $g$ be in $\mathbb Z[Y_1,\dots,Y_n]$.
The expression $g(f_1,\dots,f_n)$ denotes then a well-defined polynomial in $\mathbb Z[X_1,\dots,X_m]$.
If this polynomial is the zero polynomial, say that $(f_1,\dots,f_n,g)$ is an $(m,n)$-polynomial identity.
The "theorem" is this:
If $(f_1,\dots,f_n,g)$ is an $(m,n)$-polynomial identity, and if $x_1,\dots,x_m$ are in $A$, where $A$ is any commutative ring, then $$g(f_1(x_1,\dots,x_m),\dots,f_n(x_1,\dots,x_m))=0.$$
Exercise: Is $$(X_1^3-X_2^3,\;X_1-X_2,\;X_1^2+X_1X_2+X_2^2,\;Y_1-Y_2Y_3)$$ a $(2,3)$-polynomial identity?
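The exercise can be spot-checked by evaluating $g(f_1,f_2,f_3)$ at a grid of integer points (a numeric check of my own, not a proof, though for polynomials of bounded degree enough sample points do settle it):

```python
# the four polynomials of the exercise, as ordinary Python functions
def f1(x1, x2): return x1**3 - x2**3
def f2(x1, x2): return x1 - x2
def f3(x1, x2): return x1**2 + x1*x2 + x2**2
def g(y1, y2, y3): return y1 - y2 * y3

# (f1, f2, f3, g) is a (2,3)-polynomial identity iff g(f1, f2, f3) vanishes identically
for x1 in range(-3, 4):
    for x2 in range(-3, 4):
        assert g(f1(x1, x2), f2(x1, x2), f3(x1, x2)) == 0
```

This reflects the factorization $x_1^3-x_2^3=(x_1-x_2)(x_1^2+x_1x_2+x_2^2)$.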
Clearly, the multiplicativity of determinants and the Cayley-Hamilton theorem can be expressed in terms of polynomial identities in the above sense.
Exterior Algebras
To prove the multiplicativity of determinants, one can also proceed as follows.
Let $A$ be a commutative ring and $M$ an $A$-module. One can show that there is an $A$-algebra $\wedge(M)$, called the exterior algebra of $M$ [here "algebra" means "not necessarily commutative algebra"], and an $A$-linear map $e_M$ from $M$ to $\wedge(M)$ having the following property:
For every $A$-linear map $f$ from $M$ to an $A$-algebra $B$ satisfying $f(x)^2=0$ for all $x$ in $M$, there is a unique $A$-algebra morphism $F$ from $\wedge(M)$ to $B$ such that $F\circ e_M=f$.
One can prove $e_M(x)^2=0$ for all $x$ in $M$. This easily implies that $\wedge$ is a functor from $A$-modules to $A$-algebras.
Let $\wedge^n(M)$ be the submodule of $\wedge(M)$ generated by the $e_M(x_1)\cdots e_M(x_n)$, where the $x_i$ run over $M$. Then $\wedge^n$ is a functor from $A$-modules to $A$-modules.
One can show that the $A$-module $\wedge^n(A^n)$ is isomorphic to $A$. For any endomorphism $f$ of $A^n$, one defines $\det(f)$ as being $\wedge^n(f)$. The multiplicativity is then obvious.

+1 for the detailed discussion of permanence of identities, esp. the very cool proof of Lemma 1 using the discriminant. – Ben Blum-Smith Aug 29 '11 at 14:13

This seems like a lot of work to prove such a (relatively) simple claim, but as the principle outlined works in such amazing generality, I find that it is truly something worth knowing. – Mark Aug 29 '11 at 23:55

@Pierre-Yves: Could you please explain why it helps to have different eigenvalues in a triangular matrix in order to prove that it is in the closure of the diagonalisable matrices? – Rasmus Sep 11 '11 at 07:46

Dear @Rasmus: Consider the following subsets of $M:=M_n(\mathbb C)$: $T$ the lower triangular matrices, $E$ the matrices with $n$ distinct eigenvalues, and $D$ the diagonalizable matrices. We want to show $\overline D=M$. ["Overline" means "closure", for the usual topology.] There are 2 claims: (1) $T\subset\overline D\Rightarrow\overline D=M$, (2) $T\subset\overline E\subset\overline D$. Please tell me if (1) or/and (2) look unclear to you. – Pierre-Yves Gaillard Sep 11 '11 at 09:11

@Rasmus: I'm using the fact that eigenvectors corresponding to distinct eigenvalues are linearly independent to derive $E\subset D$, and thus $\overline E\subset\overline D$. – Pierre-Yves Gaillard Sep 11 '11 at 09:56

@Pierre-Yves: Thank you very much. Now I understand your reasoning. I didn't realise that matrices with distinct eigenvalues are always diagonalizable. For (1) you are using that $D$ and thus also $\overline D$ is closed under conjugation, I suppose. – Rasmus Sep 11 '11 at 13:23

@Rasmus: You're welcome. Yes, I'm using this closure property. – Pierre-Yves Gaillard Sep 11 '11 at 13:38

Terrific discussion and proof, I'm going to bookmark it for future reference when I'm teaching abstract algebra! – Mathemagician1234 Jun 22 '14 at 05:09

Dear @Mathemagician1234: Thank you very much for your kind comment! – Pierre-Yves Gaillard Jun 22 '14 at 05:52
There are a lot of answers already posted, but I like this one based on the permutation-based definition of the determinant. It's a definition that is equivalent to other definitions, and depending on your book/background, you can prove the equivalence yourself. For an $n\times n$ matrix $A$, define $\det(A)$ by:
\begin{align*} \det(A) & = \sum_{\sigma\in S_n}(-1)^{\sigma}\prod_{i=1}^nA_{i,\sigma(i)} \end{align*}
where
 $S_n$ is the permutation group on $n$ objects
$(-1)^{\sigma}$ is $1$ when $\sigma$ is an even permutation and $-1$ when $\sigma$ is an odd permutation.
Just apply this to $2\times2$ and $3\times3$ matrices, and you will get familiar formulas.
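As a sanity check, the permutation formula recovers the familiar $2\times2$ and $3\times3$ expansions; a sketch (the `det` helper is my own illustration of the definition above):

```python
from itertools import permutations
from math import prod

def det(M):
    """Permutation-based (Leibniz) definition of the determinant."""
    n = len(M)
    def sign(p):
        return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

a, b, c, d = 3, 1, 4, 2
assert det([[a, b], [c, d]]) == a * d - b * c   # the familiar 2x2 formula

m = [[1, 2, 3], [0, 4, 5], [1, 0, 6]]
# 3x3 rule of Sarrus: three "down-right" diagonals minus three "down-left" ones
sarrus = (m[0][0]*m[1][1]*m[2][2] + m[0][1]*m[1][2]*m[2][0] + m[0][2]*m[1][0]*m[2][1]
          - m[0][2]*m[1][1]*m[2][0] - m[0][0]*m[1][2]*m[2][1] - m[0][1]*m[1][0]*m[2][2])
assert det(m) == sarrus
```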
Now the proof below is a lot of symbol pushing and reindexing, and then a big subset of terms that are grouped together in the right way are seen to sum to zero. I would generally prefer one of the more geometric proofs already offered for this specific question. But at the same time, as an algebraist, I like to raise awareness of the permutation-based definition.
\begin{align*} \det(AB) & = \sum_{\sigma\in S_n}(-1)^\sigma\prod_{l=1}^n(AB)_{l,\sigma(l)}\\ & = \sum_{\sigma\in S_n}(-1)^\sigma\prod_{l=1}^n\left(\sum_{k=1}^nA_{l,k}B_{k,\sigma(l)}\right) \end{align*}
We'd like to swap the inner sum and product. In general, $\prod_{l=1}^n\left(\sum_{k=1}^mc_{l,k}\right) = \sum_{\bar{k}}\left(\prod_{l=1}^nc_{l,k_l}\right)$, where the second sum is over all $\bar{k}=(k_1,k_2,\ldots,k_n)$ with each $k_l$ in $\left\{1,2,\ldots ,m\right\}$. Here we have a product of sums with $m=n$. Therefore,
\begin{align} \det(AB) & = \sum_{\sigma\in S_n}(-1)^\sigma\sum_{\bar{k}}\left(\prod_{l=1}^nA_{l,k_l}B_{k_l,\sigma(l)}\right)\\ & = \sum_{\bar{k}}\sum_{\sigma\in S_n}(-1)^\sigma\left(\prod_{l=1}^nA_{l,k_l}B_{k_l,\sigma(l)}\right) \\ \end{align}
At this point, there are two types of $\bar{k}$ to consider. Remember, each $\bar{k}$ is an $n$-tuple of integers between $1$ and $n$. Some $n$-tuples have repeated entries, and some don't. If $\bar{k}$ has no repeated entries, it defines a permutation $\tau:\{1,2,\ldots , n\}\to\{1,2,\ldots , n\}$ which sends each $l$ to $k_l$.
Suppose $\bar{k}$ has a repeated entry: $k_p=k_q$. Then we can pair up terms in the inner sum to cancel each other out. Specifically, pair up each $\sigma$ with $\sigma\cdot(p\;q)$, where $(p\;q)$ is the transposition that swaps position $p$ with $q$. The contribution of these two terms to the inner sum is
\begin{align*} & \phantom{{}={}}\pm\left(\left(\prod_{l=1}^nA_{l,k_l}B_{k_l,\sigma(l)}\right)-\left(\prod_{l=1}^nA_{l,k_l}B_{k_l,\sigma((p\;q)l)}\right)\right)\\ &= \pm\left(\left(\prod_{l=1}^nA_{l,k_l}\right)\left(\prod_{l=1}^nB_{k_l,\sigma(l)}\right)-\left(\prod_{l=1}^nA_{l,k_l}\right)\left(\prod_{l=1}^nB_{k_l,\sigma((p\;q)l)}\right)\right)\\ &= \pm\left(\left(\prod_{l=1}^nA_{l,k_l}\right)\left(\prod_{l=1}^nB_{k_l,\sigma(l)}\right)-\left(\prod_{l=1}^nA_{l,k_l}\right)\left(\prod_{l'=1}^nB_{k_{l'},\sigma(l')}\right)\right) \end{align*}
where the final product has been reindexed with $l'=(p\;q)l$, and we have made use of the fact that $k_l=k_{l'}$ for all $l$. The overall difference is clearly zero. So in the earlier equation for $\det(AB)$, the only terms in the inner sum that need be considered are those where $\bar{k}$ defines a permutation $\tau$.
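This pairwise cancellation can be verified directly: for a tuple $\bar{k}$ with a repeated entry, the whole inner sum over $\sigma$ vanishes. A sketch with sample matrices of my own choosing:

```python
from itertools import permutations
from math import prod

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

n = 3
A = [[1, 2, 0], [3, 1, 1], [0, 2, 2]]
B = [[2, 0, 1], [1, 1, 0], [0, 3, 1]]

kbar = (0, 2, 0)  # repeated entry: k_1 = k_3 (0-indexed)
# inner sum over sigma for this fixed kbar; the sigma and sigma*(p q) terms cancel
inner = sum(sign(s) * prod(A[l][kbar[l]] * B[kbar[l]][s[l]] for l in range(n))
            for s in permutations(range(n)))
assert inner == 0
```

Equivalently, the inner sum factors as $(\prod_l A_{l,k_l})$ times the determinant of a matrix with two equal rows of $B$, which is zero.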
\begin{align*} \det(AB) & = \sum_{\tau\in S_n}\sum_{\sigma\in S_n}(-1)^\sigma\left(\prod_{l=1}^nA_{l,\tau(l)}B_{\tau(l),\sigma(l)}\right) \end{align*}
Reindexing the inner sum with $\sigma = \sigma'\tau$,
\begin{align*} \det(AB) & = \sum_{\tau\in S_n}\sum_{\sigma'\in S_n}(-1)^{\sigma'\tau}\left(\prod_{l=1}^nA_{l,\tau(l)}B_{\tau(l),\sigma'\tau(l)}\right) \\ & = \sum_{\tau\in S_n}\sum_{\sigma'\in S_n}(-1)^{\sigma'\tau}\left(\prod_{l=1}^nA_{l,\tau(l)}\right)\left(\prod_{l=1}^nB_{\tau(l),\sigma'\tau(l)}\right) \end{align*}
Reindexing the final product with $l'=\tau(l)$,
\begin{align*} & = \sum_{\tau\in S_n}\sum_{\sigma'\in S_n}(-1)^{\sigma'\tau}\left(\prod_{l=1}^nA_{l,\tau(l)}\right)\left(\prod_{l'=1}^nB_{l',\sigma'(l')}\right)\\ & = \left(\sum_{\tau\in S_n}(-1)^{\tau}\prod_{l=1}^nA_{l,\tau(l)}\right)\left(\sum_{\sigma'\in S_n}(-1)^{\sigma'}\prod_{l'=1}^nB_{l',\sigma'(l')}\right)\\ & = \det(A)\det(B) \end{align*}

+1 because $\det(AB)=\det(A)\det(B)$ is at heart a formal polynomial identity and this proof makes that explicit. – Ben Blum-Smith Aug 29 '11 at 03:22

Very nice proof, but I think it'd be WAY too tough for the typical beginning linear algebra student. – Mathemagician1234 Jun 22 '14 at 05:08

This isn't strictly an answer to the question because it is not a rigorous argument that $\det(AB)=\det(A)\det(B)$. But for me the idea I will share carries a lot of useful insight so I offer it in that spirit. It is based on the geometric interpretation of the determinant:
Interpreting $A$ as a linear transformation of $n$-dimensional space, $\det(A)$ is the effect of $A$ on $n$-volumes. More precisely, if a set $S$ has $n$-dimensional measure $k$, then the image set $A(S)$ has $n$-dimensional measure $\left|\det(A)\right|k$, i.e. $\left|\det(A)\right|$ times as big. The sign of $\det(A)$ tells you whether $A$ preserves or reverses orientation.
Examples:
Let $n=2$ so we are dealing with areas in the plane.
If $A$ is a rotation matrix, then its effect on the plane is a rotation. $\det(A)$ is $+1$ because $A$ actually preserves all areas (so absolute value $1$) and preserves orientation (so positive).
If $A$ has the form $kI$, $k$ positive, then $\det(A)$ is $k^2$. This is because the geometric effect of $A$ is a dilation by a factor of $k$, so $A$'s effect on area is to multiply it by $k^2$.
If $A$ has $1, -1$ on the main diagonal and zero elsewhere, then it corresponds to reflection in the $x$-axis. Here the determinant is $-1$ because though $A$ preserves areas, it reverses the orientation of the plane.
Once you buy this interpretation of the determinant, $\det(AB)=\det(A)\det(B)$ follows immediately because the whole point of matrix multiplication is that $AB$ corresponds to the composed linear transformation $A \circ B$. Looking at the magnitudes and the signs separately: $A \circ B$ scales volumes by $\left|\det(B)\right|$ and then again by $\left|\det(A)\right|$, so in total by $\left|\det(A)\right|\left|\det(B)\right|=\left|\det(A)\det(B)\right|$. I'll let you think about the signs and orientations.
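The three planar examples above can be written out concretely; a small sketch (the `det2` helper and the angle `t` are my own illustrative choices):

```python
import math

def det2(M):
    """Determinant of a 2x2 matrix given as nested lists."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

t = 0.7
rotation = [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]
assert abs(det2(rotation) - 1.0) < 1e-12   # area-preserving, orientation-preserving

k = 3
dilation = [[k, 0], [0, k]]                # kI scales areas by k^2
assert det2(dilation) == k * k

reflection = [[1, 0], [0, -1]]             # flip across the x-axis
assert det2(reflection) == -1              # area-preserving, orientation-reversing
```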
This argument becomes a rigorous proof via a proof of the geometric interpretation of the determinant. How to prove this would depend on what definition is being used for the determinant. If we defined the determinant as the effect of $A$ on $n$volumes (in the sense above), we would skip the need for this step. (We'd still have to prove that for a linear transformation the effect on $n$volume doesn't depend on the set; and to avoid circularity we'd need a way to define orientation that didn't depend on the determinant  in my experience many definitions of orientation do depend on it.) On the other hand, if we define the determinant in any of the more usual algebraic ways, we have something left to prove here. But I hope this way of looking at things is useful in any case.

Actually this proof can be made algebraic using exterior powers. This also works over any commutative ring $R$. If $A : R^n \to R^n$ is $R$-linear, then $\mathrm{det}(A)$ is the unique element of $R$ such that $\Lambda^n(A)$ scales $v_1 \wedge \dotsc \wedge v_n$ by $\mathrm{det}(A)$. Once this characterization is known (which follows easily from $\Lambda^n(R^n) \cong R$), the multiplicativity is immediate because $\Lambda^n(A \circ B) = \Lambda^n(A) \circ \Lambda^n(B)$. – Martin Brandenburg Jul 31 '14 at 15:18

I agree that in its 'simple' forms this is far from a proof, but this is by far my favorite way of _explaining_ why $\det(AB)=\det(A)\det(B)$, especially to graphics and game programmers (who are the main folks I talk to about determinants these days). – Steven Stadnicki Sep 08 '18 at 01:01
If $A$ is singular, then $\det(AB)$ and $\det(A)$ are both zero. So, we can suppose that $A$ is nonsingular. As $A$ can be row-reduced to the identity matrix, we can find a sequence of elementary matrices, $E_1, E_2,..., E_s$, such that
$E_1 E_2\cdots E_sA=I$
Hence,
$A = F_1 F_2\cdots F_s$
where $F_i$ is the inverse of $E_i$, and is also an elementary matrix. Now the proof reduces to showing that for an elementary matrix $E$ and an arbitrary matrix $M$, we have
$\det(EM) = \det(E)\det(M)$
It is true because
(i) if $E$ represents a row exchange, then $\det(E)=-1$ and $\det(EM) = -\det(M)$,
(ii) if $E$ represents the multiplication of the $i$th row by a nonzero constant $c$, then $\det(E)=c$ and $\det(EM) = c\det(M)$,
(iii) if $E$ represents adding $k$ times row $j$ to row $i$, then $\det(E)=1$ and $\det(EM)=\det(M)$.
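The three cases can be checked concretely. A sketch in pure Python; the `det`, `matmul`, and `identity` helpers are my own illustrations (a Leibniz-expansion determinant), not part of the answer:

```python
from itertools import permutations
from math import prod

def det(M):
    n = len(M)
    def sign(p):
        return (-1) ** sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n)) for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

M = [[2, 1, 3], [0, 1, 4], [5, 2, 1]]

# (i) row exchange: det(E) = -1
E = identity(3); E[0], E[1] = E[1], E[0]
assert det(E) == -1 and det(matmul(E, M)) == -det(M)

# (ii) scale a row by a nonzero constant c: det(E) = c
E = identity(3); E[1][1] = 7
assert det(E) == 7 and det(matmul(E, M)) == 7 * det(M)

# (iii) add k times one row to another: det(E) = 1
E = identity(3); E[2][0] = 5
assert det(E) == 1 and det(matmul(E, M)) == det(M)
```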

You have assumed $A$ is invertible, for which the rows become linearly independent and you can guarantee the existence of operations which transform the matrix into the identity matrix. You need to consider additionally the case when $\text{rank} A < n$ where $n$ is the size of the matrix. – IAmArjunSirStudent May 21 '21 at 03:07
(This is actually not an answer, but for some reason it seems I can't make comments. That lil window doesn't appear)
When both matrices are diagonal, it's just very easy to see that $\det(AB)=\det(A)\det(B)$,
since the diagonal entries multiply like scalars and the determinant is then the product of all of them. Keep in mind that the determinant doesn't change under a basis transformation.
Unfortunately, not all matrices with $\det(A)\neq 0$ are diagonalizable, so this is no proof. But it may give you a hint why things are the way they are.

Dear Konstantin, I assume you wanted to make a comment to my answer. Thanks. If you can't make comments for the time being, it's because you don't have enough reputation points. I gave you all the reputation points I could. I agree that not all matrices are diagonalizable. What I tried to explain in my answer is this: As the function $B\mapsto\det(AB)-\det(A)\det(B)$ is continuous, and the diagonalizable matrices are dense in all the matrices, we're done. – Pierre-Yves Gaillard Aug 29 '11 at 12:45

Thank you. Now it works. I wanted to provide an easy case. For me, such ideas are always helpful to memorize facts. – Konstantin Schubert Aug 29 '11 at 13:00

I admit not yet having understood your proof, but now I guess I got the idea: so $\det(AB)-\det(A)\det(B)$ equals zero on a dense subset of matrices $B$. Because it is continuous, it's zero everywhere. Cool! – Konstantin Schubert Aug 29 '11 at 13:07


This answer is unclear about _which_ matrices can be supposed to be diagonalized. You cannot assume _both_ $A$ and $B$ diagonalized, or you'd have only proved something for commuting matrices. But you can assume (only) $B$ diagonal, and then you can get the result from the multilinearity by columns of the determinant (actually just the multiplicative part of that). – Marc van Leeuwen Oct 09 '12 at 08:48

@MarcvanLeeuwen, this answer isn't really an answer as it doesn't aim to give a proof for the problem stated above. I just wanted to give an example for the case that both matrices A and B happen to be diagonalized. – Konstantin Schubert Oct 10 '12 at 01:00
Using the distributive identity $\prod_{a\in A}\sum_{b\in B_a}{h(a,b)}=\sum_{f}\prod_{a\in A}{h(a,f(a))}$, where $f$ runs over all choice functions with $f(a)\in B_a$, I can give this computational proof:
$$\det AB = \sum_{\sigma \in S_n}sgn(\sigma)\prod_{k=1}^n\sum_{i=1}^n{A_{ki}B_{i\sigma(k)}}$$ $$=\sum_{\sigma \in S_n}sgn(\sigma)\sum_{f:\mathbb{N}_n\to\mathbb{N}_n}\prod_{k=1}^n{A_{kf(k)}B_{f(k)\sigma(k)}}$$
$$=\sum_{f:\mathbb{N}_n\to\mathbb{N}_n}\prod_{k=1}^n{A_{kf(k)}}\sum_{\sigma \in S_n}sgn(\sigma)\prod_{k=1}^n{B_{f(k)\sigma(k)}}$$ $$=\sum_{f\in S_n}\prod_{k=1}^n{A_{kf(k)}}\sum_{\sigma \in S_n}sgn(\sigma)\prod_{k=1}^n{B_{f(k)\sigma(k)}}$$
Here the restriction from all functions $f$ to bijections is allowed because for a non-injective $f$ the inner sum is the determinant of a matrix with two equal rows, hence zero. For bijective $f$, reindexing the inner product by $m=f(k)$ and setting $\tau=\sigma\circ f^{-1}$ gives $\sum_{\sigma\in S_n}sgn(\sigma)\prod_{k=1}^n{B_{f(k)\sigma(k)}}=sgn(f)\det B$, so this
$$=\left(\sum_{f\in S_n}sgn(f)\prod_{k=1}^n{A_{kf(k)}}\right)\left( \sum_{\tau \in S_n}sgn(\tau)\prod_{k=1}^n{B_{k\tau(k)}}\right)$$ $$=\det A \det B$$
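The manipulation above can be verified mechanically. The following pure-Python sketch expands $\det(AB)$ over all functions $f$ and confirms that the non-injective $f$ contribute nothing; $A$ and $B$ are arbitrary integer matrices chosen for illustration:

```python
from itertools import permutations, product
from math import prod

def sign(p):
    # sign of a permutation: (-1) to the number of inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(M):
    # Leibniz formula for the determinant
    n = len(M)
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2, 0], [3, 1, 4], [2, 0, 1]]   # arbitrary example matrices
B = [[2, 1, 1], [0, 3, 2], [1, 1, 5]]
n = 3

# expand det(AB) over ALL functions f : {0..n-1} -> {0..n-1}
expanded = 0
for f in product(range(n), repeat=n):
    # inner sum = determinant of the matrix whose k-th row is row f(k) of B
    inner = sum(sign(s) * prod(B[f[k]][s[k]] for k in range(n))
                for s in permutations(range(n)))
    if len(set(f)) < n:
        # a repeated row forces the inner determinant to vanish
        assert inner == 0
    expanded += prod(A[k][f[k]] for k in range(n)) * inner
```

Only the $n!$ bijective choices of $f$ survive out of the $n^n$ terms, which is exactly the step from the sum over all functions to the sum over $S_n$.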
Friedberg's Linear Algebra offers an axiomatic approach to determinants by showing how to characterize the determinant in terms of three key properties.
Definition. A function $\delta:M_{n\times n}(F)\to F$ is called an $n$-linear function if it is a linear function of each row of an $n\times n$ matrix when the remaining $n-1$ rows are held fixed. $\delta$ is called alternating if, for each $A\in M_{n\times n}(F)$, we have $\delta(A)=0$ whenever two adjacent rows of $A$ are identical.
Theorem. If $\delta:M_{n\times n}(F)\to F$ is an alternating $n$-linear function such that $\delta(I)=1$, then $\delta(A)=\det(A)$ for every $A\in M_{n\times n}(F)$.
To summarize, the three properties are:
 The determinant of the identity matrix is $1$.
 The determinant changes sign when two rows are exchanged.
 The determinant depends linearly on the first row.
Now we sketch the proof of $\det(AB)=\det(A)\det(B)$.
Assume that $B$ is nonsingular; otherwise $AB$ is singular, and both sides of $\det(AB)=\det(A)\det(B)$ are zero, so the equation is easily verified. The key point is to prove that the ratio $d(A)=\det(AB)/\det(B)$ has the three properties above. Then $d(A)$ must equal $\det(A)$.
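The three properties of the ratio $d(A)=\det(AB)/\det(B)$ can be checked numerically with exact rational arithmetic. A sketch, where the fixed nonsingular $B$ and the test rows are arbitrary example choices:

```python
from fractions import Fraction
from itertools import permutations
from math import prod

def sign(p):
    # sign of a permutation: (-1) to the number of inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(M):
    # Leibniz formula for the determinant
    n = len(M)
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

B = [[2, 1, 0], [0, 3, 1], [1, 0, 4]]   # a fixed nonsingular matrix (det(B) = 25)

def d(A):
    # the ratio det(AB)/det(B); the claim is that it satisfies the three axioms in A
    return Fraction(det(matmul(A, B)), det(B))

I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
A = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]
A_swap = [A[1], A[0], A[2]]                          # exchange the first two rows
r, s = [1, 1, 0], [0, 2, 3]
A_r  = [r, A[1], A[2]]
A_s  = [s, A[1], A[2]]
A_rs = [[x + y for x, y in zip(r, s)], A[1], A[2]]   # first row replaced by r + s
```

With `Fraction` the checks `d(I) == 1`, `d(A_swap) == -d(A)`, and `d(A_rs) == d(A_r) + d(A_s)` are exact, not floating-point approximations.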
Physics-oriented people might like to prove this statement as follows, using Einstein summation convention and the Levi-Civita symbol $\varepsilon$: For any matrix $A=\{A_j^i\}_{1\leq i,j\leq n}$ (with entries from a commutative ring), we note that the tensor $$\pi_{j_1j_2...j_n}:= \varepsilon_{i_1...i_n}A_{j_1}^{i_1}...A_{j_n}^{i_n}$$ is antisymmetric w.r.t. any transposition of a pair of its indices, hence $\pi \propto\varepsilon$ and we define the determinant of $A$ as the proportionality constant between $\pi$ and $\varepsilon$. Then $\varepsilon_{i_1...i_n}A_{j_1}^{i_1}...A_{j_n}^{i_n}=(\det A)\varepsilon_{j_1...j_n}$ holds by definition. Consider now the matrix product $AB=\{A_j^iB_{i}^k\}_{1\leq j,k \leq n}$. We have $$\det(AB)\varepsilon_{j_1...j_n}=\varepsilon_{k_1...k_n}(A_{j_1}^{i_1}B_{i_1}^{k_1})...(A_{j_n}^{i_n}B_{i_n}^{k_n})=(\varepsilon_{k_1...k_n}B_{i_1}^{k_1}...B_{i_n}^{k_n})(A_{j_1}^{i_1}...A_{j_n}^{i_n})=(\det B)\varepsilon_{i_1...i_n}(A_{j_1}^{i_1}...A_{j_n}^{i_n})=(\det B)(\varepsilon_{i_1...i_n}A_{j_1}^{i_1}...A_{j_n}^{i_n})=(\det B)(\det A)\varepsilon_{j_1...j_n}.$$ Therefore $\det(AB)=\det(A)\det(B)$.
Elementary proof of $\det(AB) = \det(A)\det(B)$ for square matrices:
Any square matrix admits a factorization $A = PLU$; this can be easily shown using elementary row operations. Here $P$ is a permutation matrix, $L$ a lower triangular matrix, and $U$ an upper triangular matrix.
Using elementary row operations:
$ \det (PB) = \det (B) \det (P) $,
$\det (LB) = \det (L) \det (B) $,
$\det (UB) = \det (U) \det (B) $
Regarding the last two identities: you can obtain $LB$ by performing row additions to the matrix $G=TB$, where $T$ is a diagonal matrix such that $T_{ii} = L_{ii}$.
Starting from the last row, take each row of $G$ and add multiples of the rows above it to obtain the rows of $LB$. This implies $\det(LB) = \det(TB) = \det(T)\det(B) = \det(L)\det(B)$.
A similar trick works for $UB$, by adding multiples of the rows below a given row, starting from the first row.
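Here is a quick numerical check of the three identities (pure-Python Leibniz determinant; the matrices $P$, $L$, $U$, $B$ below are arbitrary example choices):

```python
from itertools import permutations
from math import prod

def sign(p):
    # sign of a permutation: (-1) to the number of inversions
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def det(M):
    # Leibniz formula for the determinant
    n = len(M)
    return sum(sign(p) * prod(M[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

P = [[0, 0, 1], [1, 0, 0], [0, 1, 0]]   # permutation matrix (a 3-cycle, det = 1)
L = [[2, 0, 0], [1, 3, 0], [4, 5, 6]]   # lower triangular, det = 2*3*6 = 36
U = [[1, 2, 3], [0, 4, 5], [0, 0, 2]]   # upper triangular, det = 1*4*2 = 8
B = [[1, 2, 0], [3, 1, 4], [2, 0, 1]]   # arbitrary matrix

for X in (P, L, U):
    assert det(matmul(X, B)) == det(X) * det(B)

A = matmul(P, matmul(L, U))             # a matrix with known PLU factorization
assert det(A) == det(P) * det(L) * det(U)
```

Chaining the three identities over $A=PLU$ gives $\det(AB)=\det(P)\det(L)\det(U)\det(B)=\det(A)\det(B)$, which is exactly what the final assertion exercises for $B=I$.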