
In my linear algebra class, we just talked about determinants. So far I’ve been understanding the material okay, but now I’m very confused. I get that when the determinant is zero, the matrix doesn’t have an inverse. I can find the determinant of a $2\times 2$ matrix by the formula. Our teacher showed us how to compute the determinant of an $n \times n$ matrix by breaking it up into the determinants of smaller matrices. Apparently there is also a way to compute it by summing over a bunch of permutations. But the notation is really hard for me and I don’t really know what’s going on with those sums anymore. Can someone help me figure out what a determinant is, intuitively, and how all those definitions of it are related?

Simon Fraser
Jamie Banks
  • I just wanted this question to be in the archives, because it's a perennial one that admits a better response than is in most textbooks. – Jamie Banks Jul 25 '10 at 02:26
  • Hehe, you're going against your own suggestion of asking questions that you actually want answered!! I'm teasing though, I understand your motivation. Can we set a precedent of making seeded questions CW? I kinda like that idea, I will propose it on Meta. I am rambling. – BBischof Jul 25 '10 at 02:34
  • In case somebody was curious about the trace form of this question, it is more difficult, and is the subject of one of my MO questions :D http://mathoverflow.net/questions/13526/geometric-interpretation-of-trace – BBischof Jul 25 '10 at 02:36
  • @BBischof, see the meta thread for CW discussion – Jamie Banks Jul 25 '10 at 06:10
  • In the end, I hope this didn't come across as me hating on your question... Yet somehow, I feel like that has happened. – BBischof Jul 25 '10 at 19:17
  • no, I didn't take it that way at all... :) – Jamie Banks Jul 25 '10 at 19:20
  • I'm confused - did Katie Banks answer her own question? – user1729 Nov 14 '11 at 15:24
  • Those reading this forum in '16 may be interested in this video [The Determinant](https://www.youtube.com/watch?v=Ip3X9LOh2dk) which is part of a set of videos that give some very nice insight into intuitive understanding of linear algebra (the essence of it rather) – Hugh Entwistle Oct 11 '16 at 06:43
  • An awesome resource that I used to motivate my understanding of the determinant: https://www.askamathematician.com/2013/05/q-why-are-determinants-defined-the-weird-way-they-are/ – D.R. Mar 02 '19 at 04:38

17 Answers


Your trouble with determinants is pretty common. They’re a hard thing to teach well, too, for two main reasons that I can see: the formulas you learn for computing them are messy and complicated, and there’s no “natural” way to interpret the value of the determinant, the way it’s easy to interpret the derivatives you do in calculus at first as the slope of the tangent line. It’s hard to believe things like the invertibility condition you’ve stated when it’s not even clear what the numbers mean and where they come from.

Rather than show that the many usual definitions are all the same by comparing them to each other, I’m going to state some general properties of the determinant that I claim are enough to specify uniquely what number you should get when you put in a given matrix. Then it’s not too bad to check that all of the definitions for determinant that you’ve seen satisfy those properties I’ll state.

The first thing to think about if you want an “abstract” definition of the determinant to unify all those others is that it’s not an array of numbers with bars on the side. What we’re really looking for is a function that takes N vectors (the N columns of the matrix) and returns a number. Let’s assume we’re working with real numbers for now.

Remember how those operations you mentioned change the value of the determinant?

  1. Switching two rows or columns changes the sign.

  2. Multiplying one row by a constant multiplies the whole determinant by that constant.

  3. The general fact that number two draws from: the determinant is linear in each row. That is, if you think of it as a function $\det: (\mathbb{R}^{n})^n \rightarrow \mathbb{R}$ of the $n$ row vectors, then $$ \det(a \vec v_1 +b \vec w_1 , \vec v_2 ,\ldots,\vec v_n ) = a \det(\vec v_1,\vec v_2,\ldots,\vec v_n) + b \det(\vec w_1, \vec v_2, \ldots,\vec v_n),$$ and the corresponding condition in each other slot.

  4. The determinant of the identity matrix $I$ is $1$.

I claim that these facts are enough to define a unique function that takes in N vectors (each of length N) and returns a real number, the determinant of the matrix given by those vectors. I won’t prove that, but I’ll show you how it helps with some other interpretations of the determinant.
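
If you want to see these properties in action, here is a quick numerical sanity check with NumPy (a minimal sketch; `np.linalg.det` is the library routine, not the definition, and the matrices are random):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3))      # three random row vectors

# 1. Swapping two rows changes the sign.
assert np.isclose(np.linalg.det(A[[1, 0, 2], :]), -np.linalg.det(A))

# 2. Multiplying one row by a constant multiplies the determinant by it.
C = A.copy()
C[0] *= 5.0
assert np.isclose(np.linalg.det(C), 5.0 * np.linalg.det(A))

# 3. Linearity in the first row, the other rows held fixed.
v, w = rng.standard_normal(3), rng.standard_normal(3)
a, b = 2.0, -3.0
Avw, Av, Aw = A.copy(), A.copy(), A.copy()
Avw[0] = a * v + b * w
Av[0] = v
Aw[0] = w
assert np.isclose(np.linalg.det(Avw),
                  a * np.linalg.det(Av) + b * np.linalg.det(Aw))

# 4. The identity matrix has determinant 1.
assert np.isclose(np.linalg.det(np.eye(3)), 1.0)
print("all four properties hold numerically")
```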

In particular, there’s a nice geometric way to think of a determinant. Consider the unit cube in N dimensional space: the box spanned by the N standard basis vectors, each of which has a single coordinate equal to 1 and the rest 0. The determinant of the linear transformation (matrix) T is the signed volume of the region gotten by applying T to the unit cube. (Don’t worry too much if you don’t know what the “signed” part means, for now).

How does that follow from our abstract definition?

Well, if you apply the identity to the unit cube, you get back the unit cube. And the volume of the unit cube is 1.

If you stretch the cube by a constant factor in one direction only, the new volume is that constant. And if you stack two blocks together aligned on the same direction, their combined volume is the sum of their volumes: this all shows that the signed volume we have is linear in each coordinate when considered as a function of the input vectors.

Finally, when you switch two of the vectors that define the unit cube, you flip the orientation. (Again, this is something to come back to later if you don’t know what that means).

So there are ways to think about the determinant that aren’t symbol-pushing. If you’ve studied multivariable calculus, you could think about, with this geometric definition of determinant, why determinants (the Jacobian) pop up when we change coordinates doing integration. Hint: a derivative is a linear approximation of the associated function, and consider a “differential volume element” in your starting coordinate system.

It’s not too much work to check that the area of the parallelogram formed by vectors $(a,b)$ and $(c,d)$ is $\begin{vmatrix} a & b \\ c & d \end{vmatrix}$ either: you might try that to get a sense for things.
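
For instance, with $(a,b)=(3,1)$ and $(c,d)=(1.5,2)$ (arbitrarily chosen numbers), here is a short NumPy sketch comparing base times height against $ad-bc$:

```python
import numpy as np

u = np.array([3.0, 1.0])    # (a, b)
v = np.array([1.5, 2.0])    # (c, d)

det = u[0] * v[1] - u[1] * v[0]          # ad - bc

# Area of the parallelogram spanned by u and v: base * height,
# where the height is the component of v perpendicular to u.
base = np.linalg.norm(u)
height = np.linalg.norm(v - (v @ u) / (u @ u) * u)

print(base * height, abs(det))           # both 4.5
```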

Saeed
Jamie Banks
  • Great answer. We were taught the determinant as the generalized volume function in our algebra class. – Neil G Aug 28 '10 at 09:26
  • I hope you don't mind but I corrected a small typo in the third property of the determinant and added some Latex to make the identity a little bit easier to read. – Adrián Barquero Nov 14 '10 at 18:38
  • Nicely done. We should all keep this in mind when we teach determinants. – Chris Leary Nov 11 '11 at 19:30
  • Just out of curiosity, *who are you talking to* with the first sentence? Didn't *you* ask the question?!? Either way, I point students to this Q (&A) all the time for determinant help. – The Chaz 2.0 Apr 18 '12 at 02:18
  • @TheChaz this question was asked near the beginnings of Math.SE, when there was a need to populate the site with questions before it was opened up to 'the public'. In any case, answering your own questions is [explicitly encouraged](http://blog.stackoverflow.com/2012/05/encyclopedia-stack-exchange/) nowadays. – Chris Taylor May 24 '12 at 07:51
  • To see that the geometric interpretation (volume of the image of the cube) satisfies the multilinearity property, it is not enough to stack two "aligned" blocks; that is again the multiplication of a column by a scalar. You should deal with two "not aligned" blocks produced by changing a single column vector, and see that the sum of their volumes is the volume of the block obtained by putting in the sum of the two vectors. This is not so easy to catch... – Emanuele Paolini Feb 20 '13 at 09:19
  • I want to share [this](http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/lecture-18-properties-of-determinants/) link. It gives a really good answer to this question, in my opinion. – aeyalcinoglu Aug 22 '13 at 13:41
  • Is there anyone who automatically does understand the determinant immediately upon learning it?? :\ – user124384 Apr 17 '14 at 01:45
  • Why is axiom (2) included? (3) renders it redundant... – Optional Jun 18 '14 at 03:57
  • @JamieBanks Is there a proof that those properties are enough? You said you wouldn't prove it here, but I'd like to see one if it exists. – Stan Shunpike Feb 18 '15 at 06:27
  • I have a question, under point 4. It says that this function is unique. In this context, what do we mean by unique? – Terrence J Apr 26 '16 at 10:00
  • I've learned or relearned different parts of linear algebra from six to ten different textbooks at different times in my life--I'll never remember how to compute a determinant if I haven't done so recently--and I have never seen this explanation of its meaning. Finally! – Mars Jun 13 '16 at 18:00
  • @StanShunpike: The idea behind the proof of uniqueness is the following. Write your input vectors $v_i$ in a fixed basis $e_1, \ldots, e_n$. Using (3), expand everything as much as possible till you get a sum of det's whose inputs are just the basis vectors. Using (1), $\det(e_1, e_1) = 0$ and similarly, so the n inputs to each of these dets are distinct basis vectors in some order. Again using (1), reorder them so they're each $\det(e_1, e_2, \ldots, e_n)$, which is 1 by (4). This forces the value of $\det$ on $v_1, \ldots, v_n$. – Joshua P. Swanson Sep 18 '16 at 06:33
  • The sign and orientation are not hard to motivate. Compute $\det(ce_1, e_2, e_3)$ pictorially as c varies from 1 to -1. The volume is $|c|$ in each case, and to make this process smooth we need to pick up a negative for $c<0$. You can rigidly rotate $(-e_1, e_2, e_3)$ into $(e_2, e_1, e_3)$, which reasonably must preserve $\det$, so $\det(e_2, e_1, e_3) = -1$. The fact that you can make these sorts of choices consistently (and largely without actually making choices) is at first blush a minor miracle, but anyway that's not an issue of motivation. – Joshua P. Swanson Sep 18 '16 at 06:38
  • Can anyone explain to me the multilinearity of the volume of parallelepipeds? – Mahbub Alam May 04 '17 at 12:23
  • To be blunt, if the question was coming from a real questioner, this answer would not help. You've introduced more likely alien notation ($\mathbb{R}^{n^2}$), given a geometric interpretation that is similar to the one in most texts (area of a parallelogram) and that doesn't give any intuition at all. After reading this answer I have no better idea how someone would have plausibly come to the notion of a determinant in the first place. A bunch of people that already understand the concept appreciating the answer is a poor test for deciding the educational value of an explanation. – Joseph Garvin Sep 10 '17 at 00:21
  • In number 2 of those general properties, doesn’t dividing a row by a constant multiply the determinant by that constant? – tyobrien Apr 13 '18 at 00:54
  • Question: is it equivalent to require that the vector $ w_1$ in property number 3) is $w_1 = v_k$ for some $k$ with $1\leq k \leq n$? Allowing it to be *any* vector seems like a huge constraint on the determinant. – Markus Klyver Jan 01 '19 at 02:32
  • Emil Artin, *Galois Theory* (second edition 1944, reprinted by Dover 1998), pages 11-20, characterises the determinant function in a similar way, and proves that it exists and is unique. – Calum Gilhooley Mar 10 '19 at 20:03
  • @ChrisTaylor While answering your own question is OK and encouraged, I don't think it is the same for questioning your own answer (I have this cool explanation to give, now if only somebody would set me up with a corresponding question). Especially, the naivety of the question ("in my class we just talked about determinants", suggesting the perspective of a student rather than a teacher) is clearly feigned and contradicted by an immediate answer by OP. – Marc van Leeuwen Nov 04 '19 at 08:21
  • This answer is confused about the roles of rows and columns. If you are viewing (in fact specifying) the determinant as a function of the columns of a matrix, then you should talk about multi-linear and alternating conditions _in the columns_ only. You will get the other kind of multi-linearity for free (once one deduces invariance under transposition) but it should not be part of the specification. In fact (3) talks about linearity by rows, but (if I interpret the notation with respect to what was said earlier) specifies linearity by columns. Hard to make anything definite out of it. – Marc van Leeuwen Nov 04 '19 at 08:29
  • @MarcvanLeeuwen I honestly don't see a problem with it, especially given that the question is one that many people have struggled with and the answer is clear and helpful. Quoting from the blog post I linked to (emphasis mine), Stack Exchange is "not just a Q&A platform: it’s also a place where you can publish things that you’ve learned: recipes, FAQs, HOWTOs, walkthroughs, and even bits of product documentation, **as long you format it as a question and answer**." (By the way, do you realize that the comment you're replying to is seven and a half years old?) – Chris Taylor Nov 04 '19 at 09:16

You could think of a determinant as a volume. Think of the columns of the matrix as vectors at the origin forming the edges of a skewed box. The determinant gives the volume of that box. For example, in 2 dimensions, the columns of the matrix are the edges of a parallelogram.

You can derive the algebraic properties from this geometrical interpretation. For example, if two of the columns are linearly dependent, your box is missing a dimension and so it's been flattened to have zero volume.
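
Here is a quick NumPy illustration of that picture (a sketch with made-up edge vectors): the determinant agrees with the scalar triple product, i.e. the signed volume of the box, and a flattened box gets volume zero.

```python
import numpy as np

# The columns of M are the edges of a skewed box at the origin.
u = np.array([1.0, 0.0, 0.0])
v = np.array([0.5, 2.0, 0.0])
w = np.array([0.0, 0.3, 1.5])
M = np.column_stack([u, v, w])

# Signed volume of the box is the scalar triple product u . (v x w).
signed_volume = u @ np.cross(v, w)
print(np.linalg.det(M), signed_volume)   # both 3.0

# Linearly dependent columns: the box is flattened, zero volume.
M2 = np.column_stack([u, v, u + v])
print(np.linalg.det(M2))                 # 0.0
```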

MJD
John D. Cook
  • If I may, I would add to this answer (which I think is a very good one) in two minor aspects. First, a determinant also has a sign, so we want the concept of oriented volume. (This is somewhat tricky, but definitely important, so you might as well have it in mind when you're learning about "right hand rules" and such.) Second, I think better than a volume is thinking of the determinant as the multiplicative change in volume of a parallelepiped under the linear transformation. (Of course you can always take the first one to be the unit n-cube and say that you are just dividing by one.) – Pete L. Clark Jul 28 '10 at 20:08
  • +1: I like this answer because there is a direct link to some application in physics: in special relativity we are talking of the conservation of space-time-*volume*, which means that the determinant of the transformation matrix is constant 1. – vonjd Jan 16 '11 at 10:22
  • I'm ten years late, but [here](https://www.youtube.com/watch?v=Ip3X9LOh2dk) is a video by 3blue1brown on the determinant which uses the same geometric interpretation. – Jonas Nov 29 '20 at 11:49

In addition to the answers above, the determinant is a function from the set of square matrices into the real numbers that preserves the operation of multiplication: $$\det(AB) = \det(A)\det(B)$$ and so it carries *some* information about square matrices into the much more familiar set of real numbers.

Some examples:

The determinant function maps the identity matrix $I$ to the identity element of the real numbers ($\det(I) = 1$.)

Which real number does not have a multiplicative inverse? The number 0. So which square matrices do not have multiplicative inverses? Those which are mapped to 0 by the determinant function.

What is the determinant of the inverse of a matrix? The inverse of the determinant, of course. (Etc.)

This "operation preserving" property of the determinant explains some of the value of the determinant function and provides a certain level of "intuition" for me in working with matrices.

Mars
KenWSmith
  • +1 for including the *questions*. Many of them. Good ones. Especially the "So which square matrices do *not* have multiplicative inverses?" pair. And for featuring a nice doggy in your portrait! – n611x007 Jan 30 '13 at 17:40
  • It is actually the universal such function, in the sense that every other such function is a composition of the determinant with a map $\mathbb{F} \to \mathbb{F}$. – sss89 Nov 17 '21 at 15:47

Here is a recording of my lecture on the geometric definition of determinants:

Geometric definition of determinants

It has elements from the answers by Jamie Banks and John Cook, and goes into details in a leisurely manner.

Amritanshu Prasad
  • This should be higher IMO. I think to have an intuition (geometrical understanding) of the determinant, you need a geometrical understanding of matrices first. – jds Nov 01 '18 at 14:03
  • @Amritanshu, can you please see https://math.stackexchange.com/questions/4154766/how-to-find-the-sign-of-the-determinant/4155057?noredirect=1#comment8603768_4155057 and give any hint to find the sign of the det using the geometric definition of determinant. Thanks. – prince Jun 04 '21 at 11:01

I too find the way determinants are treated in exterior algebra most intuitive. The definition is given on page 46 of Landsberg's "Tensors: Geometry and Applications". Two examples below will tell you everything you need to know.

Say you are given a matrix $$A=\begin{pmatrix}a&b\\c&d\end{pmatrix}$$ and asked to compute its determinant. You can think of the matrix as a linear operator $f:\mathbb R^2\to\mathbb R^2$ defined by

$$\begin{pmatrix}x\\y\end{pmatrix}\mapsto\begin{pmatrix}a&b\\c&d\end{pmatrix}\begin{pmatrix}x\\y\end{pmatrix}.$$

If you define the standard basis vectors by $e_1=\begin{pmatrix}1\\0\end{pmatrix}$ and $e_2=\begin{pmatrix}0\\1\end{pmatrix}$, you can then define $f$ by the values it assumes on the basis vectors: $f(e_1)=ae_1+ce_2$ and $f(e_2)=be_1+de_2$.

The linear operator $f$ is extended to bivectors by $$f(e_1\wedge e_2)=f(e_1)\wedge f(e_2).$$

Then you can write

$$f(e_1\wedge e_2)=(ae_1+ce_2)\wedge(be_1+de_2)=(ad-bc)e_1\wedge e_2,$$

where I used distributivity and anticommutativity of the wedge product (the latter implies $a\wedge a=0$ for any vector $a$). So, we get the determinant as a scalar factor in the above equation, that is

$$f(e_1\wedge e_2)=\det(A)\,e_1\wedge e_2.$$

The same procedure works for $3\times3$ matrices; you just need to use a trivector. Say you are given $$B=\begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix}.$$

It defines a linear operator $g:\mathbb R^3\to \mathbb R^3$

$$\begin{pmatrix}x\\y\\z\end{pmatrix}\mapsto \begin{pmatrix}a_{11}&a_{12}&a_{13}\\a_{21}&a_{22}&a_{23}\\a_{31}&a_{32}&a_{33}\end{pmatrix} \begin{pmatrix}x\\y\\z\end{pmatrix},$$

for which we have

$$g(e_1)=a_{11}e_1+a_{21}e_2+a_{31}e_3,\quad g(e_2)=a_{12}e_1+a_{22}e_2+a_{32}e_3,\quad g(e_3)=a_{13}e_1+a_{23}e_2+a_{33}e_3$$

on the standard basis $e_1=\begin{pmatrix}1\\0\\0\end{pmatrix}$, $e_2=\begin{pmatrix}0\\1\\0\end{pmatrix}$, $e_3=\begin{pmatrix}0\\0\\1\end{pmatrix}$. The operator $g$ is extended to trivectors by $$g(e_1\wedge e_2\wedge e_3)=g(e_1)\wedge g(e_2)\wedge g(e_3),$$

which gives

$$g(e_1\wedge e_2\wedge e_3)=(a_{11}e_1+a_{21}e_2+a_{31}e_3)\wedge(a_{12}e_1+a_{22}e_2+a_{32}e_3)\wedge(a_{13}e_1+a_{23}e_2+a_{33}e_3).$$

If you then follow the rules of $\wedge$ such as distributivity, anticommutativity, and associativity, you get $$g(e_1\wedge e_2\wedge e_3)=\det(B)\, e_1\wedge e_2\wedge e_3.$$

It works in exactly the same way in higher dimensions.
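
If you want to see this mechanically, here is a small Python sketch (an illustration of the idea, not from Landsberg's book) that stores a blade as a dictionary from sorted index tuples to coefficients, implements $\wedge$ with the sign coming from sorting, and reads off the determinant as the coefficient of $e_1\wedge\cdots\wedge e_n$:

```python
import numpy as np

def wedge(x, y):
    """Wedge product of two blades, each stored as
    {sorted index tuple: coefficient}."""
    out = {}
    for ix, cx in x.items():
        for iy, cy in y.items():
            idx = list(ix) + list(iy)
            if len(set(idx)) < len(idx):
                continue          # repeated basis vector: e ^ e = 0
            sign = 1              # sign of the permutation that sorts idx
            for i in range(len(idx)):
                for j in range(i + 1, len(idx)):
                    if idx[i] > idx[j]:
                        sign = -sign
            key = tuple(sorted(idx))
            out[key] = out.get(key, 0.0) + sign * cx * cy
    return out

def det_via_wedge(A):
    n = A.shape[0]
    # f(e_j) is the j-th column of A, written as a 1-blade.
    cols = [{(i,): A[i, j] for i in range(n)} for j in range(n)]
    top = cols[0]
    for c in cols[1:]:
        top = wedge(top, c)
    # f(e_1 ^ ... ^ e_n) = det(A) e_1 ^ ... ^ e_n
    return top.get(tuple(range(n)), 0.0)

A = np.array([[1.0, 2.0, 3.0],
              [0.0, 4.0, 5.0],
              [1.0, 0.0, 6.0]])
print(det_via_wedge(A), np.linalg.det(A))   # both 22.0
```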

Andrey Sokolov
  • Beautiful answer. As someone who doesn't know any exterior algebra, this also serves as a motivation for the wedge product. – 6005 Nov 15 '16 at 01:37

For the record I'll try to give a reply to this old question, since I think some elements can be added to what has been already said.

Even though they are basically just (complicated) expressions, determinants can be mysterious when first encountered. Questions that arise naturally are: (1) how are they defined in general? (2) what are their important properties? (3) why do they exist? (4) why should we care? and (5) why does their expression get so huge for large matrices?

Since $2\times2$ and $3\times3$ determinants are easily defined explicitly, question (1) can wait. While (2) has many answers, the most important ones are, to me: determinants detect (by becoming 0) the linear dependence of $n$ vectors in dimension $n$, and they are an expression in the coordinates of those vectors (rather than for instance an algorithm). If you have a family of vectors that depend (or at least one of them depends) on a parameter, and you need to know for which parameter values they are linearly dependent, then trying to use for instance Gaussian elimination to detect linear dependence can run into trouble: one might need assumptions on the parameter to assure some coefficient is nonzero, and even then dividing by it gives very messy expressions. Provided the number of vectors equals the dimension $n$ of the space, taking a determinant will however immediately transform the question into an equation for the parameter (which one may or may not be capable of solving, but that is another matter). This is exactly how one obtains an equation in eigenvalue problems, in case you've seen those. This provides a first answer to (4). (But there is a lot more you can do with determinants once you get used to them.)

As for question (3), the mystery of why determinants exist in the first place can be reduced by considering the situation where one has $n-1$ given linearly independent vectors, and asks when a final unknown vector $\vec x$ will remain independent from them, in terms of its coordinates. The answer is that it usually will, in fact always unless $\vec x$ happens to be in the linear span $S$ of those $n-1$ vectors, which is a subspace of dimension $n-1$. For instance, if $n=2$ (with one vector $\vec v$ given) the answer is "unless $\vec x$ is a scalar multiple of $\vec v$". Now if one imagines a fixed (nonzero) linear combination of the coordinates of $\vec x$ (the technical term is a linear form on the space), then it will become $0$ precisely when $\vec x$ is in some subspace of dimension $n-1$. With some luck, this can be arranged to be precisely the linear span $S$. (In fact no luck is involved: if one extends the $n-1$ vectors by one more vector to a basis, then expressing $\vec x$ in that basis and taking its final coordinate will define such a linear form; however you can ignore this argument unless you are particularly suspicious.) Now the crucial observation is that not only does such a linear combination exist, its coefficients can be taken to be expressions in the coordinates of our $n-1$ vectors. For instance in the case $n=2$ if one puts $\vec v={a\choose b}$ and $\vec x={x_1\choose x_2}$, then the linear combination $-bx_1+ax_2$ does the job (it becomes 0 precisely when $\vec x$ is a scalar multiple of $\vec v$), and $-b$ and $a$ are clearly expressions in the coordinates of $\vec v$. In fact they are linear expressions. For $n=3$ with two given vectors, the expressions for the coefficients of the linear combination are more complicated, but they can still be explicitly written down (each coefficient is the difference of two products of coordinates, one from each vector). These expressions are linear in each of the vectors, if the other one is fixed.

Thus one arrives at the notion of a multilinear expression (or form). The determinant is in fact a multilinear form: an expression that depends on $n$ vectors, and is linear in each of them taken individually (fixing the other vectors to arbitrary values). This means it is a sum of terms, each of which is the product of a coefficient, and of one coordinate each of all the $n$ vectors. But even ignoring the coefficients, there are many such terms possible: a whopping $n^n$ of them!

However, we want an expression that becomes $0$ when the vectors are linearly dependent. Now the magic (sort of) is that even the seemingly much weaker requirement that the expression becomes $0$ when two successive vectors among the $n$ are equal will assure this, and it will moreover almost force the form of our expression upon us. Multilinear forms that satisfy this requirement are called alternating. I'll skip the (easy) arguments, but an alternating form cannot involve terms that take the same coordinate of any two different vectors, and they must change sign whenever one interchanges the role of two vectors (in particular they cannot be symmetric with respect to the vectors, even though the notion of linear dependence is symmetric; note that already $-bx_1+ax_2$ is not symmetric with respect to interchange of $(a,b)$ and $(x_1,x_2)$). Thus any one term must involve each of the $n$ coordinates once, but not necessarily in order: it applies a permutation of the coordinates $1,2,\ldots,n$ to the successive vectors. Moreover, if a term involves one such permutation, then any term obtained by interchanging two positions in the permutation must also occur, with an opposite coefficient. But any two permutations can be transformed into one another by repeatedly interchanging two positions; so if there are any terms at all, then there must be terms for all $n!$ permutations, and their coefficients are all equal or opposite. This explains question (5), why the determinant is such a huge expression when $n$ is large.

Finally the fact that determinants exist turns out to be directly related to the fact that signs can be associated to all permutations in such a way that interchanging entries always changes the sign, which is part of the answer to question (3). As for question (1), we can now say that the determinant is uniquely determined by being an $n$-linear alternating expression in the entries of $n$ column vectors, which contains a term consisting of the product of their coordinates $1,2,\ldots,n$ in that order (the diagonal term) with coefficient $+1$. The explicit expression is a sum over all $n!$ permutations, the corresponding term being obtained by applying those coordinates in permuted order, and with the sign of the permutation as coefficient. A lot more can be said about question (2), but I'll stop here.
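
That explicit sum is easy to transcribe into code; here is a short Python sketch (illustrative only: with $n!$ terms it is hopeless for large $n$):

```python
import numpy as np
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation, computed from its inversion count."""
    s = 1
    for i in range(len(p)):
        for j in range(i + 1, len(p)):
            if p[i] > p[j]:
                s = -s
    return s

def leibniz_det(A):
    """Sum over all n! permutations of signed products, one entry per
    column, with the rows taken in permuted order."""
    n = A.shape[0]
    return sum(perm_sign(p) * np.prod([A[p[j], j] for j in range(n)])
               for p in permutations(range(n)))

A = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 4.0]])
print(leibniz_det(A), np.linalg.det(A))   # both 18.0
```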

Marc van Leeuwen

The top exterior power of an $n$-dimensional vector space $V$ is one-dimensional. Its elements are sometimes called pseudoscalars, and they represent oriented $n$-dimensional volume elements.

A linear operator $f$ on $V$ can be extended to a linear map on the exterior algebra according to the rules $f(\alpha) = \alpha$ for $\alpha$ a scalar and $f(A \wedge B) = f(A) \wedge f(B), f(A + B) = f(A) + f(B)$ for $A$ and $B$ blades of arbitrary grade. Trivia: some authors call this extension an outermorphism. The extended map will be grade-preserving; that is, if $A$ is a homogeneous element of the exterior algebra of grade $m$, then $f(A)$ will also have grade $m$. (This can be verified from the properties of the extended map I just listed.)

All this implies that a linear map on the exterior algebra of $V$, once restricted to the top exterior power, reduces to multiplication by a constant: the determinant of the original linear transformation. Since pseudoscalars represent oriented volume elements, this means that the determinant is precisely the factor by which the map scales oriented volumes.

Zach Conn

There are excellent answers here that are very detailed.

Here I provide a simpler answer, also discussed on Wikipedia. Think of the determinant as the area (in 2D; in 3D it would be the volume, etc.) of the parallelogram made by the vectors:

Parallelogram from vectors

Keep in mind that the area of a parallelogram is the base $\times$ height. Doing some tricks with the dot product, this yields the determinant:

$$ \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc = \text{Area}_{\text{parallelogram}} $$

You can test this on the identity matrix by placing the unit vectors for each dimension as its columns and seeing that: $$ \begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix} = ad - bc = 1 \times 1 - 0 \times 0 = 1 $$

With a $3 \times 3$ identity matrix this becomes a volume, and it is again equal to $1$: the off-diagonal zeros contribute nothing, and the only term contributing to the volume is the product of the $1$s on the diagonal. In other words, the identity leaves the coordinate system unmodified.

Thinking in these terms, I also find it easier to think about singular matrices: not being able to take the inverse of a matrix with a 0 determinant now "feels like" trying to divide by 0, since I can think of the determinant as the "scalar value" of the matrix. This may not help others, but if it helps you, great!

Mike Williamson
  • I was brushing up on my linear algebra via Khan Academy, and came across a [wonderful lesson](https://www.khanacademy.org/math/linear-algebra/matrix-transformations/determinant-depth/v/linear-algebra-determinant-and-area-of-a-parallelogram) going through the boring/gory details of how this area-of-parallelogram calculation turns out. Check it out, for anyone who wants a better understanding! – Mike Williamson Jul 25 '19 at 02:10
  • A picture/visualization is worth a thousand words (like the answers above). – Hezi Jul 31 '19 at 13:26

If you have a matrix $H$, then you can calculate the correlation matrix $G = H H^H$, where $H^H$ denotes the complex conjugated and transposed version of $H$.

If you do an eigenvalue decomposition of $G$, you get eigenvalues $\lambda$ and eigenvectors $v$, which in combination $\lambda\times v$ describe the same space.

Now there is the following identity:

$$\det(H\,H^H) = \prod_i \lambda_i \quad\text{(the product of all eigenvalues).}$$

That is, if you have a $3\times3$ matrix $H$, then $G$ is $3\times3$ too, giving us three eigenvalues. The product of these eigenvalues gives us the volume of a cuboid. With every extra dimension/eigenvalue the cuboid gets an extra dimension.
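
A quick NumPy check of that identity (an illustrative sketch with a random complex $H$):

```python
import numpy as np

rng = np.random.default_rng(2)
H = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
G = H @ H.conj().T                    # Hermitian, positive semidefinite

eigenvalues = np.linalg.eigvalsh(G)   # real, since G is Hermitian
print(np.prod(eigenvalues))           # the volume of the cuboid
print(np.linalg.det(G).real)          # the same number
```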

Marc van Leeuwen
Fabian Schuh

(I considered making this a comment, but I thought it might deserve more attention than a comment would receive. Upvotes and downvotes will tell if I am right or wrong).

Complement about the sign of the determinant

I loved the accepted answer by Jamie, but I was frustrated that it did not give more explanation about the sign of the determinant and the notion of "rotation" or "orientation" of a vector. The answer from Marc van Leeuwen comments more on this, but maybe not enough for everyone -- at least not for me -- to understand what it means for a matrix to change the orientation of the space it transforms. So I googled the issue and ended up on the following explanation, which I find excellent and accessible:

http://mathinsight.org/determinant_linear_transformation#lintrans3D

Martin Van der Linden

While there are already some excellent answers, I think there is one aspect that is not yet adequately covered. Namely, given that a matrix can be considered as the representation of a linear transformation in a given basis, what does the determinant of the matrix tell us about the transformation?

Assume we have a shape in our vector space, any shape, with the only restriction that it has a well-defined volume. Now we may ask, what does a given linear transformation do to the volume of that shape?

Well, the first thing we notice is that if we take a direction, any direction, and stretch the shape along that direction with a positive factor while leaving all orthogonal directions unchanged, the volume will also be multiplied with that factor. Also if we “stretch” the shape with factor $0$ (making it flat), it clearly will have volume $0$ afterwards, so that rule also extends nicely to this boundary case.

Furthermore, if we rotate the shape (or leave it as it is), the volume won't change either. Note that not changing the volume means multiplying the volume by one.

Note that all the above doesn't depend on the shape, but is all a property of the transformation alone. Therefore it makes sense to assign to each such transformation $T$ a function, let's call it $\det T$, which tells us the factor we have to apply to a volume to get the volume of the image.

Of course if we do several such transformations in a row, and each one multiplies the volume with a certain factor, then the factors multiply as well. That is, $$\det(T_1T_2) = (\det T_1)(\det T_2).$$

Now looking closer at the above, we see that we have not yet covered all possible transformations. We covered all those transformations that can be done by combinations of stretching and rotating, but we don't yet know what to do when mirroring. Let's consider the specific case of mirroring in one direction, that is, reversing the sign of one direction and keeping everything else. Let's call that mirroring transformation $M$.

Well, on first view, it seems obvious what to do: the mirroring does not change the volume of any shape, therefore $\det M=1$, right? But then, we note that when we write down $M$, it really is stretching with the factor $-1$. Since we are always multiplying, that factor $-1$ can always be gotten rid of by applying the absolute value at the end. But does the factor actually make sense geometrically?

Well, there are a lot of shapes that are not identical to their mirror image, and it turns out that if you want to continuously transform them into their mirror image through linear transformations, you always have to pass through a shape with volume $0$. So the sign indeed carries geometric information, so it also makes geometric sense to keep it.

Since all linear transformations can be obtained by sequences of one-dimensional stretching, rotations and one-dimensional mirror transformations, we now have completely determined the value of $\det T$ for any transformation. It is also intuitively clear that it is well-defined (if we achieve the same transformation in different ways, it still will affect the volume of shapes in the very same way).

Now that we have defined the effect on the transformation, we can look at what it means for the matrix.

Obviously, a diagonal matrix is the product of stretches/mirrors in the coordinate directions, therefore the determinant of a diagonal matrix is simply the product of its diagonal entries. Exchanging two columns or rows of the matrix means mirroring in the corresponding diagonal direction before or after applying the original transformation, therefore it gives a factor $-1$. If the matrix is non-invertible (the columns are linearly dependent), the image will have zero volume, therefore the determinant is $0$. And the standard basis vectors are mapped to the columns of the matrix, therefore the unit cube spanned by the basis vectors will be mapped to the parallelepiped spanned by the column vectors, whose volume will therefore be given by $|\det A|$.

celtschk
  • This is by far the most intuitive explanation of what a determinant is. The trickier part is the orientation of a transformation, which can make the resulting volume negative. +1 – Masacroso Jul 30 '20 at 12:45

Think about a scalar equation, $$ax = b$$ where we want to solve for $x$. We know we can always solve the equation if $a\neq 0$. However, if $a=0$, then the answer is "it depends": if $b\neq 0$, then we cannot solve it, while if $b=0$ there are many solutions (i.e. any $x \in \mathbb{R}$). The key point is that the ability to solve the equation unambiguously depends on whether $a=0$.

When we consider the similar equation for matrices

$$\mathbf{Ax} = \mathbf{b}$$

the question as to whether we can solve it is not so easily settled by whether $\mathbf{A}=\mathbf{0}$ because $\mathbf{A}$ could consist of all non-zero elements and still not be solvable for $\mathbf{b}\neq\mathbf{0}$. In fact, for two different vectors $\mathbf{y}_1 \neq \mathbf{0}$ and $\mathbf{y}_2\neq \mathbf{0}$ we could very well have that

$$\mathbf{Ay}_1 \neq \mathbf{0}$$ and $$\mathbf{Ay}_2 = \mathbf{0}.$$

If we think of $\mathbf{y}$ as a vector, then there are some directions in which $\mathbf{A}$ behaves like non-zero (this is called the row space) and other directions where $\mathbf{A}$ behaves like zero (this is called the null space). The bottom line is that if $\mathbf{A}$ behaves like zero in some directions, then the answer to the question "is $\mathbf{Ax} = \mathbf{b}$ generally solvable for any $\mathbf{b}$?" is "it depends on $\mathbf{b}$". More specifically, if $\mathbf{b}$ is in the column space of $\mathbf{A}$, then there is a solution.

So is there a way that we can tell whether $\mathbf{A}$ behaves like zero in some directions? Yes, it is the determinant! If $\det(\mathbf{A})\neq 0$ then $\mathbf{Ax} = \mathbf{b}$ always has a solution. However if, $\det(\mathbf{A}) = 0$ then $\mathbf{Ax} = \mathbf{b}$ may or may not have a solution depending on $\mathbf{b}$ and if there is one, then there are an infinite number of solutions.
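
Here is a small NumPy illustration of that trichotomy (a sketch; the matrix and right-hand sides are made up):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])    # dependent rows, so det(A) = 0
print(np.linalg.det(A))       # 0.0: A "behaves like zero" in some direction

b1 = np.array([1.0, 2.0])     # in the column space: infinitely many solutions
b2 = np.array([1.0, 0.0])     # not in the column space: no solution

# np.linalg.solve refuses singular systems outright.
try:
    np.linalg.solve(A, b2)
except np.linalg.LinAlgError as e:
    print("solve failed:", e)

# lstsq returns an exact solution for b1 and only a
# least-squares compromise for b2.
for b in (b1, b2):
    x = np.linalg.lstsq(A, b, rcond=None)[0]
    print(x, A @ x, b)
```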

Tpofofn

One way to define the determinant which makes clear the relation between all the various notions you mentioned is as follows:

Given a vector space $E$ of dimension $n$ over the field $K$ and a basis $B=(b_1,...,b_n)$ of $E$, the determinant is the unique (nonzero) alternating multilinear $n$-form $\phi$ of $E$ which satisfies $\phi(b_1,...,b_n)=1$.

This simply means that the determinant is a function $\phi$ which takes a tuple $(x_1,...,x_n)$ of $n$ vectors of $E$ and returns a scalar from the field $K$, such that

(1) $\phi$ is linear in each of the $n$ variables $(x_1,...,x_n)$ (it's "multilinear")

(2) if two of the $x_i$'s are equal, then $\phi(x_1,...,x_n)=0$ ($\phi$ is "alternating")

(3) It turns out that the functions $\phi$ satisfying the two above properties are all multiples of each other. So we choose a basis $B$ of $E$ and say that the determinant is the function $\phi$ satisfying the above properties which maps $B$ to $1$.

Of course it's not immediately obvious that such a function $\phi$ exists and is unique!

To simplify slightly we will take the vector space $E$ to be $K^n$ and the basis $B$ to be the canonical basis.

It turns out that the determinant satisfies the miraculous property that $\det(x_1,...,x_n) \neq 0$ if and only if $(x_1,...,x_n)$ is a basis.

Now... given $n$ vectors $x_1,...,x_n$ such that the coordinates of $x_i$ in the basis $B$ are $(a_{i,1},...,a_{i,n})$, the determinant of the $n$ vectors $x_1,...,x_n$ can be shown to be equal to

$\sum_{\sigma \in S_n} sgn(\sigma)a_{1,\sigma(1)}...a_{n,\sigma(n)}$

which should be familiar to you as the expression for the determinant in terms of permutations. Here $S_n$ is the symmetric group, i.e the set of permutations of $\{1,2,..,n\}$ and $sgn(\sigma)$ is the signature of the permutation $\sigma$.

To make the link between the determinant of a set of $n$ vectors and the determinant of a matrix, just note that the matrix $A=(a_{i,j})$ is exactly the matrix whose column vectors are $x_1,...,x_n$.

Thus when we take the determinant of a matrix, what we are really doing is evaluating a function of the $n$ column vectors. We said earlier that this function is nonzero if and only if the $n$ vectors form a basis - in other words, if and only if the matrix is of full rank, i.e. iff it's invertible.

So the abstract definition of the determinant as a function which maps a set of vectors to the scalar field (while obeying some nice properties like linearity) is equivalent to a function from matrices to the scalar field which is nonzero exactly when the matrix is invertible. Moreover, this function turns out to be multiplicative! (Consequently, the restriction of this function to the set of invertible matrices gives a group homomorphism from $(GL_n(K), \times)$ to $(K\setminus\{0\},\times)$.)

The expression of the determinant of a matrix in terms of permutations can be used to derive many of the nice properties you are familiar with, for example

  • a matrix and its transpose have the same det

  • det of a triangular matrix is the product of the diagonal elements

  • the Laplace formula, a.k.a. cofactor expansion, which tells you how to calculate the determinant in terms of a weighted sum of determinants of submatrices: for any fixed column $j$,

$\det(A)=\sum_{i=1}^{n}(-1)^{i+j}a_{i,j}\Delta_{i,j}$

where $\Delta_{i,j}$ is the determinant of the matrix obtained from $A$ by removing the row $i$ and the column $j$, known as the minor $(i,j)$.
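
That expansion translates directly into a short recursive routine; a Python sketch (illustrative, and exponentially slow since it visits $n!$ terms):

```python
import numpy as np

def laplace_det(A):
    """Determinant by cofactor expansion along the first column."""
    n = A.shape[0]
    if n == 1:
        return A[0, 0]
    total = 0.0
    for i in range(n):
        # Minor: delete row i and column 0.
        minor = np.delete(np.delete(A, i, axis=0), 0, axis=1)
        total += (-1) ** i * A[i, 0] * laplace_det(minor)
    return total

A = np.array([[2.0, 1.0, 3.0],
              [0.0, 4.0, 1.0],
              [5.0, 2.0, 0.0]])
print(laplace_det(A), np.linalg.det(A))   # both -59.0
```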

Joshua Benabou
  • This is the cleanest approach, however far from intuitive in relation to its use in linear algebra. +1 anyway – Masacroso Jul 30 '20 at 12:42

Imagine a completely general system of equations

$$a_{11} x_1 + a_{12} x_2 + a_{13} x_3 = b_1$$ $$a_{21} x_1 + a_{22} x_2 + a_{23} x_3 = b_2$$ $$a_{31} x_1 + a_{32} x_2 + a_{33} x_3 = b_3$$

If we solve for the variables $x_i$ in terms of the other variables and write the results in lowest terms, we'll see that the expressions for each $x_i$ all have the same function of the $a_{ij}$ in the denominator. (Say that we work over the integers.) This expression is (up to a unit) the determinant of the system.
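
You can watch this happen in the $2\times2$ case with a computer algebra system; a small SymPy sketch (for illustration):

```python
import sympy as sp

a11, a12, a21, a22, b1, b2 = sp.symbols('a11 a12 a21 a22 b1 b2')
x1, x2 = sp.symbols('x1 x2')

sol = sp.solve([a11*x1 + a12*x2 - b1,
                a21*x1 + a22*x2 - b2], [x1, x2])

# Both solutions share the same denominator,
# a11*a22 - a12*a21, which is det([[a11, a12], [a21, a22]]).
print(sp.cancel(sol[x1]))
print(sp.cancel(sol[x2]))
```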

If you pick some systematic way of solving $n \times n$ systems, say Gaussian elimination, you can use it to crank out a formula for this determinant.

I think this is a lot more natural than the other approaches because you start with something straightforward and common like a system of linear equations, then you put your head down and solve it, and out pops this notion.

Of course this only gives you the answer up to a sign, but this actually makes sense, because there's an arbitrary choice of sign going in.

Garibaldi has a paper that presents this approach and some related ones, entitled *The determinant and characteristic polynomial are not ad hoc constructions*. (To formalize this you want to bring in a little ring theory so that you have formal notions of indeterminates and so forth.)

Daniel McLaury

I will try to explain this intuitively. But first you must understand certain concepts. I recommend the 3b1b videos for intuition about linear combinations. Anyway, it's not a difficult concept to understand, and I will briefly introduce it.

First of all, let's start with an example and then try to generalize. So imagine we have the matrix $A=\left[ \begin{matrix} 3 & 1 \\ 1.5 & 2 \end{matrix} \right]$.

Now let's take the column vectors of this matrix, $\left[ \begin{matrix} 3 \\ 1.5 \end{matrix} \right]$ and $\left[ \begin{matrix} 1 \\ 2 \end{matrix} \right]$. The set of all linear combinations of these vectors is what we call the column space, Col(A).

We also have the row space, Row(A), defined identically as the linear combinations of the row vectors $r_{1}=\left[ \begin{matrix} 3 & 1 \end{matrix} \right]$ and $r_{2}=\left[ \begin{matrix} 1.5 & 2 \end{matrix} \right]$. Both can be represented graphically, something like this:

[figure: the row vectors $r_1$, $r_2$ and the parallelogram they span]

So basically, the determinant is the area of the parallelogram defined by the row vectors (the column vectors generate the same area, but by convention let's use the row vectors). In the image it's represented by the blue parallelogram. So, the area of the parallelogram $= \det(A)$.

So, how can we calculate this area? To understand this part you should have basic knowledge of row operations and the area of a parallelogram.

Let's call "$r_{1}$" the first row and "$r_{2}$" the second row. One of the basic row operations consist on add to one row other row scaled. So imagine row operating on $r_{1}$ as $r_{1}'=r_{1}+kr_{2}$, $k$ any real number. Don't desperate if you don't understand why we are row operating, thing are going to be clear right away.

So, let's call $B$ the new matrix generated after replacing $r_{1}$ by $r_{1}+kr_{2}$; the prime in $r_{1}'$ marks the transformed version of $r_{1}$. What happens to Row(A) and to $\det(A)$? See what happens to Row(B) and $\det(B)$ when we change $r_{1}$ to $r_{1}'=r_{1}+kr_{2}$ with different values of $k$:

So, we can see that $r_{1}'$ moves parallel to $r_{2}$, which is obvious because we are adding a scaled version of $r_{2}$ to $r_{1}$.

Assuming you have knowledge of parallelogram areas, you can verify that the base and the height don't change. That means the area stays constant when we add a scaled row to another row: we never change the height, because we move parallel to the base. Thus $\det(A)=\det(B)$.

So here comes the MAGICAL PART: we should find a $k$ that eliminates the y-component of the row vector $r_{1}$ (so that $A_{12}=0$). Applying the row operation with $k=-\frac{1}{2}$, the transformed matrix is:

$$\left[ \begin{matrix} 3 & 1 \\ 1.5 & 2 \end{matrix} \right] \xrightarrow{r_1-\frac{1}{2}r_2} \left[ \begin{matrix} 2.25 & 0 \\ 1.5 & 2 \end{matrix} \right]$$

So our matrix $B$ has triangular form, and Row(B) looks like:

[figure: Row(B) after the row operation: a parallelogram with base $2.25$ and height $2$]

Now we have a parallelogram with base length $2.25$ and height length $2$. Thus by the definition of parallelogram area, we have $\det(A)=\det(B)=2.25 \times 2=4.5$. So the determinant is just the product of the diagonal elements of the triangular matrix form, which we call the echelon form. MAGIC!

We could find a rectangle that has the same area as $\det(A)$ by repeating this process, this time applying the row operation to $r_{2}$ so as to eliminate its x-component ($A_{21}=0$); we would get a rectangle with area $\det(B)$ equal to $\det(A)$. This is completely unnecessary, though, since it doesn't change the base and height of the parallelogram. Anyway, for intuition, the process $r_{2}'=r_{2}+kr_{1}'$ would look like:

So $k=-\frac{2}{3}\approx-0.66$.

$$\left[ \begin{matrix} 2.25 & 0 \\ 1.5 & 2 \end{matrix} \right] \xrightarrow{r_{2}-\frac{2}{3}r_{1}'} \left[ \begin{matrix} 2.25 & 0 \\ 0 & 2 \end{matrix} \right]$$

We have that the base is $2.25$ and the height is $2$, so the area of the rectangle is $\det(B)=4.5=\det(A)$. The determinant is just the product of the diagonal elements of the diagonal matrix.

So we've seen that the product of the diagonal elements of the matrix converted to triangular form gives us the determinant. Why triangular form? Think of $x_{i}$ as the $i$-th dimension: every row vector of the echelon form adds a new component in one further dimension, so in geometric terms it adds a height to that dimension.
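
Here is that procedure as a short Python sketch (an illustration): reduce to triangular form with determinant-preserving row operations, flip the sign for any row swap, and multiply the diagonal.

```python
import numpy as np

def det_by_elimination(A):
    """Gaussian-elimination determinant: row_i += k * row_j preserves
    the area/volume, while a row swap flips its sign."""
    A = A.astype(float).copy()
    n = A.shape[0]
    sign = 1.0
    for col in range(n):
        pivot = np.argmax(np.abs(A[col:, col])) + col
        if np.isclose(A[pivot, col], 0.0):
            return 0.0                       # flattened: zero area/volume
        if pivot != col:
            A[[col, pivot]] = A[[pivot, col]]
            sign = -sign                     # a swap flips orientation
        for row in range(col + 1, n):
            A[row] -= (A[row, col] / A[col, col]) * A[col]
    return sign * np.prod(np.diag(A))

A = np.array([[3.0, 1.0],
              [1.5, 2.0]])
print(det_by_elimination(A), np.linalg.det(A))   # both 4.5
```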

The great thing about this technique is that it can be applied in any number of dimensions and maintains the intuition of what you are doing. I would like to present the 3D graphical version, but it would be a lot of work that I think you could do with a little imagination.

I have tried to make the process intuitive and geometric.

maenju
The determinant of a matrix gives the signed volume of the parallelepiped that is generated by the vectors given by the matrix columns.

You can find a very pedagogical discussion on page 16 of

*A Visual Introduction to Differential Forms and Calculus on Manifolds*, J. P. Fortney

(Google Books link; click on "1 Background Material".)

Consider a parallelepiped whose edges are given by $ v_1 , v_2 , \dots, v_n \in \mathbb{R}^n $. Then if you accept these 3 properties:

  1. $D(I)=1$, where $I=[e_1,e_2,\dots,e_n]$ (identity matrix)
  2. $D(v_1,v_2,\dots,v_n)=0$ if $v_i=v_j$ for any $i\neq j$
  3. $D$ is linear, $$\forall j,\ D(v_1,\dots,v_{j-1},v+cw,v_{j+1},\dots,v_n)=D(v_1,\dots,v_{j-1},v,v_{j+1},\dots,v_n)+cD(v_1,\dots,v_{j-1},w,v_{j+1},\dots,v_n)$$

you can show that $D$ is the parallelepiped signed volume and that $D$ is the determinant.

Picaud Vincent

Let $A _1, \ldots, A _n \in \mathbb{F} ^n$ be linearly independent (and hence a basis). So for any $b \in \mathbb{F} ^n, $ there exist unique $x _1, \ldots, x _n$ with $x _1 A _1 + \ldots + x _n A _n = b.$
But it's not clear what the explicit values of the $x _i$ (in terms of the $A _i$ and $b$) are.

For any linear map $T : \mathbb{F} ^n \to \mathbb{F},$ $x _1 T(A _1) + \ldots + x _n T(A _n) = T(b).$
So if we can specify (explicitly) a linear map $T _1 : \mathbb{F} ^n \to \mathbb{F}$ with $T _1(A _2) = \ldots = T _1 (A _{n}) = 0$ and $T _1 (A _1) \neq 0,$ $x _1$ can be computed as $x _1 = \frac{T _1(b)}{T _1(A _1)}.$
In general if we specify linear maps $T _1, \ldots, T _n : \mathbb{F} ^n \to \mathbb{F}$ with $T _i (A _j) = 0$ for $i \neq j$ and $T _i (A _i) \neq 0,$ the $x _i$s can be computed as $x _i = \frac{T _i(b)}{T _i(A _i)}.$

So if we somehow construct a multilinear map $f : \mathbb{F} ^n \times \ldots \times \mathbb{F} ^n \to \mathbb{F}$ where i) $f(v _1, \ldots, v _n) = 0$ if any two arguments are equal, and ii) $f(v _1, \ldots, v _n) \neq 0$ whenever $v _1, \ldots, v _n$ are linearly independent, we'll be done:

we can take $T _j : \mathbb{F} ^n \to \mathbb{F},$ $$T _j(v) = f(A _1, \ldots, \underbrace{v} _{j ^{th} \text{ pos.}}, \ldots, A _n).$$


It turns out such a construction is possible, and it is unique up to multiplication by a nonzero scalar. Subject to the normalising constraint $f(e _1, \ldots, e _n) = 1,$ we get a unique map, called the determinant.
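
This recipe, with $f = \det$, is exactly Cramer's rule; a short NumPy sketch of it (an illustration, for intuition only):

```python
import numpy as np

def cramer_solve(A, b):
    """Solve A x = b via x_i = det(A_i) / det(A), where A_i is A
    with column i replaced by b. This is the T_i(b) / T_i(A_i)
    recipe above with f = det. Prefer np.linalg.solve in practice."""
    n = len(b)
    dA = np.linalg.det(A)
    x = np.empty(n)
    for i in range(n):
        Ai = A.copy()
        Ai[:, i] = b
        x[i] = np.linalg.det(Ai) / dA
    return x

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
print(cramer_solve(A, b), np.linalg.solve(A, b))   # both [0.8, 1.4]
```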