48

We often learn in a standard linear algebra course that a determinant is a number associated with a square matrix. We can also define the determinant as the sum of all products formed by picking one element from each row and each column of the matrix, with each product multiplied by (-1) or (1) according to the number of inversions.

But how is this notion of a 'determinant' derived? What is a determinant, actually? I looked up the history of the determinant and it looks like it predates matrices. How did the modern definition of a determinant come about? Why do we need to multiply some terms of the determinant sum by (-1) based on the number of inversions? I just can't understand the motivation that created determinants. We can define determinants and see their properties, but I want to understand how and why they were defined, to get a better idea of their importance and applications.

Rodrigo de Azevedo
user8210
  • 2
    the determinant defines volume in n-dimensions. Munkres' Analysis on Manifolds text has a nice discussion. I'm not well-versed in the history you seek so I leave it to others! – James S. Cook Sep 12 '12 at 07:14
  • 2
    Related question http://math.stackexchange.com/questions/668/whats-an-intuitive-way-to-think-about-the-determinant/. See the answers given there. – Marc van Leeuwen Sep 12 '12 at 07:20
  • Possible duplicate of http://math.stackexchange.com/questions/81521/development-of-the-idea-of-the-determinant. – lhf Oct 21 '13 at 12:27
  • With so many beautiful ways of looking at the determinant, I'm afraid students won't realize that you can't help stumbling over the determinant when solving a 2 by 2 or 3 by 3 linear system of equations by hand. Just solve $a_{11} x_1 + a_{12} x_2 = b_1, a_{21} x_1 + a_{22} x_2 = b_2$ by hand and you find yourself needing to divide by the determinant. So you see that if the determinant is nonzero then you will have a unique solution. That is a very easy way to discover the determinant. – littleO Aug 07 '20 at 20:59

3 Answers

27

I normally have two ways of viewing determinants without appealing to higher-level math like multilinear forms.

The first is geometric, and I do think that most vector calculus classes nowadays should teach this interpretation. Given vectors $v_1, \ldots, v_n \in \mathbb{R}^n$ forming the sides of an $n$-dimensional parallelepiped, the volume of this parallelepiped is given by $|\det(A)|$, where $A = [v_1 \ldots v_n]$ is the matrix whose columns are those vectors. We can then view the determinant of a square matrix as measuring the volume-scaling property of the matrix as a linear map on $\mathbb{R}^n$. From here, it is clear why $\det(A) = 0$ is equivalent to $A$ not being invertible: if $A$ takes a set with positive volume and sends it to a set with zero volume, then $A$ has some direction along which it "flattens" points, which is precisely the null space of $A$. Unfortunately, I'm under the impression that this interpretation is at least semi-modern, but I think this is one of the cases where the modern viewpoint might be better to teach new students than the old viewpoint.
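As a small numerical illustration of the geometric reading in the plane, the sketch below compares the determinant with an area computed the school way, as base times height. The helper names `det2` and `area_base_times_height` are made up for this example, and the sample vectors are arbitrary.

```python
import math


def det2(u, v):
    """Determinant of the 2x2 matrix whose columns are u and v."""
    return u[0] * v[1] - u[1] * v[0]


def area_base_times_height(u, v):
    """Area of the parallelogram spanned by u and v, without using det.

    Takes u as the base and measures the height as the length of the part
    of v perpendicular to u. Assumes u is nonzero.
    """
    base = math.hypot(u[0], u[1])
    # Projection coefficient of v onto u.
    t = (u[0] * v[0] + u[1] * v[1]) / (base * base)
    height = math.hypot(v[0] - t * u[0], v[1] - t * u[1])
    return base * height


u, v = (3.0, 1.0), (1.0, 2.0)
print(det2(u, v))                    # signed area: 3*2 - 1*1 = 5.0
print(area_base_times_height(u, v))  # approximately 5.0
print(det2(v, u))                    # -5.0: swapping the sides flips the sign
print(det2(u, (6.0, 2.0)))           # 0.0: the second side is parallel to the first
```

The absolute value of the determinant matches the area, the sign records the order of the sides, and a flattened parallelogram gives zero.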

The old viewpoint is that the determinant is simply the result of trying to solve the linear system $Ax = b$ when $A$ is square. This is most likely how the determinant was first discovered. To derive the determinant this way, write down the generic matrix and then proceed by Gaussian elimination. This means you have to choose nonzero leading entries in each row (the pivots) and use them to eliminate subsequent entries below. Each time you eliminate the rows, you have to multiply by a common denominator, so after you do this $n$ times, you'll end up with the sum of all the permutations of entries from different rows and columns merely by virtue of having multiplied out to get common denominators. The $(-1)^k$ sign flip comes from the fact that at each stage in Gaussian elimination, you're subtracting. So on the first step you're subtracting, but on the second step you're subtracting a subtraction, and so forth. At the very end, by Gaussian elimination, you'll obtain an echelon form (upper triangular), and one knows that if any of the diagonal entries are zero, then the system is not uniquely solvable; the last diagonal entry will precisely be the determinant times the product of the values of previously used pivots (up to a sign, perhaps). Since the pivots chosen are always nonzero, then it will not affect whether or not the last entry is zero, and so you can divide them out.

EDIT: It isn't as simple as I thought, though it will work out if you keep track of what nonzero values you multiply your rows by in Gaussian elimination. My apologies if I misled anyone.
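To make the bookkeeping concrete, here is a minimal sketch of elimination by cross-multiplication (no division, no row swaps) on an integer matrix. The function names and the sample matrix are chosen only for illustration. For a $3 \times 3$ matrix the last diagonal entry comes out as $a_{11} \det(A)$, that is, the determinant times the first pivot.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a1, a2, a3), (b1, b2, b3), (c1, c2, c3) = m
    return (a1 * (b2 * c3 - b3 * c2)
            - a2 * (b1 * c3 - b3 * c1)
            + a3 * (b1 * c2 - b2 * c1))


def eliminate_without_division(m):
    """Reduce a square matrix to upper triangular form by cross-multiplying.

    Each row below the pivot row is replaced by
    pivot * row - (entry under the pivot) * pivot_row.
    No row swaps are performed, so every pivot is assumed to be nonzero.
    """
    rows = [list(r) for r in m]
    n = len(rows)
    for k in range(n - 1):
        pivot = rows[k][k]
        if pivot == 0:
            raise ValueError("zero pivot; this sketch does not swap rows")
        for i in range(k + 1, n):
            factor = rows[i][k]
            rows[i] = [pivot * x - factor * y for x, y in zip(rows[i], rows[k])]
    return rows


A = [[2, 1, 3],
     [1, 4, 2],
     [5, 2, 1]]
U = eliminate_without_division(A)
print(U)                  # [[2, 1, 3], [0, 7, 1], [0, 0, -90]]
print(A[0][0] * det3(A))  # -90, the same as the last diagonal entry
```

For larger matrices more pivot factors accumulate in the last entry, so they have to be tracked and divided out, as the edit note above says.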

Christopher A. Wong
  • 3
    also, we should emphasize the sign of $\det[v_1|v_2|\dots|v_n]$ reveals the handedness or orientation of the set $\{ v_1,v_2,\dots, v_n \}$ – James S. Cook Sep 12 '12 at 07:36
  • 4
    Did you ever try actually performing Gaussian elimination on a _generic_ matrix (with all entries independent unknowns)? Try it for a $3\times3$ matrix! It doesn't really work as you advertised, and you'll have a hard time actually making (just) the determinant appear in the computations. You can find something like this done to prove [Cramer's rule](http://en.wikipedia.org/wiki/Cramer%27s_rule#Proof), but it is _not_ usual Gaussian elimination, and it assumes the determinant is already known. – Marc van Leeuwen Sep 12 '12 at 07:36
  • 1
    @MarcvanLeeuwen, you're right, I forgot that actually the final diagonal entry will be the determinant multiplied by the value of the first pivot. But since WLOG the first pivot must be a nonzero value, then it can be divided. I don't think it's as hard to manipulate into the determinant form as one might think. – Christopher A. Wong Sep 12 '12 at 07:48
  • 1
    BTW, I just tried it for the $3 \times 3$ case, and it ends up being $a_{11} \det(A)$ for the last diagonal entry, as expected. Note that if $a_{11} = 0$, then we just swap rows until WLOG $a_{11} \neq 0$. The nice thing about swapping rows in Gaussian elimination not affecting the determinant is that it shows why, on some level, the determinant must be permutation-invariant. – Christopher A. Wong Sep 12 '12 at 08:09
  • @ChristopherA.Wong In fact it is unclear to me what you mean by "first pivot". With all entries unknown, there isn't a single (non-constant) expression that is assured to be nonzero. So you need to multiply rows by factors that are not known to be nonzero, and these factors will remain in (the determinant of) your matrix. I can see how you get a factor $a_{1,1}$, but not how you avoid introducing even nastier factors in the sequel. I'm stuck with $\begin{pmatrix}a_1&a_2&a_3\\0&a_1b_2-b_1a_2&a_1b_3-b_1a_3\\0&a_1c_2-c_1a_2&a_1c_3-c_1a_3\end{pmatrix}$. – Marc van Leeuwen Sep 12 '12 at 08:25
  • @MarcvanLeeuwen You're actually right that it is not as simple as I thought it was before, and the final entry will be the determinant multiplied by many of the nonzero factors used as pivots. However, for the $3 \times 3$ case it works a little more easily; using your matrix, if you do Gaussian elimination again, you get $$\begin{bmatrix} a_1 & a_2 & a_3 \\ 0 & a_1 b_2 - b_1 a_2 & a_1 b_3 - b_1 a_3 \\ 0 & 0 & (a_1c_3 - c_1 a_3)(a_1 b_2 - b_1 a_2) - (a_1 b_3 - b_1a_3)(a_1 c_2 - c_1 a_2) \end{bmatrix}.$$ – Christopher A. Wong Sep 12 '12 at 08:45
  • 1
    Expanding the last term, we get $a_1^2 b_2 c_3 - a_1 a_2 b_1 c_3 - a_1 a_3 b_2 c_1 - a_1^2 b_3 c_2 + a_1 a_2 b_3 c_1 + a_1 a_3 b_1 c_2$, which is precisely equal to $a_1 \det(A)$. – Christopher A. Wong Sep 12 '12 at 08:49
  • You're right; I think I just didn't have the courage to do that... ;-) – Marc van Leeuwen Sep 12 '12 at 08:56
  • all these years I've been using a determinant, never knowing where it came from. And it turns out it's just the result of a Gaussian elimination restricted to not being 0 on diagonal... Thank you! – Maverick Meerkat Jul 04 '19 at 16:52
17

The determinant was originally 'discovered' by Cramer when solving systems of linear equations necessary to determine the coefficients of a polynomial curve passing through a given set of points. Cramer's rule, for giving the general solution of a system of linear equations, was a direct result of this.

This appears in Gabriel Cramer, "Introduction a l'analyse des lignes courbes algebriques" (Introduction to the analysis of algebraic curves), Geneve, Ches les Freres Cramer & Cl. Philibert, (1750). It is cited as a footnote on p. 60, which reads (from French):

"I think I have found [for solving these equations] a very simple and general rule, when the number of equations and unknowns do not pass the first degree [i.e. are linear]. One finds this in the Appendix No. 1." Appendix No. 1 appears on p. 657 of the same text. The text is available online, for those who can read French.

The history of the determinant appears in Thomas Muir, "The Theory of Determinants in the Historical Order of Development," Dover, NY, (1923). This is also available online.

Jeff Simmons
11

I do not know the actual history of the determinant, but I think it is very well motivated. From the way I look at it, it is actually the properties of the determinant that make sense. Then you derive the formula from them.

Let me start by trying to define the "signed volume" of a hyper-parallelepiped whose sides are $(u_1, u_2, \ldots, u_n)$. I'll call this function $\det$. (I have no idea why it is named "determinant". Wiki says Cauchy was the one who started using the term in the present sense.) Here are some observations regarding $\det$ that I consider quite natural:

  1. The unit hypercube whose sides are $(e_1, e_2, \ldots, e_n)$, where $e_i$ are standard basis vectors of $\mathbb R^n$, should have volume of $1$.
  2. If one of the sides is zero, the volume should be $0$.
  3. If you vary one side and keep all other sides fixed, how would the signed volume change? You may think about a 3D case where you have a flat parallelogram defined by vectors $u_1$ and $u_2$ as the base of a solid shape, then try to extend the "height" direction by the third vector $u_3$. What happens to the volume as you scale $u_3$? Also, consider what happens if you have two height vectors $u_3$ and $\hat u_3$. $\det(u_1, u_2, u_3 + \hat u_3)$ should be equal to $\det(u_1, u_2, u_3) + \det(u_1, u_2, \hat u_3)$. (This is where you need your volume function to be signed.)
  4. If I add a multiple of one side, say $u_i$, to another side $u_j$ and replace $u_j$ by $\hat u_j = u_j + c u_i$, the signed volume should not change because the addition to $u_j$ is in the direction of $u_i$. (Think about how a rectangle can be sheared into a parallelogram with equal area.)

With these four observations, you get familiar properties of $\det$:

  1. $\det(e_1, \ldots, e_n) = 1$.
  2. $\det(u_1, \ldots, u_n) = 0$ if $u_i = 0$ for some $i$.
  3. $\det(u_1, \ldots, u_i + c\hat u_i, \ldots, u_n) = \det(u_1, \ldots, u_i, \ldots, u_n) + c\det(u_1, \ldots, \hat u_i, \ldots, u_n)$.
  4. $\det(u_1, \ldots, u_i, \ldots, u_j, \ldots, u_n) = \det(u_1, \ldots, u_i, \ldots, u_j + cu_i, \ldots, u_n)$. (It may happen that $j < i$.)

You can then derive the formula for $\det$. You can use these properties to deduce further easier-to-use (in my opinion) properties:

  • Swapping two columns changes the sign of $\det$.
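One possible way to see the swap rule from the four properties, written for two columns $u, v$ with all other columns held fixed and omitted: apply property 4 three times, then pull out the factor $-1$. Scaling follows from properties 2 and 3, since $\det(c\,\hat u) = \det(0 + c\,\hat u) = \det(0) + c\det(\hat u) = c\det(\hat u)$.

$$ \begin{align*} \det(u, v) & = \det(u, v + u) \\ & = \det(u - (v + u), v + u) = \det(-v, v + u) \\ & = \det(-v, (v + u) - v) = \det(-v, u) \\ & = -\det(v, u). \end{align*} $$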

This should tell you why oddness and evenness of permutations matter. To actually (inefficiently) compute the determinant $\det(u_1, u_2, \ldots, u_n)$, write $u_i$ as $u_i = \sum_{j=1}^n u_{ij}e_j$, and expand by multilinearity. For example, in the 2D case,

$$ \begin{align*} \det(u, v) & = \det(u_1e_1 + u_2e_2, v_1e_1 + v_2e_2) \\ & = u_1v_1\underbrace{\det(e_1, e_1)}_0 + u_1v_2\underbrace{\det(e_1, e_2)}_1 + u_2v_1\underbrace{\det(e_2, e_1)}_{-1} + u_2v_2\underbrace{\det(e_2, e_2)}_0 \\ & = u_1v_2 - u_2v_1. \end{align*} $$

(If you are not familiar with multilinearity, just think of it as a product. Ignore the word $\det$ from the second line and you get a simple expansion of products. Then you evaluate "unusual product" between vectors $e_i$ by the definition of $\det$. Note, however, that the order is important, as $\det(u, v) = - \det(v, u)$.)
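The same expansion in $n$ dimensions leaves one term per permutation: any term that repeats a basis vector vanishes, and the remaining $\det(e_{\sigma(1)}, \ldots, e_{\sigma(n)})$ equals $+1$ or $-1$ depending on whether the number of inversions of $\sigma$ is even or odd. A minimal (and deliberately inefficient) sketch of that sum follows; the function names are chosen for illustration.

```python
from itertools import permutations


def inversions(perm):
    """Count pairs of positions i < j with perm[i] > perm[j]."""
    n = len(perm)
    return sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])


def leibniz_det(m):
    """Determinant as a signed sum over all permutations.

    Each term picks one entry from every row and every column; its sign
    is +1 for an even number of inversions and -1 for an odd number.
    """
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        term = -1 if inversions(perm) % 2 else 1
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


print(leibniz_det([[1, 2], [3, 4]]))                   # 1*4 - 2*3 = -2
print(leibniz_det([[2, 1], [4, 3]]))                   # 2: swapping the columns flips the sign
print(leibniz_det([[2, 1, 3], [1, 4, 2], [5, 2, 1]]))  # -45
```

The sum has $n!$ terms, so this is only practical for small matrices.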

Tunococ
  • 1
    The name originates from Gauss in Disquisitiones arithmeticae (1801) while discussing quadratic forms. Full article: http://www-groups.dcs.st-and.ac.uk/history/HistTopics/Matrices_and_determinants.html – Imme22009 Jul 06 '16 at 21:04