
Where does the definition of the determinant come from, and is the definition in terms of permutations the first and basic one? What is the deep reason for giving such a definition in terms of permutations?

$$ \det(A)=\sum_{p}\sigma(p)\,a_{1p_1}a_{2p_2}\cdots a_{np_n}. $$

I have found this one useful:

Thomas Muir, *Contributions to the History of Determinants 1900-1920*.

asked by Pekov, edited by Rodrigo de Azevedo

  • It's a special case of a symmetric form. And many of the properties of a determinant are clear from this definition, i.e. it's convenient. – Henno Brandsma Jun 17 '16 at 09:46
  • @HennoBrandsma "Symmetric form"? What is that, please? I know the determinant is a multilinear *alternating* form. – DonAntonio Jun 17 '16 at 09:50
  • "det" should not be in italics. – Andreas Rejbrand Jun 17 '16 at 11:02
  • @AndreasRejbrand: You can suggest an edit to fix that if you feel strongly about it. – hmakholm left over Monica Jun 17 '16 at 11:25
  • Another definition is that the determinant is the volume of the parallelepiped generated by $[v_1,\ldots,v_n]$. – user330587 Jun 17 '16 at 11:54
  • @user330587 Yes, I know, but Seki Kowa and Gottfried Leibniz were the first mathematicians to give a definition of the determinant, and at that time they were not familiar with other definitions, like the volume of the parallelepiped or those arising from Galois' group theory and the symmetric group of permutations. I wonder what their reasoning was for giving a permutation-based definition! – Pekov Jun 17 '16 at 12:07
  • You can easily discover determinants by solving a generic 2 by 2 or 3 by 3 linear system of equations by hand. You will find yourself dividing by the determinant. The formula you gave just pops out. – littleO Jun 17 '16 at 12:57
  • If you have the time, I highly recommend viewing [this lecture](http://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/video-lectures/lecture-18-properties-of-determinants/) as well as [the following one](http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/least-squares-determinants-and-eigenvalues/determinant-formulas-and-cofactors/) from MIT. I found them very illustrative. – user170231 Jun 17 '16 at 16:49

5 Answers


This is only one of many possible definitions of the determinant.

A more "immediately meaningful" definition could be, for example, to define the determinant as the unique function on $\mathbb R^{n\times n}$ such that

  • The identity matrix has determinant $1$.
  • Every singular matrix has determinant $0$.
  • The determinant is linear in each column of the matrix separately.

(Or the same thing with rows instead of columns).

While this seems to connect to high-level properties of the determinant in a cleaner way, it is only half a definition because it requires you to prove that a function with these properties exists in the first place and is unique.

It is technically cleaner to choose the permutation-based definition because it is obvious that it defines something, and then afterwards prove that the thing it defines has all of the high-level properties we're really after.

The permutation-based definition is also very easy to generalize to settings where the matrix entries are not real numbers (e.g. matrices over a general commutative ring). In contrast, the characterization above does not generalize easily without a close study of whether our existence and uniqueness proofs still work with the new scalar ring.
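
To make the comparison concrete, here is a minimal Python sketch (my own illustration, not part of the original answer; the helper names `perm_sign` and `det` are mine) that takes the permutation-based formula as the definition and then checks the three high-level properties above on small examples:

```python
from itertools import permutations

def perm_sign(p):
    """Sign of a permutation (given as a tuple of indices):
    +1 for even, -1 for odd, computed by counting inversions."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Leibniz formula: det(A) = sum over permutations p of
    sign(p) * A[0][p[0]] * ... * A[n-1][p[n-1]]."""
    n = len(A)
    total = 0
    for p in permutations(range(n)):
        term = perm_sign(p)
        for i in range(n):
            term *= A[i][p[i]]
        total += term
    return total

# Property 1: the identity matrix has determinant 1.
assert det([[1, 0, 0], [0, 1, 0], [0, 0, 1]]) == 1
# Property 2: a singular matrix (here, proportional rows) has determinant 0.
assert det([[1, 2], [2, 4]]) == 0
# Property 3: linearity in the first column (second column held fixed).
A = [[1, 2], [3, 4]]
B = [[5, 2], [6, 4]]
C = [[1 + 5, 2], [3 + 6, 4]]  # first columns added entrywise
assert det(C) == det(A) + det(B)
```

Of course, summing over all $n!$ permutations is exponentially slow; the point is only that the permutation formula manifestly defines *something*, whose high-level properties can then be verified.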

hmakholm left over Monica
  • How are you defining "singular matrix" here? The induced linear map is not a bijection? – Mario Carneiro Jun 18 '16 at 00:06
  • @MarioCarneiro Using Gaussian elimination you can find out whether the matrix is row equivalent to the identity matrix, and if that's the case then the matrix is invertible. The induced function is not a linear map, it's $n$-linear. It would be monumental for information theorists if the function were a bijection (encoding one matrix with one number). – aboat Jun 18 '16 at 04:30
  • This answer still does not answer the OP's questions (including the ones in the comment section) on the origins of the determinant. It tells you why to use one definition in favor of the others. The OP might have desired to read an answer detailing the ideas involved when the determinant was brought into the literature. – aboat Jun 18 '16 at 04:36
  • @aboat Actually I said "singular matrix" not "noninvertible", but that also suggests a definition: if there exists an inverse matrix $AB=1$ then $A$ is nonsingular. By "induced linear map" I mean the function $x\mapsto Ax$ which is a linear function $\Bbb R^n\to\Bbb R^n$. I'm not sure what you mean by n-linear. You must have misunderstood me if you think I am identifying each matrix with a number, although there is nothing fundamentally surprising about that, since $\Bbb R^{n\times n}$ and $\Bbb R$ have the same cardinality. (Perhaps you thought I was talking about the determinant?) – Mario Carneiro Jun 18 '16 at 05:43
  • @MarioCarneiro: I use "singular" to describe exactly the same matrices as "non-invertible" does. Of course if you have defined "singular" as "determinant 0" and only later proved that those matrices are the ones that are not invertible, then things get cyclic here -- so please don't imagine that is the case. – hmakholm left over Monica Jun 18 '16 at 09:32

The amazing fact is that it seems matrices were developed to study determinants. I'm not sure, but I think the "formula" definition of the determinant you have there is known as the Leibniz formula. I am going to quote some lines from the following source (Tucker, 1993):

Matrices and linear algebra did not grow out of the study of coefficients of systems of linear equations, as one might guess. Arrays of coefficients led mathematicians to develop determinants, not matrices. Leibniz, co-inventor of calculus, used determinants in 1693 about one hundred and fifty years before the study of matrices in their own right. Cramer presented his determinant-based formula for solving systems of linear equations in 1750. The first implicit use of matrices occurred in Lagrange's work on bilinear forms in the late 18th century.

[...]

In 1848, J. J. Sylvester introduced the term "matrix," the Latin word for womb, as a name for an array of numbers. He used womb, because he viewed a matrix as a generator of determinants. That is, every subset of k rows and k columns in a matrix generated a determinant (associated with the submatrix formed by those rows and columns).

You would probably have to dig through historical texts and articles to find out exactly why Leibniz devised the definition; most probably he had some hunch/intuition that it could lead to breakthroughs in understanding the underlying connection between the coefficients and the solution of a system of equations...

Christiaan Hattingh

Hint:

Determinants appear in the solution of linear systems of equations, among other places. If you permute the equations, the solution cannot change, so the expression of a determinant must be insensitive to row permutations up to a sign that cancels in the solution formulas; this is why it is a combination of terms involving one factor $a_{ip_i}$ from each row $i$.

This explains the pattern $$\sum_p \sigma_p\prod_i a_{ip_i},$$ where the sum and product are over commutative operations and the expression is multilinear in the rows. Also, the form must be antisymmetric so that two equal rows yield a zero determinant (signalling failure of the solution), and this explains why $\sigma_p=\pm1$ encodes the parity of the permutation.
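
To see the pattern emerge concretely (a standard worked case, added here for illustration): eliminating variables by hand from $$a_{11}x + a_{12}y = b_1,\qquad a_{21}x + a_{22}y = b_2$$ gives $$x=\frac{b_1a_{22}-b_2a_{12}}{a_{11}a_{22}-a_{12}a_{21}},\qquad y=\frac{a_{11}b_2-a_{21}b_1}{a_{11}a_{22}-a_{12}a_{21}}.$$ The common denominator is exactly the $n=2$ instance of the pattern, $a_{11}a_{22}-a_{12}a_{21}$. Swapping the two equations swaps the rows and flips the sign of both numerator and denominator (so the solution is unchanged), and two equal rows make the denominator vanish.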

  • I do like this perspective, but I'm not sure the question is the sort that admits a "hint" (maybe I'm just being naive). It is a good motivation though! – pjs36 Jun 17 '16 at 17:49
  • @pjs36: IMO, the technical answers based on heavy formulas are the ones that answer the "why" questions the least. –  Dec 01 '19 at 10:14

Here is a natural path to the idea of determinant (though this is not how they were originally developed).

An alternating $k$-linear function on a vector space $V$ over a field $\Bbb F$ is a map $f\,:\, V^k \to \Bbb F$ which is

  • Linear in each argument: $$f(v_1, \ldots, v_{i-1}, av_i + bw_i, v_{i+1}, \ldots, v_k) = af(v_1, \ldots, v_i, \ldots, v_k) + bf(v_1, \ldots, w_i, \ldots, v_k)$$ for all $i$.
  • Changes sign under exchange of any two arguments: $$f(v_1, \ldots, v_i, \ldots, v_j, \ldots, v_k) = -f(v_1, \ldots, v_j, \ldots, v_i, \ldots, v_k)$$ for all $i \ne j$.

It is easy to see that if $f,g$ are two alternating $k$-linear functions on $V$, then so is $af + bg$ for any $a,b \in \Bbb F$, so the alternating $k$-linear functions on $V$ form another vector space $A^k(V)$. Some development shows that if $V$ has dimension $n$, then $A^k(V)$ has dimension $n \choose k$. In particular $A^n(V)$ has dimension $1$.

Now if $M\,:\,V \to V$ is linear and if $f\in A^k(V)$, then the map $$M_kf\,:\, V^k \to \Bbb F\,:\,(v_1, \ldots, v_k) \mapsto f(Mv_1, \ldots, Mv_k)$$ is also alternating $k$-linear. And clearly $M_k(af+bg) = aM_kf + bM_kg$, so $M_k$ defines a linear map from $A^k(V)$ to itself (i.e., an endomorphism of $A^k(V)$).

Since $A^n(V)$ is one dimensional, any endomorphism is just multiplication by some element of the field $\Bbb F$. Thus we define the determinant of $M$ to be the unique element $\det(M) \in \Bbb F$ such that $$M_nf = \det(M)f\text{ for all }f \in A^n(V)$$

All the properties of determinants, including the permutation formula, can be developed from this. Certain properties that are difficult to prove from the Leibniz formula are almost trivial from this definition; in particular $\det(MN) = \det(M)\det(N)$, since $(MN)_nf = N_n(M_nf)$ means the two scalars simply multiply.
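
As a quick numeric illustration of the defining property (my own Python sketch, not part of the answer; `alt_form` implements the Leibniz-style alternating $n$-linear form on $n$ vectors, i.e. a spanning element of the one-dimensional $A^n(V)$):

```python
from itertools import permutations

def alt_form(vectors):
    """Alternating n-linear form on n vectors in R^n: the Leibniz sum
    over permutations, taking one entry from each vector per term."""
    n = len(vectors)
    total = 0
    for p in permutations(range(n)):
        inv = sum(p[i] > p[j] for i in range(n) for j in range(i + 1, n))
        term = (-1) ** inv
        for i in range(n):
            term *= vectors[i][p[i]]
        total += term
    return total

def apply_map(M, v):
    """Apply the linear map M (given as a list of rows) to the vector v."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

vs = [[1, 0, 2], [0, 1, 1], [3, 1, 0]]   # three vectors in R^3
M  = [[2, 1, 0], [0, 1, 4], [1, 0, 1]]

det_M = alt_form(M)  # feeding the rows of M to the form gives det(M)
# M_n f = det(M) * f, checked on this particular tuple of vectors:
assert alt_form([apply_map(M, v) for v in vs]) == det_M * alt_form(vs)

# det(MN) = det(M) det(N) follows because the scalars compose:
N  = [[1, 1, 0], [0, 2, 1], [1, 0, 3]]
MN = [[sum(M[i][k] * N[k][j] for k in range(3)) for j in range(3)]
      for i in range(3)]
assert alt_form(MN) == alt_form(M) * alt_form(N)
```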

There is a close connection between the space of alternating $k$-linear functions and the $k$-th exterior (wedge) power of the space, so I could very similarly have developed the determinant from the wedge product, but alternating $k$-linear functions are conceptually easier.

Paul Sinclair

I think Paul's answer gets the algebraic nub of the issue. There is a geometric side, which gives some motivation for his answer, because it isn't clear offhand why multilinear alternating functions should be important.

On the real line, the function of two variables $(x, y)$ given by $x - y$ gives you a notion of length. It really gives you a bit more than length, because it is a *signed* notion of length: it cares about the direction of the line from $x$ to $y$, and gives you a positive or negative value based on that direction. If you swap $x$ and $y$, you get the negative of your previous answer.

In $\mathbb R^n$ it is useful to have a similar function that gives the signed volume of the parallelepiped spanned by $n$ vectors. If you swap two vectors, that reverses the orientation of the parallelepiped, so you should get the negative of the previous answer. From a geometric perspective, that is how alternating functions come into play.

The determinant of a matrix with columns $v_1, \ldots, v_n$ calculates the signed volume of the parallelepiped spanned by $v_1, \ldots, v_n$. Such a function is necessarily alternating. It is also necessarily linear in each variable separately, which can likewise be seen geometrically.
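
A tiny numeric illustration of the signed-area picture (my own Python sketch; `det2` is just the $2\times2$ determinant formula read as a signed area):

```python
def det2(u, v):
    """Signed area of the parallelogram spanned by 2D vectors u and v."""
    return u[0] * v[1] - u[1] * v[0]

u, v, w = (3, 0), (1, 2), (2, 5)
assert det2(u, v) == 6     # positive: v lies counterclockwise from u
assert det2(v, u) == -6    # swapping the vectors reverses the orientation
# Linearity in the first slot, seen geometrically as shearing/stacking areas:
assert det2((u[0] + w[0], u[1] + w[1]), v) == det2(u, v) + det2(w, v)
```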

John Robertson
  • This answer matches my realization that the alternating cross products/minors/determinants are all measuring (in a loose sense) linear independence (or areas, also in a loose sense). You can build up successive equations for independence/dependence starting from the definition of linear independence, i.e. if x is linearly independent of y, then y is linearly independent of x. This starts off the symmetry/anti-symmetry chain. AFAIK the more elaborate usages are extensions/developments of those starting ideas. – rrogers Jun 21 '16 at 17:56