I have looked extensively for a proof on the internet but all of them were too obscure. I would appreciate if someone could lay out a simple proof for this important result. Thank you.

This has a detailed proof: https://www.adelaide.edu.au/mathslearning/play/seminars/evaluemagictrickshandout.pdf – Anurag May 15 '19 at 14:57

Is it true for a non-algebraically-closed field? – user595419 Mar 07 '20 at 03:49

This is what I found online; hopefully it will help: http://www.math.harvard.edu/~knill/teaching/math19b_2011/handouts/lecture28.pdf – Ramuel May 01 '19 at 19:07
8 Answers
These answers require way too much machinery. By definition, the characteristic polynomial of an $n\times n$ matrix $A$ is given by $$p(t) = \det(A-tI) = (-1)^n \big(t^n - (\text{tr} A) \,t^{n-1} + \dots + (-1)^n \det A\big)\,.$$ On the other hand, $p(t) = (-1)^n(t-\lambda_1)\dots (t-\lambda_n)$, where the $\lambda_j$ are the eigenvalues of $A$. So, comparing coefficients, we have $\text{tr}A = \lambda_1 + \dots + \lambda_n$.
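Not part of the proof, but a quick numerical sanity check of both claims (a sketch assuming numpy; the random matrix is made up): the trace matches the sum of the eigenvalues, and the $t^{n-1}$ coefficient of the monic characteristic polynomial is $-\text{tr}A$.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4))  # an arbitrary example matrix

trace_A = np.trace(A)
# Eigenvalues may be complex, but they come in conjugate pairs for a real
# matrix, so their sum is (numerically) real.
eig_sum = np.sum(np.linalg.eigvals(A))

# np.poly(A) returns the coefficients of the monic polynomial det(tI - A),
# highest power first; coeffs[1] is the coefficient of t^(n-1), i.e. -tr(A).
coeffs = np.poly(A)
```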

The definition of the characteristic polynomial I learned is just $\det{(A-tI)}$. Why is the coefficient of the $t^{n-1}$ term $\text{tr }A$? – bfhaha Apr 08 '18 at 22:08

@bfhaha The coefficient of $t^{n-1}$ must come from $(a_{11}-t)\cdots (a_{nn}-t)$, because if we expand $A-tI$ along a row $i$, we see that any term involving an off-diagonal element $[A-t I]_{ij}$ eliminates $a_{ii}-t$ and $a_{jj}-t$, and hence any such term does not involve $t^{n-1}$. Therefore the coefficient of $t^{n-1}$ must come from the product of the diagonal elements, and so the coefficient is $(-1)^{n-1}(a_{11}+\cdots+a_{nn})$. – Elias Aug 14 '18 at 07:34

Hello! My understanding of the characteristic polynomial comes from the determinant itself. So how did one figure out that it is the same as the second form, i.e. $p(t) = (t-\lambda_1)(t-\lambda_2)\cdots$? – DuttaA Dec 30 '20 at 17:00

@DuttaA Because the eigenvalues $\lambda_i$ are by definition the roots of this polynomial. – Ted Shifrin Dec 30 '20 at 17:18

@Elias what does it mean to "expand along a row $i$"? I'm still not clear why the coefficient of $t^{n-1}$ is equal to the trace of $A$. – Max Sep 19 '21 at 08:32

@Max They mean to rewrite $\det(A-tI)$ in terms of the cofactors along a row of $A-tI$. This is called Laplace expansion of the determinant. – Michael L. May 03 '22 at 19:04
Let $A$ be a matrix. It has a Jordan Canonical Form, i.e. there is a matrix $P$ such that $PAP^{-1}$ is in Jordan form. Among other things, Jordan form is upper triangular, hence it has its eigenvalues on its diagonal. It is therefore clear for a matrix in Jordan form that its trace equals the sum of its eigenvalues. All that remains is to prove that if $B,C$ are similar then they have the same eigenvalues.

I would try to elaborate a bit. For every matrix $A$ there exists a non-singular matrix $P$ such that $PAP^{-1} = J$ where $J$ has [Jordan canonical form](https://en.wikipedia.org/wiki/Jordan_normal_form). Now using $tr(ABC) = tr(CAB) = tr(BCA)$ (which is true whenever the products are defined), we obtain $tr(A) = tr(P^{-1}JP) = tr(PP^{-1}J) = tr(J) = \sum_i \lambda_i$ where $\lambda_i$ are the eigenvalues of $A$. – them Aug 14 '16 at 10:08

Very elegant :) Also, such a tool can be used to show that det(A) for any matrix A is the product of its eigenvalues. – Konstantin Burlachenko Jan 24 '18 at 13:32

@bruziuz can you please tell me how I can show that the determinant of a matrix in Jordan form is the product of its diagonal entries? – Abhay Oct 21 '19 at 17:00

@Abhay, you are posting a comment to an answer from 6 years ago. Fortunately, I am still around to answer it. The answer is: Jordan form is, among other things, upper triangular. All upper triangular matrices have their determinant as the product of the diagonal entries. This can be proved by recursively [Laplace expanding](https://en.wikipedia.org/wiki/Determinant#Laplace's_formula_and_the_adjugate_matrix) on the first column. – vadim123 Oct 21 '19 at 17:08

@vadim123 thank you, your answer to above post really helped me. Thanks again for clarifying the determinant bit. – Abhay Oct 21 '19 at 17:13

@Abhay $\det(P^{-1}JP)=\det(P^{-1})\det(J)\det(P)=\det(J)$. $J$ is block diagonal and its blocks are bidiagonal, but at least $J$ is upper triangular. In particular, for any triangular matrix the determinant is the product of all diagonal entries. – Konstantin Burlachenko Jan 10 '20 at 08:50
I'll try to show it another way. We know that if we have a polynomial $x^n+b_{n-1} x^{n-1} + \dots +b_1 x+ b_0$, then $-b_{n-1}$ is the sum of the roots of this polynomial (the so-called Vieta's formulas). In our case, the polynomial is $\det(tI-A)$ and we have $-b_{n-1}=\lambda_1+\lambda_2+\dots+\lambda_n$.
$\def\S{\mathcal{S}_n}$ Let $\S$ denote all the permutations of the set $\{1,2,\dots,n\}$. Then by definition $$ \det M = \sum_{\pi\in\S} m_{1,\pi(1)} m_{2,\pi(2)} \dots m_{n,\pi(n)} \operatorname{sgn}\pi, $$ where $\operatorname{sgn}\pi$ is either $+1$ or $-1$ and it is $+1$ for the identity permutation (we don't need to know more now).
Consider $M=tI-A$. To get the power $t^{n-1}$ from a permutation, we need this permutation to choose at least $n-1$ diagonal elements, i.e., to have $\pi(i)=i$ for at least $n-1$ values of $i$. However, once you know the value of a permutation on $n-1$ inputs, you know the last one as well. This means that to get the coefficient of $t^{n-1}$, we need to consider only the identity permutation.
So far we got that $b_{n-1}$ is the coefficient of $t^{n-1}$ in $(t-a_{1,1})(t-a_{2,2})\dots(t-a_{n,n})$ (this is the term of the sum above corresponding to the identity permutation). Therefore $-b_{n-1} = a_{1,1}+a_{2,2}+\dots+a_{n,n}=\operatorname{Tr}A$.
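The coefficient argument above can be spot-checked symbolically (a sketch using sympy; the $3\times 3$ size and the symbol names are my choice): for a generic symbolic matrix, the coefficient of $t^{n-1}$ in $\det(tI-A)$ comes out as $-(a_{00}+a_{11}+a_{22})$.

```python
import sympy as sp

t = sp.symbols('t')
n = 3
# A generic n x n matrix of independent symbols a00, a01, ..., a22.
A = sp.Matrix(n, n, lambda i, j: sp.Symbol(f'a{i}{j}'))

# Characteristic polynomial det(tI - A), expanded so we can read coefficients.
p = (t * sp.eye(n) - A).det().expand()
coeff_n1 = p.coeff(t, n - 1)  # coefficient of t^(n-1); should be -trace(A)
```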

@Ioannis Sorry, but there's no more elementary proof than the one I and Ted provided. And I don't think I'm able to divide it into more elementary steps than I did here. Therefore I think that you can't be helped. – yo' Oct 31 '13 at 00:05

Btw, this is not more advanced than Ted's proof. This is exactly the same, just each of the steps is written out. – yo' Oct 31 '13 at 00:11

There is a very simple proof for diagonalizable matrices that utilises the properties of the determinant and the trace. I am more interested in understanding your proofs though, and that's what I have been striving to do. – JohnK Oct 31 '13 at 00:14
Trace is preserved under similarity and every matrix is similar to a Jordan block matrix. Since the Jordan block matrix has its eigenvalues on the diagonal, its trace is the sum (with multiplicity) of its eigenvalues.
Here is another proof. First of all, by definition, the characteristic polynomial of the $n\times n$ matrix $A=[a_{ij}]$ is given by $P_A(x)=\det(xI_n-A)$. Let $P_A(x)=x^n-b_1x^{n-1}+b_2x^{n-2}-\dots$. By Vieta's formulas the sum of the eigenvalues is $b_1$. We have to prove that $b_1=\hbox{trace}(A)$. Substituting $x$ with $\frac{1}{x}$ for every real nonzero $x$ we get $$\det\left(\frac{1}{x}\left(I_n-xA\right)\right)=\frac{1-b_1x+b_2x^{2}-\dots}{x^n},$$ or equivalently $$\det(I_n-xA)=1-b_1x+b_2x^2-\dots$$ for any nonzero real $x$. But then the left-side and right-side polynomials of the above equation coincide for all real $x$. A short explanation: if for two polynomials $f$ and $g$ we have $f(x)=g(x)$ for any nonzero $x$, then the polynomial $h(x):=f(x)-g(x)$ has infinitely many zeroes, thus is the identically zero polynomial; it follows that $f(x)=g(x)$ for all $x$.

Let's denote now $$f(x):=\det(I_n-xA)$$ and $$g(x):=1-b_1x+b_2x^2-\dots.$$ We have seen that $f$ and $g$ are equal functions (polynomials). We then have $f'(x)=g'(x)$ for all $x$. Obviously, $g'(x)=-b_1+2b_2x-\dots$, therefore $g'(0)=-b_1$.
On the other side, from $$f(x)=\left|\begin{array}{cccc} 1-a_{11}x&-a_{12}x&\dots&-a_{1n}x\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}x&-a_{n2}x&\dots&1-a_{nn}x\end{array}\right|$$ by the rule of differentiating determinants we get $$f'(x)=\left|\begin{array}{cccc} -a_{11}&-a_{12}&\dots&-a_{1n}\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}x&-a_{n2}x&\dots&1-a_{nn}x\end{array}\right|+\dots+\left|\begin{array}{cccc} 1-a_{11}x&-a_{12}x&\dots&-a_{1n}x\\ -a_{21}x&1-a_{22}x&\dots&-a_{2n}x\\ \dots&\dots&\dots&\dots\\ -a_{n1}&-a_{n2}&\dots&-a_{nn}\end{array}\right|.$$ It follows that $$f'(0)=\left|\begin{array}{cccc} -a_{11}&-a_{12}&\dots&-a_{1n}\\ 0&1&\dots&0\\ \dots&\dots&\dots&\dots\\ 0&0&\dots&1\end{array}\right|+\dots+\left|\begin{array}{cccc} 1&0&\dots&0\\ 0&1&\dots&0\\ \dots&\dots&\dots&\dots\\ -a_{n1}&-a_{n2}&\dots&-a_{nn}\end{array}\right|=$$ $$=-(a_{11}+\dots+a_{nn})=-\hbox{trace}(A).$$ Since we have $f'(0)=g'(0)$ we get $b_1=\hbox{trace}(A)$.
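The derivative step can be verified symbolically (a sketch using sympy; the concrete $3\times 3$ matrix is just an example I chose): $f(x)=\det(I_n-xA)$ satisfies $f'(0)=-\hbox{trace}(A)$.

```python
import sympy as sp

x = sp.symbols('x')
# An arbitrary example matrix with trace 2 + (-1) + 5 = 6.
A = sp.Matrix([[2, 1, 0], [3, -1, 4], [0, 2, 5]])

f = (sp.eye(3) - x * A).det()      # f(x) = det(I - xA)
fprime_at_0 = sp.diff(f, x).subs(x, 0)  # should equal -trace(A) = -6
```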
Let $A \in M_{n}(\mathbb{C})$. Then $\text{tr}(A) = \text{tr}(PTP^{-1})$ where $T$ is an upper triangular matrix and $P$ is invertible$^{1}$. Thus $\text{tr}(A) = \text{tr}(PTP^{-1}) = \text{tr}(P^{-1}PT) = \text{tr}(T)$. The result follows since the diagonal entries of $T$ are the eigenvalues of $A$.
$^{1}$ The existence of matrices $T$ and $P$ follows from the fact that $\mathbb{C}$ is algebraically closed!
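Numerically, such a triangularization is available via the (complex) Schur decomposition (a sketch assuming scipy; the random matrix is made up): $A = ZTZ^*$ with $T$ upper triangular and unitary $Z$, so $\text{tr}(A)=\text{tr}(T)$ is the sum of the diagonal of $T$, i.e. of the eigenvalues.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))

# output='complex' forces a genuinely triangular T even when A has
# complex eigenvalues (the real Schur form only gives quasi-triangular T).
T, Z = schur(A, output='complex')
```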
By the Schur decomposition, any matrix $A$ is unitarily similar to an upper triangular matrix $T$. Being similar, $A$ and $T$ have the same trace and the same eigenvalues. Moreover, the diagonal entries of $T$ are equal to its eigenvalues (since $T$ is triangular). The stated result follows by calculating the trace of $T$. See https://www.statlect.com/matrix-algebra/properties-of-eigenvalues-and-eigenvectors.
Let $\mathbf{A}$ be a $k \times k$ symmetric matrix and $\mathbf{x}$ be a $k \times 1$ vector. Then
(a) $\mathbf{x'Ax}$ = tr($\mathbf{x'Ax}$) = tr($\mathbf{Axx'}$)
(b) tr($\mathbf{A}$) = $\Sigma_{i=1}^k \lambda_i$, where the $\lambda_i$ are the eigenvalues of $\mathbf{A}$.
For Part a, we note that $\mathbf{x'Ax}$ is a scalar, so $\mathbf{x'Ax}$ = tr($\mathbf{x'Ax}$). We know that tr($\mathbf{BC}$) = tr($\mathbf{CB}$) for any two matrices $\mathbf{B}$ and $\mathbf{C}$ of dimensions $m \times k$ and $k \times m$, respectively. This follows because $\mathbf{BC}$ has $\Sigma_{j=1}^k b_{ij}c_{ji}$ as its ith diagonal element, so tr($\mathbf{BC}$) = $\Sigma_{i=1}^m ( \Sigma_{j=1}^k b_{ij}c_{ji})$. Similarly, the jth diagonal element of $\mathbf{CB}$ is $\Sigma_{i=1}^m c_{ji}b_{ij}$, so tr($\mathbf{CB}$) = $\Sigma_{j=1}^k ( \Sigma_{i=1}^m c_{ji}b_{ij})$ = $\Sigma_{i=1}^m ( \Sigma_{j=1}^k b_{ij}c_{ji})$ = tr($\mathbf{BC}$).
Let $\mathbf{x'}$ be the matrix $\mathbf{B}$ with m = 1, and let $\mathbf{Ax}$ play the role of the matrix $\mathbf{C}$. Then tr($\mathbf{x'(Ax)}$) = tr($\mathbf{(Ax)x'}$), and the result follows.
Part b is proved by using the spectral decomposition to write $\mathbf{A=P' \Lambda P}$, where $\mathbf{PP'=I}$ and $\mathbf{\Lambda}$ is a diagonal matrix with entries $\lambda_1$,$\lambda_2$,...,$\lambda_k$. Therefore, tr($\mathbf{A}$) = tr($\mathbf{P' \Lambda P}$) = tr($\mathbf{\Lambda P P'}$) = tr($\mathbf{\Lambda}$) = $\lambda_1 + \lambda_2 + \dots + \lambda_k$.
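Both parts can be checked numerically for the symmetric case (a sketch assuming numpy; the random data is made up): the scalar $\mathbf{x'Ax}$ equals tr($\mathbf{Axx'}$), and the trace equals the sum of the (real) eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(3)
M = rng.standard_normal((4, 4))
A = (M + M.T) / 2                  # a symmetric k x k matrix
x = rng.standard_normal((4, 1))    # a k x 1 vector

quad = (x.T @ A @ x).item()        # the scalar x'Ax
cyclic = np.trace(A @ x @ x.T)     # tr(Axx'), via the cyclic property

# eigvalsh is the symmetric/Hermitian eigensolver: eigenvalues are real.
eig_sum = np.sum(np.linalg.eigvalsh(A))
```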