230

Can someone point me to a paper, or show here, why symmetric matrices have orthogonal eigenvectors? In particular, I'd like to see a proof that for a symmetric matrix $A$ there exists a decomposition $A = Q\Lambda Q^{-1} = Q\Lambda Q^{T}$ where $\Lambda$ is diagonal.

Rodrigo de Azevedo
  • 18,977
  • 5
  • 36
  • 95
Phonon
  • 3,608
  • 5
  • 23
  • 27
  • Are you missing $A$ somewhere in that equation? – Arturo Magidin Nov 15 '11 at 21:12
  • 6
    If $A$ is symmetric, we have $AA^* = A^2 = A^*A$ so $A$ is normal. The assertion then follows directly from the spectral theorem. So just go read any proof of the spectral theorem, there are many copies available online. –  Nov 15 '11 at 21:19
  • 63
    The statement is imprecise: eigenvectors corresponding to *distinct eigenvalues* of a symmetric matrix must be orthogonal to each other. Eigenvectors corresponding to the same eigenvalue need not be orthogonal to each other. However, since every subspace has an orthonormal basis, you can find orthonormal bases for each eigenspace, so you can find an orthonormal basis of eigenvectors. – Arturo Magidin Nov 15 '11 at 21:19
  • @ArturoMagidin That was a part of my confusion: I knew it had to be true for distinct eigenvalues, but I couldn't show it to be true otherwise. But how do we find $Q\Lambda Q^T = A$? – Phonon Nov 15 '11 at 21:27
  • 3
    @Phonon: It's false otherwise, but you can *find* a basis for the eigenspace made up of orthogonal eigenvectors: just take *any* basis for the eigenspace, and apply Gram-Schmidt. Once you have a basis of eigenvectors for all of $\mathbb{R}^n$, $Q$ is the matrix whose columns are the elements of the basis. – Arturo Magidin Nov 15 '11 at 21:32
  • 2
    @Phonon: Might I add: if you already knew it was true for distinct eigenvalues, why not say so in your question? It would have saved me the trouble of writing it out, and then it would have been clear what your doubt was: you could have gotten a response that didn't re-tread stuff you already knew. – Arturo Magidin Nov 15 '11 at 21:40
  • @ArturoMagidin, Sorry, I'm probably missing something silly and obvious, but if we have matrix $Q$ and apply Gram-Schmidt to it to get $\tilde{Q}$, how do we know that $Q\Lambda Q^{-1} = \tilde{Q} \tilde{\Lambda} \tilde{Q}^{-1}$ where $\tilde{\Lambda}$ is still diagonal? – Phonon Nov 15 '11 at 21:43
  • @ArturoMagidin Yes, you're right. I think I can try and find out the rest on my own. Thanks. – Phonon Nov 15 '11 at 21:54
  • 2
    @Phonon: Two different ways: first, you can *not* compute $Q$ until *after* you have an orthonormal basis of eigenvectors. Second way: Work the *other* way: You have $\Lambda = Q^{-1}AQ$. Now orthonormalize the *columns* of $Q$ by multiplying on the right by elementary matrices, and adjust the inverse by multiplying by the inverse of the elementary matrices. So at each step you get $E^{-1}\Lambda E = E^{-1}Q^{-1}AQE$. But $E^{-1}\Lambda E$ is diagonal, so you get $\Lambda' = Q'^{-1}AQ'$. Lather, rinse, repeat until $Q$ has orthonormal columns. – Arturo Magidin Nov 15 '11 at 21:55
  • http://math.stackexchange.com/questions/393149/geometric-multiplicity-algebraic-multiplicity-for-a-symmetric-matrix/393165#393165 – DVD Sep 24 '15 at 04:16
  • What do you call A^*? Math conjugate? – user4951 Jun 06 '16 at 14:43

6 Answers

298

For any real matrix $A$ and any vectors $\mathbf{x}$ and $\mathbf{y}$, we have $$\langle A\mathbf{x},\mathbf{y}\rangle = \langle\mathbf{x},A^T\mathbf{y}\rangle.$$ Now assume that $A$ is symmetric, and $\mathbf{x}$ and $\mathbf{y}$ are eigenvectors of $A$ corresponding to distinct eigenvalues $\lambda$ and $\mu$. Then $$\lambda\langle\mathbf{x},\mathbf{y}\rangle = \langle\lambda\mathbf{x},\mathbf{y}\rangle = \langle A\mathbf{x},\mathbf{y}\rangle = \langle\mathbf{x},A^T\mathbf{y}\rangle = \langle\mathbf{x},A\mathbf{y}\rangle = \langle\mathbf{x},\mu\mathbf{y}\rangle = \mu\langle\mathbf{x},\mathbf{y}\rangle.$$ Therefore, $(\lambda-\mu)\langle\mathbf{x},\mathbf{y}\rangle = 0$. Since $\lambda-\mu\neq 0$, then $\langle\mathbf{x},\mathbf{y}\rangle = 0$, i.e., $\mathbf{x}\perp\mathbf{y}$.

Now find an orthonormal basis for each eigenspace; since the eigenspaces are mutually orthogonal, these vectors together give an orthonormal subset of $\mathbb{R}^n$. Finally, since symmetric matrices are diagonalizable, this set will be a basis (just count dimensions). The result you want now follows.
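
For concreteness, here is a minimal numpy sketch of that last step (the $3\times 3$ matrix is an arbitrary made-up example): take a symmetric matrix whose eigenvalue $2$ is repeated, orthonormalize a non-orthogonal basis of that eigenspace with a QR factorization (Gram-Schmidt in matrix form), and check that the assembled $Q$ still satisfies $A = Q\Lambda Q^T$.

```python
import numpy as np

# An arbitrary symmetric matrix whose eigenvalue 2 has multiplicity 2:
# rotate diag(2, 2, 5) by some orthogonal matrix U.
U, _ = np.linalg.qr(np.arange(1.0, 10.0).reshape(3, 3) + np.eye(3))
A = U @ np.diag([2.0, 2.0, 5.0]) @ U.T

# A non-orthogonal basis of the two-dimensional eigenspace for eigenvalue 2,
# plus a unit eigenvector for eigenvalue 5.
b1 = U[:, 0]
b2 = U[:, 0] + 0.5 * U[:, 1]              # still satisfies A b2 = 2 b2, but b1 . b2 != 0
b3 = U[:, 2]

# Orthonormalize within the repeated eigenspace (QR = Gram-Schmidt in matrix form).
Q2, _ = np.linalg.qr(np.column_stack([b1, b2]))
Q = np.column_stack([Q2, b3])             # b3 is orthogonal to that eigenspace automatically
Lam = np.diag([2.0, 2.0, 5.0])

print(np.allclose(Q.T @ Q, np.eye(3)))    # True: Q is orthogonal
print(np.allclose(Q @ Lam @ Q.T, A))      # True: A = Q Lambda Q^T
```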

Arturo Magidin
  • 356,881
  • 50
  • 750
  • 1,081
  • 14
    This answer, though intuitively satisfying, assumes that $A$ has the maximum number of eigenvectors, i. e. no generalized eigenvectors. For a more general proof see my answer. – level1807 Oct 06 '14 at 07:47
  • 12
    @level1807 It doesn't "assume" anything: **any** square symmetric matrix is diagonalizable $\; \iff\;$ there is a basis of eigenvectors of the matrix for the space we're working on, and *thus* there is no need of worrying about generalized eigenvectors. –  Jul 12 '15 at 13:38
  • 1
    @AntoineNemesioParras OP asked for a proof of the diagonality of the Jordan form of $A$, too. – level1807 Jul 12 '15 at 16:37
  • 24
    Notation question: $\langle\mathbf{a}, \mathbf{b}\rangle = \mathbf{a} \cdot \mathbf{b}$? – Gordon Gustafson Nov 22 '15 at 18:53
  • 3
    @GordonGustafson Yes. – F.Webber Jan 16 '16 at 13:33
  • 8
    What if two eigenvalues are the same? –  Dec 07 '16 at 08:07
  • @Herry If you consider two arbitrary (but different) eigenvectors for the same eigenvalue, of course they will not generally be orthogonal. – anonymous Dec 12 '16 at 19:16
  • 2
    How can people give up upvotes so easily, without even noticing that this proof uses two statements as self-evident!? First he assumed symmetric matrices are diagonalizable, secondly and most importantly he supposed we have at least two distinct eigenvalues, which is only a case! – Ashkan Ranjbar Apr 07 '17 at 02:04
  • 7
    @AshkanRanjbar: You need to read more carefully; yes, the answer assumes symmetric matrices are diagonalizable, as this is a commonly proven result in basic linear algebra courses. But your second objection is incorrect. Read carefully: what it shows is that eigenvectors corresponding to distinct eigenvalues are automatically orthogonal. The final paragraph says how to proceed: find an orthonormal basis for each eigenspace. When you put them together, the result about distinct eigenvalues shows the result is still orthonormal. If there is only one eigenspace, its orthonormal basis suffices. – Arturo Magidin Apr 07 '17 at 03:58
  • Nah, I'm not concerned with 'distinct', but eigenvectors of a 'particular' eigenvalue. Then again, by your standards, even symmetric matrices having "the sum of geometric multiplicities of eigenvalues meets the same number as the rank of the matrix(i.e. the size of the matrix)" is considered to be 'basic'! – Ashkan Ranjbar Apr 07 '17 at 22:45
  • 3
    @AshkanRanjbar I honestly don't understand your original second objection, or what it is you are arguing about. First, it's not "my standards". What I said was that the diagonalizability of symmetric matrices is often covered in "basic linear algebra courses". Not that the result itself is "basic" (I wouldn't call the Fundamental Theorem of Calculus "basic", but it *is* covered in a basic calculus course). Those courses do not usually cover inner products, though so your second comment and its snark about "my standards" is a non sequitur. (cont) – Arturo Magidin Apr 08 '17 at 03:34
  • 5
    Second: your original claim was that there was an error for assuming "we have at least two distinct eigenvalues, which is only a case!" And this claim is incorrect. There is no such assumption, nor is it needed. The final paragraph applies to the case of a single eigenvalue. For finding an orthonormal basis for the eigenspaces, all you need is Gram Schmidt and a basis. Your assertion that I did not prove that the dimension of the eigenspace equals the multiplicity of the eigenvalue is correct, I did not, nor did I think I had to (5 years ago). Your other complaints, though, are incorrect. – Arturo Magidin Apr 08 '17 at 03:38
  • Fine, call it "non-sequitur preference", but I still don't accept your proof, and your comments justifying your lack of details. – Ashkan Ranjbar Apr 09 '17 at 18:44
  • 27
    @AshkanRanjbar Nobody called anything "non-sequitur preference". You *really* need to work on your reading comprehension. Nobody asked you to accept anything, especially five years after the fact. Nor did I expect you to acknowledge your error of claiming the argument assumed there were more than one eigenvalue. So, frankly, my dear, I couldn't care less what you accept or don't accept. I don't need *your* approval. Once you get to 125 reputation, you are welcome to come back and downvote it and get 2 points off my 229,000+ reputation. It'll be the second downvote on this one. – Arturo Magidin Apr 09 '17 at 19:06
  • 3
    $\langle a,b\rangle =b^*a$ (the inner product). It equals $b^T a$ if $b$ is real and $a^T b$ if both are real. The order of the dot product is important, and although here $\langle \lambda a,b\rangle=b^*(\lambda a) = \lambda b^*a=\lambda \langle a,b\rangle$, i.e., the product is linear with respect to the first element (conjugate-linear w.r.t. to the second: $(\lambda b)^*a=\bar\lambda b^*a$), many if not most authors have it linear w.r.t. the second element only. That allows, in functional analysis, to have $\langle F,b\rangle=F(b)$ linear w.r.t. $b$ ($b\in X$ (a Banach space), $F\in X^*$). – user3810316 Feb 26 '20 at 10:31
  • 2
    Thus, $\langle Ax,y\rangle = y^*Ax = y^* A^T x=y^* A^* x=(Ay)^*x=\langle x,Ay\rangle$, because $A=A^T$ (when $A$ is symmetric), and $A$ being real implies that $A^T=A^*$. Generally $A^*:=A^H:=\overline{A^T}$, and $(AB)^*=B^*A^*$. – user3810316 Feb 26 '20 at 10:37
  • 1
    @user3810316: Sorry, but what is your point? As to your claim on linearity on which coordinate, it’s not “most authors” that have it linear in the second; it’s more of a physics/math divide. All of *my* books, written by mathematicians, have it linear in the first coordinate, not the second. But mostly, I am utterly confused about why you add a comment here. If you want to post an answer, post an answer. – Arturo Magidin Feb 26 '20 at 14:34
  • 2
    Sorry, it was for Gordon Gustafson (see above) and anybody else who wonders the notation. I guess that $\langle a,b\rangle$ is more common in the complex case (where $a\cdot b$ is unclear?). You might be right of the phys/math divide; I don't know. It might relate to the $\langle \Lambda,x\rangle$ vs. $\langle x,\Lambda\rangle$ divide, although one often defines that linear w.r.t. both. That makes the isomorphism $H\to H^*$ conjugate-linear, if you identify $y$ with $(y,\cdot)$ (or with $(\cdot,y)$ if your second argument is the linear one). In Google, $\langle \Lambda,x\rangle$ seems the norm – user3810316 Feb 27 '20 at 14:48
44

Since being symmetric is a property of an operator, not just of its associated matrix, let me use $\mathcal{A}$ for the linear operator whose associated matrix in the standard basis is $A$. Arturo and Will proved that a real symmetric operator $\mathcal{A}$ has real eigenvalues (thus real eigenvectors) and that eigenvectors corresponding to different eigenvalues are orthogonal. One question still stands: how do we know that there are no generalized eigenvectors of rank more than 1, that is, that all Jordan blocks are one-dimensional? Indeed, by referencing the theorem that any symmetric matrix is diagonalizable, Arturo effectively threw the baby out with the bathwater: showing that a matrix is diagonalizable is tautologically equivalent to showing that it has a full set of eigenvectors. Assuming this as a given dismisses half of the question: we were asked to show that $\Lambda$ is diagonal, and not just a generic Jordan form. Here I will untangle this bit of circular logic.

We proceed by induction on the number of eigenvectors: it turns out that finding an eigenvector of a symmetric matrix (and at least one exists for any matrix) always allows us to generate another eigenvector. So we will run out of dimensions before we run out of eigenvectors, making the matrix diagonalizable.

Suppose $\lambda_1$ is an eigenvalue of $A$ and there exists at least one eigenvector $\boldsymbol{v}_1$ such that $A\boldsymbol{v}_1=\lambda_1 \boldsymbol{v}_1$. Choose an orthonormal basis $\boldsymbol{e}_i$ so that $\boldsymbol{e}_1=\boldsymbol{v}_1$. The change of basis is represented by an orthogonal matrix $V$. In this new basis the matrix associated with $\mathcal{A}$ is $$A_1=V^TAV.$$ It is easy to check that $\left(A_1\right)_{11}=\lambda_1$ and all the rest of the numbers $\left(A_1\right)_{1i}$ and $\left(A_1\right)_{i1}$ are zero. In other words, $A_1$ looks like this: $$\left( \begin{array}{c|ccc} \lambda_1 & \\ \hline & & \\ & & B_1 & \\ & & \end{array} \right)$$ Thus the operator $\mathcal{A}$ breaks down into a direct sum of two operators: $\lambda_1$ in the subspace $\mathcal{L}\left(\boldsymbol{v}_1\right)$ ($\mathcal{L}$ stands for linear span) and a symmetric operator $\mathcal{A}_1=\mathcal{A}\mid_{\mathcal{L}\left(\boldsymbol{v}_1\right)^{\bot}}$ whose associated $(n-1)\times (n-1)$ matrix is $B_1=\left(A_1\right)_{i > 1,j > 1}$. $B_1$ is symmetric thus it has an eigenvector $\boldsymbol{v}_2$ which has to be orthogonal to $\boldsymbol{v}_1$ and the same procedure applies: change the basis again so that $\boldsymbol{e}_1=\boldsymbol{v}_1$ and $\boldsymbol{e}_2=\boldsymbol{v}_2$ and consider $\mathcal{A}_2=\mathcal{A}\mid_{\mathcal{L}\left(\boldsymbol{v}_1,\boldsymbol{v}_2\right)^{\bot}}$, etc. After $n$ steps we will get a diagonal matrix $A_n$.

There is a slightly more elegant proof that does not involve the associated matrices: let $\boldsymbol{v}_1$ be an eigenvector of $\mathcal{A}$ and $\boldsymbol{v}$ be any vector such that $\boldsymbol{v}_1\bot \boldsymbol{v}$. Then $$\left(\mathcal{A}\boldsymbol{v},\boldsymbol{v}_1\right)=\left(\boldsymbol{v},\mathcal{A}\boldsymbol{v}_1\right)=\lambda_1\left(\boldsymbol{v},\boldsymbol{v}_1\right)=0.$$ This means that the restriction $\mathcal{A}_1=\mathcal{A}\mid_{\mathcal{L}\left(\boldsymbol{v}_1\right)^{\bot}}$ is an operator on the $(n-1)$-dimensional space $\mathcal{L}\left(\boldsymbol{v}_1\right)^{\bot}$ which maps that space into itself. $\mathcal{A}_1$ is symmetric for obvious reasons and thus has an eigenvector $\boldsymbol{v}_2$, which will be orthogonal to $\boldsymbol{v}_1$.
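
A minimal numpy sketch of one step of this induction (np.linalg.eigh is used only as a stand-in for "some eigenpair $(\lambda_1, \boldsymbol{v}_1)$ exists"; the rest follows the construction above): rotate $\boldsymbol{v}_1$ into the first basis vector and look at the block structure of $V^TAV$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))
A = (A + A.T) / 2                      # a generic real symmetric matrix

# One eigenpair (lambda_1, v_1); eigh only stands in for "an eigenvector exists".
lam, W = np.linalg.eigh(A)
lam1, v1 = lam[0], W[:, 0]

# Complete v_1 to an orthonormal basis of R^n: QR of [v_1 | random columns].
M = np.column_stack([v1, rng.standard_normal((n, n - 1))])
V, R = np.linalg.qr(M)
V[:, 0] *= np.sign(R[0, 0])            # make the first column equal to v_1 (up to roundoff)

A1 = V.T @ A @ V                       # the matrix of the same operator in the new basis
print(np.round(A1, 10))                # first row and column are (lambda_1, 0, ..., 0)

B1 = A1[1:, 1:]                        # the symmetric (n-1) x (n-1) block B_1; recurse on it
print(np.allclose(B1, B1.T))           # True
```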

level1807
  • 1,119
  • 9
  • 18
30

It would appear that you want to write vectors as rows, so your preferred multiplication will be on the left side, as in $v \mapsto v A.$

The ordinary dot product is then $ v \cdot w = v w^T = w v^T = w \cdot v.$ Note that $v w^T$ is a number, or a 1 by 1 matrix, and is equal to its transpose.

In the same way, $v A \cdot w = v A w^T.$ However, $v A w^T$ is again a 1 by 1 matrix and is equal to its transpose, and $A^T = A,$ so we get $$ v A \cdot w = v A w^T = (v A w^T)^T = (w^T)^T A^T v^T = w A v^T = w A \cdot v$$

First suppose $v,w$ are eigenvectors with distinct eigenvalues $\lambda, \mu.$ We have $$ v A \cdot w = \lambda v \cdot w = w A \cdot v = \mu w \cdot v.$$ Or, $\lambda v \cdot w = \mu v \cdot w,$ finally $$ (\lambda - \mu) v \cdot w = 0.$$ So, eigenvectors with distinct eigenvalues are orthogonal.

It is possible that an eigenvalue may have larger multiplicity. However, for a fixed eigenvalue $\lambda,$ the set of vectors $v$ for which $ v A = \lambda v$ is a subspace, of full dimension (meaning the Jordan form has no off-diagonal elements), and we may simply choose an orthonormal basis for this subspace. Choosing, in this way, all basis vectors to be length 1 and orthogonal, we get an orthonormal basis of eigenvectors of $A.$ Writing those as the rows of a matrix $P,$ we get $P A P^T = \Lambda.$

The only difficult aspect here is this: if an eigenvalue has algebraic multiplicity larger than one, that is, the characteristic polynomial has a factor of $(x-\lambda)^k$ for some $k \geq 2,$ how can I be sure that the geometric multiplicity is also $k?$ That is, with $A$ symmetric, how do I know that $$ v (A - \lambda I)^k = 0 \; \; \Rightarrow \; \; v (A - \lambda I) = 0?$$ Working on it. It appears that this is, at heart, induction on $k,$ and takes many pages. Give me some time.

Alright, this works. An induction on dimension shows that every matrix is orthogonally similar to an upper triangular matrix, with the eigenvalues on the diagonal (the precise statement is unitarily similar). How do we know the eigenvalues are real? We have an eigenvalue $\lambda$ with an eigenvector $v,$ perhaps both with complex entries. As is traditional, for a vector or matrix define $v^\ast = \bar{v}^T$ and $A^\ast = \bar{A}^T.$ It is easy to see that $v v^\ast$ is a positive real number unless $v = 0.$ In any case $A^\ast = A.$ So, given $v A = \lambda v,$ $$ ( v A v^\ast)^\ast = (v^\ast)^\ast A^\ast v^\ast = v A v^\ast.$$ As a result, the complex number $v A v^\ast$ is actually a real number. At the same time, $v A v^\ast = \lambda v v^\ast,$ and since both $v A v^\ast$ and $v v^\ast$ are real numbers, the latter nonzero, it follows that $\lambda$ is real.

Putting these together, we get that each real matrix with real characteristic values is orthogonally similar to an upper triangular real matrix. However, as $A$ is symmetric, this upper triangular matrix is actually diagonal.
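
The last two paragraphs can be checked numerically. A small sketch, assuming SciPy is available: scipy.linalg.schur returns an orthogonal $Z$ and (quasi-)upper-triangular $T$ with $A = ZTZ^T$, and for a symmetric $A$ the triangular factor comes out diagonal.

```python
import numpy as np
from scipy.linalg import schur

rng = np.random.default_rng(1)
A = rng.standard_normal((4, 4))
A = A + A.T                              # real symmetric, so all eigenvalues are real

T, Z = schur(A, output='real')           # A = Z T Z^T with Z orthogonal, T upper triangular
print(np.round(T, 10))                   # upper triangular and (numerically) symmetric: diagonal
print(np.allclose(Z @ T @ Z.T, A))       # True
print(np.allclose(Z.T @ Z, np.eye(4)))   # True: Z is orthogonal
```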

Will Jagy
  • 130,445
  • 7
  • 132
  • 248
13

Let's assume that $x$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda_1$ and $y$ an eigenvector of $A$ corresponding to the eigenvalue $\lambda_2$, with $\lambda_1 \neq \lambda_2$.

$Ax=\lambda_1x \\ Ay=\lambda_2y$

Multiply the first equation on the left by $y^{\intercal}$ and the second by $x^{\intercal}$, taking into account the fact that $A$ is symmetric ($A=A^{\intercal}$):

$y^{\intercal}Ax=\lambda_1y^{\intercal}x \\ x^{\intercal}A^{\intercal}y=\lambda_2x^{\intercal}y$

Now subtract the second equation from the first one and use the commutativity of the scalar product:

$y^{\intercal}Ax-x^{\intercal}A^{\intercal}y=\lambda_1y^{\intercal}x - \lambda_2x^{\intercal}y \\ 0 = (\lambda_1 - \lambda_2)y^{\intercal}x$

Since $\lambda_1 \neq \lambda_2$, this forces $y^{\intercal}x = 0$; hence $x$ and $y$ are orthogonal.
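
A quick numerical check of this argument (the matrix is an arbitrary example, and np.linalg.eigh is used only to produce concrete eigenpairs):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
A = A + A.T                                    # symmetric

# The scalar identity behind the subtraction step holds for *any* x and y:
x, y = rng.standard_normal(4), rng.standard_normal(4)
print(np.isclose(y @ A @ x, x @ A.T @ y))      # True: y^T A x = x^T A^T y

# For eigenvectors with distinct eigenvalues, the argument then forces y^T x = 0.
lam, V = np.linalg.eigh(A)                     # eigenvalues are distinct here
x, y = V[:, 0], V[:, 1]
print(np.isclose((lam[0] - lam[1]) * (y @ x), 0.0))   # True
print(np.isclose(y @ x, 0.0))                         # True: x and y are orthogonal
```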

BGa
  • 141
  • 1
  • 3
3

How about this: let $A$ be symmetric; then there exists a matrix $D$ such that $A=QDQ^T$. Taking the transpose of $A$, namely

$$\left(A\right)^T = \left(QDQ^T\right)^T $$ $$A^T = \left(Q^T\right)^TD^TQ^T$$ $$A^T = QDQ^T$$

thus $A^T = A$ if and only if $A$ is symmetric.

It is noteworthy that $D^T = D$ since $D$ is diagonal, and $Q$ is the matrix of orthonormal eigenvectors of $A$, thus $Q^T = Q^{-1}$.
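
As the comments below point out, this argues the converse direction: it checks that $QDQ^T$ is symmetric whenever $Q$ is orthogonal and $D$ is diagonal, rather than constructing the decomposition. A tiny numpy illustration of that converse, with arbitrary example matrices:

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # an arbitrary orthogonal matrix
D = np.diag([1.0, -2.0, 7.0])                      # an arbitrary diagonal matrix

A = Q @ D @ Q.T
print(np.allclose(A, A.T))               # True: Q D Q^T is always symmetric
print(np.allclose(Q.T @ Q, np.eye(3)))   # True: Q^T = Q^{-1}
```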

user127.0.0.1
  • 6,957
  • 6
  • 28
  • 44
Aiden Strydom
  • 189
  • 2
  • 6
  • 8
    This solves the wrong direction of the problem. – Phonon May 20 '14 at 00:05
  • 2
    $A^t = A$ is related to eigenvectors how? – Don Larynx Mar 04 '15 at 22:07
  • 1
    When you start with $A=A^T$ and the eigendecomposition is written as $A=QDQ^{-1}$, then the transpose of this yields $A^T=\left(Q^{-1}\right)^TDQ^T$, but has to be equal to the initial decomposition, which will only be the case if $Q^{-1}=Q^T$ which is the definition of an orthogonal matrix. – Kwin van der Veen Jan 27 '16 at 01:45
  • Thank you. Is "let $A$ be symmetric, then there exists a matrix $D$ such that $A=QDQ^T$" a theorem? Why did you assume that? – Avv Feb 24 '21 at 01:55
  • Your assumption that $A = QDQ^{\top}$, $D$ diagonal, is precisely what's being asked of you to prove. – Joe Shmo Aug 14 '21 at 13:41
2

Note that a real symmetric matrix is a linear operator on Euclidean space with respect to the standard (orthonormal) basis. So the fact that it equals its conjugate transpose implies it is self-adjoint.

For two distinct eigenvalues $\lambda_1, \lambda_2$ and corresponding eigenvectors $v_1, v_2$, $$(\lambda_1-\lambda_2)\langle v_1, v_2\rangle=\langle Tv_1,v_2\rangle-\langle v_1,\bar{\lambda_2} v_2\rangle=\langle Tv_1,v_2\rangle-\langle v_1,T^* v_2\rangle=0,$$ where the second-to-last equality follows from the properties of a self-adjoint (thus normal) linear operator (Lemma below).

Lemma: Assume $T$ is normal. If $(\lambda, v)$ is an eigenvalue and eigenvector of $T$, then $(\bar{\lambda}, v)$ is an eigenvalue and eigenvector of the adjoint $T^*$. (Proof: immediate from the definition of normality.)
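
A small numerical check of the two facts being used here, for a real symmetric matrix viewed with the complex inner product $\langle u, v\rangle = v^* u$ (np.vdot(v, u) in numpy); the matrix is an arbitrary example.

```python
import numpy as np

rng = np.random.default_rng(4)
T = rng.standard_normal((4, 4))
T = T + T.T                              # real symmetric, hence self-adjoint: T^* = T

def inner(a, b):
    return np.vdot(b, a)                 # <a, b> = b^* a, linear in the first argument

# Self-adjointness, for arbitrary (even complex) vectors u, v:
u = rng.standard_normal(4) + 1j * rng.standard_normal(4)
v = rng.standard_normal(4) + 1j * rng.standard_normal(4)
print(np.isclose(inner(T @ u, v), inner(u, T.conj().T @ v)))   # True: <Tu, v> = <u, T* v>

# The Lemma, for a concrete eigenpair: T* v = conj(lambda) v.
lam, V = np.linalg.eigh(T)
l1, v1 = lam[0], V[:, 0]
print(np.allclose(T.conj().T @ v1, np.conj(l1) * v1))          # True

# Hence, for eigenvectors of distinct eigenvalues, <v1, v2> must vanish:
l2, v2 = lam[1], V[:, 1]
print(np.isclose((l1 - l2) * inner(v1, v2), 0.0), np.isclose(inner(v1, v2), 0.0))
```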

Daniel Li
  • 2,821
  • 11
  • 24