
(Note: I'm only interested in real-valued matrices here, so I'm using "transpose" and "symmetric" instead of the more general "transjugate" and "Hermitian" in the hope that it will simplify the proof. But the theorem apparently holds for complex-valued matrices as well.)

The Rayleigh quotient $R(M,v)$ of a symmetric matrix $M$ and a nonzero vector $v$ is defined as $\frac{v^T M v}{v^T v}$, where $x^T$ is the matrix transpose of $x$.

I've been told that the vector $v$ which gives the largest Rayleigh quotient is, in fact, the eigenvector corresponding to the largest eigenvalue of $M$. And furthermore, the value of the quotient in this case is equal to that eigenvalue. However, I've been unable to find a full proof of this fact, or an explanation of why it should work this way.
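
The claim can at least be checked numerically before looking for a proof. Below is a minimal sketch in plain Python: the $2\times 2$ matrix is an illustrative assumption (its eigenvalues are $3$ and $1$, with eigenvectors along $(1,1)$ and $(1,-1)$), and `rayleigh_quotient` is a helper name introduced here, not something from a library.

```python
import math

def rayleigh_quotient(M, v):
    """Return (v^T M v) / (v^T v) for a square matrix M (list of rows) and nonzero v."""
    Mv = [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]
    return sum(v_i * w_i for v_i, w_i in zip(v, Mv)) / sum(v_i * v_i for v_i in v)

# Illustrative symmetric matrix (an assumption, not from the question):
# eigenvalues 3 and 1, eigenvectors along (1, 1) and (1, -1).
M = [[2.0, 1.0], [1.0, 2.0]]

# Sample unit vectors v = (cos t, sin t) over a half-turn, since R(M, v) = R(M, -v).
steps = 1800
angles = [math.pi * k / steps for k in range(steps)]
best_angle = max(angles, key=lambda t: rayleigh_quotient(M, [math.cos(t), math.sin(t)]))
best_value = rayleigh_quotient(M, [math.cos(best_angle), math.sin(best_angle)])

print(best_value)                # close to 3, the largest eigenvalue
print(math.degrees(best_angle))  # close to 45 degrees, i.e. the direction (1, 1)
```

A sweep like this only illustrates the statement for one matrix; it does not explain why it holds.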

Why is there this connection between the Rayleigh quotient and the eigenvalues? Anything from an intuitive explanation to a formal proof would be appreciated.

Draconis

3 Answers


First, note that $R$ does not depend on the length of $v$, so we might as well impose the constraint $|v|^2=1$.

We maximize $R$ subject to this constraint by using a Lagrange multiplier: form $v^TMv-\lambda (|v|^2-1)$ and differentiate with respect to the components of $v$. Since $M$ is symmetric, this gives $2Mv-2\lambda v=0$, i.e. $Mv=\lambda v$, so the critical points are precisely the unit eigenvectors of $M$. If $v$ is an eigenvector with $Mv=\lambda v$, then $R(M,v)=\frac{v^T(\lambda v)}{v^Tv}=\lambda$, so the value of $R$ is the corresponding eigenvalue.
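
An equivalent check that avoids the constraint: for symmetric $M$, the gradient of the unconstrained quotient is $\nabla_v R = 2\,(Mv - R\,v)/(v^Tv)$, which vanishes exactly when $v$ is an eigenvector. The sketch below is only an illustration under that formula; the matrix and the helper names are assumptions chosen for the example.

```python
def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def rayleigh_quotient(M, v):
    return dot(v, mat_vec(M, v)) / dot(v, v)

def rayleigh_gradient(M, v):
    """Gradient of R(M, v) with respect to v, valid for symmetric M."""
    r = rayleigh_quotient(M, v)
    Mv = mat_vec(M, v)
    return [2.0 * (mv_i - r * v_i) / dot(v, v) for mv_i, v_i in zip(Mv, v)]

# Illustrative matrix (an assumption): eigenpairs (3, (1, 1)) and (1, (1, -1)).
M = [[2.0, 1.0], [1.0, 2.0]]

for v in ([1.0, 1.0], [1.0, -1.0], [1.0, 0.0]):
    # Gradient is zero at the two eigenvectors, nonzero at (1, 0).
    print(v, rayleigh_quotient(M, v), rayleigh_gradient(M, v))
```

At the two eigenvectors the quotient equals the eigenvalue and the gradient is zero; at $(1,0)$, which is not an eigenvector, the gradient is nonzero.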

Mike Hawk
  • Sorry, why does $R$ not depend on the length of $v$? – user614287 May 17 '19 at 16:05
  • @mathpadawan $R(M,v) = R(M, cv)$ for any $c \neq 0$. Plug it in and you will see this. – jodag Sep 18 '19 at 05:19
  • If I am not mistaken, this answers why any eigenvalue corresponds to an extremum, but does not tell why the maximum eigenvalue corresponds to the maximum of the Rayleigh quotient. – Lukas Jan 24 '20 at 15:03
  • @Lukas, it is easy to see that the value of $R$ at an extremum is the corresponding eigenvalue, so the largest extremum is the largest eigenvalue. – Mike Hawk Apr 20 '20 at 16:25

The matrix $M$ describes a linear map $M:\>{\mathbb R}^n=V\to V$ of the Euclidean vector space $V$ in terms of the standard basis of $V$. The Rayleigh quotient $$R(M,v):={\langle v, Mv\rangle \over |v|^2}$$ is defined independently of the chosen basis, and for orthonormal bases is given by the formula you quote. Now for a symmetric matrix $M$ there is an orthonormal basis of eigenvectors that diagonalizes $M$. With respect to such a basis, writing $v_i$ for the coordinates of $v$ and $\lambda_i$ for the eigenvalues, we have $$R(M,v)={\sum_{i=1}^n \lambda_i v_i^2\over\sum_{i=1}^nv_i^2}\ ,$$ and this is maximal when $v$ is an element of the eigenspace $E_\lambda$ corresponding to the eigenvalue $\lambda:=\max{\rm spec}(M)$.
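
The diagonal form makes the bound easy to probe numerically. The sketch below is a rough illustration, not part of the proof: the spectrum `eigenvalues` is a hypothetical choice, and the coordinates are random samples taken in the eigenbasis.

```python
import random

def diagonal_rayleigh_quotient(eigenvalues, coords):
    """R(M, v) when M is diagonal in an orthonormal eigenbasis; coords are v's components there."""
    weights = [c * c for c in coords]
    return sum(lam * w for lam, w in zip(eigenvalues, weights)) / sum(weights)

eigenvalues = [-1.0, 0.5, 4.0]  # hypothetical spectrum, chosen for illustration
lam_min, lam_max = min(eigenvalues), max(eigenvalues)

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [[rng.uniform(-1.0, 1.0) for _ in eigenvalues] for _ in range(1000)]
values = [diagonal_rayleigh_quotient(eigenvalues, v) for v in samples]

# Every sampled quotient should lie between the smallest and largest eigenvalue.
print(min(values), max(values))

# Putting all the weight on the last coordinate gives the largest eigenvalue.
print(diagonal_rayleigh_quotient(eigenvalues, [0.0, 0.0, 1.0]))  # 4.0
```

Each quotient is an average of the $\lambda_i$ with nonnegative weights $v_i^2$, which is why it cannot leave the interval between the smallest and largest eigenvalue.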

Christian Blatter
  • Supposing $v$ is a unit vector, why is $\sum_{i=1}^{n} \lambda_i v_i^{2}$ maximal when $v$ is the eigenvector corresponding to the largest eigenvalue of $M$? – IntegrateThis Jun 28 '20 at 19:17
  • The RHS of my second equation is a weighted mean of the $\lambda_i$ with weights $v_i^2$ summing to $1$. This weighted mean is maximal when the weight of the maximal $\lambda_i$ is $1$, and all other $v_i=0$. This means $v$ is a basis vector, namely an eigenvector of $M$ belonging to the maximal $\lambda_i$. – Christian Blatter Jun 30 '20 at 14:20

I really enjoyed the answers above, and they helped me gain some geometrical intuition. I cannot comment under the answers, so I am sharing my thoughts here. Hopefully someone can help me correct or improve my answer. Many thanks.

1) Pre-multiplying a vector $v$ by a symmetric matrix $M$ rotates and stretches the vector within the original space. The result can be written as $Mv = ku$, where $u$ is a unit vector giving the new direction and $k = |Mv|$ is a scalar.

2) An eigenvector of $M$ gives a direction in which a vector keeps the same direction after pre-multiplying by $M$ (up to a sign flip when the eigenvalue is negative); it is only scaled.

3) For a unit vector $v$, the Rayleigh quotient can be viewed as the cosine of the angle $\theta$ between the original vector $v$ and $Mv$, multiplied by the scalar $k$: $R(M,v) = v^T M v = k\cos\theta$.

The answer can then be inferred from these facts.
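
The cosine view in 3) can be spot-checked numerically. For a general nonzero $v$ the scalar is the stretch factor $|Mv|/|v|$ (this normalisation is my reading of the argument), so the identity to test is $R(M,v) = \frac{|Mv|}{|v|}\cos\theta$. A small sketch in plain Python, with an illustrative matrix and helper names that are assumptions:

```python
import math

def mat_vec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

# Illustrative symmetric matrix (an assumption): eigenvectors along (1, 1) and (1, -1).
M = [[2.0, 1.0], [1.0, 2.0]]

rows = []
for v in ([1.0, 1.0], [1.0, 0.0], [1.0, -3.0]):
    Mv = mat_vec(M, v)
    stretch = norm(Mv) / norm(v)                    # how much M lengthens v
    cos_theta = dot(v, Mv) / (norm(v) * norm(Mv))   # cosine of the angle between v and Mv
    rayleigh = dot(v, Mv) / dot(v, v)
    rows.append((v, rayleigh, stretch, cos_theta))
    print(v, rayleigh, stretch * cos_theta, cos_theta)
```

For the eigenvector $(1,1)$ the cosine is $1$ up to rounding, so the quotient is just the stretch factor; for the other vectors the cosine is smaller than $1$.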

inavda
ANuo
  • Thank you very much for this answer. I had never thought about the Rayleigh quotient in this way. In this view, the maximum value of the quotient is attained at a vector $v$ that keeps its direction after pre-multiplying by $M$, i.e. an eigenvector (for the maximum, the eigenvector corresponding to the largest eigenvalue). Its minimum value is attained at the eigenvector corresponding to the smallest eigenvalue. – artha Aug 18 '18 at 08:34
  • @artha many thanks for your editing and comments! I am not so confident in my English, so I am not sure if we should add "direction" after "the same" in 2) so that other people can easily understand. – ANuo Aug 20 '18 at 07:23