187

I'm learning multivariate analysis, and I took two semesters of linear algebra as a freshman.

Eigenvalues and eigenvectors are easy to calculate, and the concepts are not difficult to understand. I found that there are many applications of eigenvalues and eigenvectors in multivariate analysis. For example:

In principal components, the proportion of total population variance due to the $k$th principal component equals $$\frac{\lambda_k}{\lambda_1+\lambda_2+\cdots+\lambda_p},$$ where $\lambda_1,\ldots,\lambda_p$ are the eigenvalues of the population covariance matrix.
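
For concreteness, here is a small numpy sketch of that formula (the covariance matrix below is just made up for illustration):

```python
import numpy as np

# An illustrative 3x3 covariance matrix (symmetric positive semi-definite).
sigma = np.array([[4.0, 2.0, 0.0],
                  [2.0, 3.0, 1.0],
                  [0.0, 1.0, 2.0]])

# Eigenvalues of a covariance matrix are the variances along the principal axes.
eigvals, eigvecs = np.linalg.eigh(sigma)
eigvals = eigvals[::-1]             # sort from largest to smallest

# Proportion of total variance explained by each principal component.
proportions = eigvals / eigvals.sum()
print(proportions)                  # entries sum to 1
```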

I think that multiplying an eigenvector by its eigenvalue has the same geometric effect as multiplying it by the matrix.

I think my former understanding may be too naive, which is why I cannot see the link between eigenvalues and their applications in principal components and elsewhere.

I know how to derive almost every step from the assumptions to the results mathematically. I'd like to know how to understand eigenvalues and eigenvectors intuitively or geometrically in the context of multivariate analysis (in linear algebra is also fine).

Thank you!

Martin Sleziak
  • 50,316
  • 18
  • 169
  • 342
Jill Clover
  • 4,597
  • 7
  • 29
  • 46
  • 4
    I guess you want intuition specifically for the application of eigenvalues and eigenvectors to principal components, in which case this previous question is closely related: [Why is the eigenvector of a covariance matrix equal to a principal component?](http://math.stackexchange.com/q/23596/856) –  Nov 24 '12 at 05:06
  • @RahulNarain Thank you! There are at least 5 applications (factor analysis, canonical analysis, ...) of eigenvalues in multivariate analysis in my textbook. So.... – Jill Clover Nov 24 '12 at 05:14
  • 1
    You think that eigenvalues and eigenvectors are easy to understand. This is probably because you essentially deal with real symmetric matrices. For a general matrix, I don't think the two concepts are really that intuitive. – user1551 Nov 24 '12 at 10:14
  • @user1551 Could you share your ideas and answer the question? Thanks. – Jill Clover Nov 25 '12 at 04:33

7 Answers

195

Personally, I feel that intuition isn't something which is easily explained. Intuition in mathematics is synonymous with experience and you gain intuition by working numerous examples. With my disclaimer out of the way, let me try to present a very informal way of looking at eigenvalues and eigenvectors.

First, let us forget about principal component analysis for a little bit and ask ourselves exactly what eigenvectors and eigenvalues are. A typical introduction to spectral theory presents eigenvectors as vectors which are fixed in direction under a given linear transformation. The scaling factor of these eigenvectors is then called the eigenvalue. Under such a definition, I imagine that many students regard this as a minor curiosity, convince themselves that it must be a useful concept and then move on. It is not immediately clear, at least to me, why this should serve as such a central subject in linear algebra.

Eigenpairs are a lot like the roots of a polynomial. It is difficult to describe why the concept of a root is useful, not because there are few applications but because there are too many. If you tell me all the roots of a polynomial, then mentally I have an image of how the polynomial must look. For example, all monic cubics with three real roots look more or less the same. So one of the most central facts about the roots of a polynomial is that they ground the polynomial. A root literally roots the polynomial, limiting its shape.

Eigenvectors are much the same. If you have a line or plane which is invariant, then there is only so much you can do to the surrounding space without breaking the limitations. So in a sense eigenvectors are not important because they themselves are fixed, but rather because they limit the behavior of the linear transformation. Each eigenvector is like a skewer which helps to hold the linear transformation in place.

Very (very, very) roughly then, the eigenvalues of a linear mapping are a measure of the distortion induced by the transformation, and the eigenvectors tell you how that distortion is oriented. It is precisely this rough picture which makes PCA very useful.

Suppose you have a set of data which is distributed as an ellipsoid oriented in $3$-space. If this ellipsoid were very flat in some direction, then in a sense we can recover much of the information that we want even if we ignore the thickness of the ellipsoid. This is what PCA aims to do. The eigenvectors tell you how the ellipsoid is oriented and the eigenvalues tell you where the ellipsoid is distorted (where it is flat). If you choose to ignore the "thickness" of the ellipsoid, then you are effectively collapsing it along the eigenvector in that direction; you are projecting the ellipsoid onto the most informative directions to look at. To quote Wikipedia:

PCA can supply the user with a lower-dimensional picture, a "shadow" of this object when viewed from its (in some sense) most informative viewpoint
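
To make this ellipsoid picture concrete, here is a rough numpy sketch (the covariance matrix is just an illustrative choice, not real data): it draws an elongated Gaussian cloud, eigendecomposes its sample covariance, and projects onto the longest axis to obtain exactly this kind of "shadow".

```python
import numpy as np

rng = np.random.default_rng(0)

# Sample a 2-D point cloud that is stretched along one direction (a flat "ellipse").
cov_true = np.array([[9.0, 3.0],
                     [3.0, 2.0]])
data = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=2000)

# Eigen-decompose the sample covariance matrix.
sample_cov = np.cov(data, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(sample_cov)   # ascending order

# The last eigenvector points along the long axis of the cloud,
# and its eigenvalue is the variance in that direction.
print("principal direction:", eigvecs[:, -1])
print("variance captured :", eigvals[-1] / eigvals.sum())

# Projecting onto that single direction keeps most of the information
# while ignoring the "thin" direction of the ellipsoid.
shadow = data @ eigvecs[:, -1]
```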

EuYu
  • 39,348
  • 8
  • 85
  • 129
  • 6
    "Eigenpairs are a lot like the roots of a polynomial." I love it! Spectral decomposition has same "idea" to factorization of polynomial function according to your idea. Does it right? – Jill Clover Nov 24 '12 at 05:49
  • 1
    Well, spectral decomposition is similar to polynomial factorization in the sense that they both determine the shape of the object. Think of them as structurally determining factorizations. – EuYu Nov 24 '12 at 05:56
  • 7
    +1. I dig the polynomial analogy and the skewer metaphor. – Potato Jun 08 '13 at 02:45
  • Good description! "Eigenpairs are a lot like the roots of a polynomial", and the eigenvalues _ARE_ the roots of the characteristic polynomial of the matrix. – philwalk Jun 03 '17 at 15:08
  • By saying "Very (very, very) roughly" do you mean that the statement is invalid but close to a real property or it's incomplete? – Kentzo Sep 27 '18 at 02:07
  • @Kentzo I mostly mean that the explanation is incomplete. I am trying to describe in words exactly what the mathematical picture of PCA is trying to convey, and that's necessarily an incomplete description. – EuYu Sep 27 '18 at 08:33
  • hi @EuYu I wanted to let you know that I found this answer quite useful. I asked a similar question when I was in undergrad and didn't get a satisfactory explanation. Thanks for taking the time to paint a picture here – Erin Sep 26 '19 at 01:46
  • Commenting here, as it's not worth a separate post. I found [this](https://www.youtube.com/watch?v=PFDu9oVAE-g&list=PLZHQObOWTQDPD3MizzM2xVFitgF8hE_ab&index=14) explanation from 3Blue1Brown very enlightening. Hope it helps those who are looking for the 'intuitions'. – Bitswazsky Jul 24 '21 at 06:38
  • @EuYu thank you for writing this it was very helpful ( skewers and distortion ). I'd love to read more informal treatments like this, do you write elsewhere or can recommend some readings ? – qbert65536 Mar 31 '22 at 21:28
107

First, let us think about what a square matrix does to a vector. Consider a matrix $A \in \mathbb{R}^{n \times n}$. Let us see what the matrix $A$ acting on a vector $x$ does to this vector. By action, we mean multiplication, i.e. we get a new vector $y = Ax$.

The matrix acting on a vector $x$ does two things to the vector $x$.

  1. It scales the vector.
  2. It rotates the vector.

However, for any matrix $A$, there are some favored vectors/directions. When the matrix acts on these favored vectors, the action essentially results in just scaling the vector. There is no rotation. These favored vectors are precisely the eigenvectors, and the amount by which each of these favored vectors is stretched or compressed is the corresponding eigenvalue.

So why are these eigenvectors and eigenvalues important? Consider the eigenvector corresponding to the maximum (absolute) eigenvalue. If we take a vector along this eigenvector, then the action of the matrix is maximal. **No other vector, when acted on by this matrix, will get stretched as much as this eigenvector.**

Hence, if a vector were to lie "close" to this eigen direction, then the "effect" of action by this matrix will be "large" i.e. the action by this matrix results in "large" response for this vector. The effect of the action by this matrix is high for large (absolute) eigenvalues and less for small (absolute) eigenvalues. Hence, the directions/vectors along which this action is high are called the principal directions or principal eigenvectors. The corresponding eigenvalues are called the principal values.
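
Here is a minimal numpy sketch of this "maximum stretch" picture (the matrix is an arbitrary symmetric example; as the comments below point out, for non-symmetric matrices the direction of maximum stretch is a singular vector rather than an eigenvector):

```python
import numpy as np

rng = np.random.default_rng(1)

# A symmetric matrix, so its eigenvectors really are the directions of maximal stretch
# (for non-symmetric matrices that role is played by the singular vectors instead).
A = np.array([[3.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(A)
top_direction = eigvecs[:, np.argmax(np.abs(eigvals))]

# Compare the stretch of the top eigenvector with many random unit vectors.
random_dirs = rng.normal(size=(1000, 2))
random_dirs /= np.linalg.norm(random_dirs, axis=1, keepdims=True)
stretches = np.linalg.norm(random_dirs @ A.T, axis=1)

print(np.linalg.norm(A @ top_direction))   # largest possible stretch
print(stretches.max())                     # never exceeds it
```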

amWhy
  • 204,278
  • 154
  • 264
  • 488
  • 7
    As well as things mentioned by others, in this answer the sentence in bold is a nice intuition to keep hold of. – Dan Stowell Jul 25 '14 at 11:54
  • 3
    Thank you very much anonymous author for this great answer! – Vanni Oct 24 '14 at 12:35
  • 1
    Great answer!! Thank you! – ruoho ruotsi Apr 09 '15 at 05:56
  • 11
    The bold sentence is not true for non-symmetric matrices; see for example http://math.stackexchange.com/questions/1437569/do-eigenvectors-correspond-to-direction-of-maximum-scaling – Greg Martin Sep 16 '15 at 06:57
  • 1
    Another way to see that the sentence in bold is wrong for non-symmetric matrices, is to note that the sentence in bold describes a (right) singular vector of the matrix, and from standard SVD examples we know that right singular vectors of a real square matrix are not necessarily eigenvectors. https://qph.is.quoracdn.net/main-qimg-8c1e8dba50b7b9a2c03d93f98e3f2d75?convert_to_webp=true – littleO Jul 05 '16 at 23:25
  • 1
    A bit misleading, as Greg pointed out, this is NOT true for non-symmetric matrices which is most matrices – Rahul Deora Sep 05 '19 at 13:10
12

An eigenvector is the axis on which the matrix operation hinges, within the paradigm of a specific operation. The eigenvalue is how important it is, again within the paradigm of the specific operation, and relative to the eigenvalues of other eigenvectors. This is clear in the example from the Wikipedia history section:

Euler studied the rotational motion of a rigid body and discovered the importance of the principal axes. Lagrange realized that the principal axes are the eigenvectors of the inertia matrix. [1]

That is obviously a very limited example. Eigenvectors are pretty ridiculously useful when you realize that scalars might be complex numbers or any kind of number, vectors might be functions or frequencies, and instead of matrix multiplication the transformation can be an operator like the derivative from calculus. [simple english wikipedia.]

When you use eigenvector and eigenvalue analysis on a different sort of matrix, like the adjacency matrix of a directed graph representing links between websites, you can come up with a large number of eigenvectors, each with a different eigenvalue, and the largest one (known as the principal eigenvector) can be used as a proxy for the 'best option.' That's how Google PageRank worked originally [2]. But you could intuitively grasp the eigenvectors as an analysis of the extent to which the adjacency matrix (and the network it represents) hinges on each website for a given operation, and their eigenvalues demonstrate the magnitude of the 'hinging'. PageRank did this with an operation incorporating the keywords.

  1. https://en.wikipedia.org/wiki/Eigenvalues_and_eigenvectors#History
  2. https://www.scottaaronson.com/blog/?p=1820
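
As a rough illustration of the "principal eigenvector as best option" idea, here is a toy power-iteration sketch (the link matrix is made up, and this is only the bare-bones idea behind PageRank, not Google's actual system):

```python
import numpy as np

# Toy column-stochastic link matrix for 4 pages: entry [i, j] is the probability
# of following a link from page j to page i. Purely illustrative numbers.
links = np.array([[0.0, 0.5, 0.5, 0.0],
                  [1/3, 0.0, 0.0, 0.5],
                  [1/3, 0.0, 0.0, 0.5],
                  [1/3, 0.5, 0.5, 0.0]])

# Power iteration converges to the principal eigenvector (eigenvalue 1 here),
# which serves as the "importance" score of each page.
rank = np.full(4, 0.25)
for _ in range(100):
    rank = links @ rank
    rank /= rank.sum()

print(rank)
```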
Martin Sleziak
  • 50,316
  • 18
  • 169
  • 342
Peter Scheyer
  • 121
  • 1
  • 2
10

IMO, to understand eigenvalues $\lambda_i$ and eigenvectors $\textbf{V}$, it is important to understand what the matrix $\textbf{A}$ in a set of equations $\textbf{Ax}=\textbf{b}$ does. The matrix $\textbf{A}$ simply "transforms" a vector $\textbf{x}$ into another vector $\textbf{b}$ by taking a linear combination of its columns. The transformation happens within the same space or subspace.

Sometimes we only want to know what the vector $\textbf{b}$ will be when the transformation is applied, that is, we evaluate $\textbf{Ax}=\textbf{b}$. Other times we are interested in the reverse problem and want to solve $\textbf{x}=\textbf{A}^{-1}\textbf{b}$.

In some situations, however, we are interested in the transformation itself. Eigenvectors $\textbf{V}$ and the corresponding eigenvalues $\lambda_i$ let us examine such a transformation, that is, how exactly the matrix $\textbf{A}$ works on/transforms vectors. Regardless of any physical meaning, eigenvectors are the directions along which the linear transformation acts purely by scaling, and the eigenvalues $\lambda_i$ are the scale factors along those directions. For symmetric matrices, the eigenvectors are orthogonal to one another. That's it. Once we know those, we can determine how the matrix $\textbf{A}$ transforms vectors.

PCA is one of the applications of eigenvectors and eigenvalues. In PCA, the problem concerns the data and the variance accounted for by the components. In the original data set, the variance is spread across all components, and each original component shares some percentage of its variance with the others. When PCA is computed from a covariance matrix, each eigenvalue $\lambda_i$ represents the variance along the corresponding principal component. The directions, or axes, of the principal components are given by the eigenvectors $\textbf{V}$. The eigenvectors of a covariance matrix, and therefore the principal components, are orthogonal to one another. Calculating the $\lambda$'s and $\textbf{V}$ gives us the answer.
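
A small numpy sketch of this viewpoint (the matrix is just an illustrative symmetric example): once we know $\textbf{V}$ and the $\lambda_i$'s, applying $\textbf{A}$ amounts to switching to the eigenbasis, scaling each direction by its eigenvalue, and switching back.

```python
import numpy as np

# An illustrative symmetric matrix; its eigendecomposition A = V diag(lam) V^T
# tells us exactly how A transforms any vector.
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
lam, V = np.linalg.eigh(A)

x = np.array([1.0, -2.0])

# Applying A is: express x in the eigenbasis, scale each coordinate by its
# eigenvalue, then map back to the standard basis.
coords = V.T @ x          # coordinates of x in the eigenbasis
y = V @ (lam * coords)    # scale along each eigen-direction and map back

print(np.allclose(y, A @ x))   # True
```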

Celdor
  • 581
  • 4
  • 14
  • 3
    A comment about some of the things you've said. It's not exactly true that "eigenvectors are the directions along which the transformations occur". The matrix transforms _every_ vector and hence every direction. What is true is that eigenvectors are the directions in which the transformation is invariant. Also, eigenvectors are not all orthogonal to each other. In fact, that's not true unless the matrix is normal. – EuYu Jun 08 '13 at 02:07
  • 1
    Hey EuYu. Thanks for pointing out the mistakes. You are absolutely right about directions and orthogonality of eigenvectors. – Celdor Jun 08 '13 at 03:38
  • Could you possibly expand this explanation to the generalized eigenvalue problem $Ax = \lambda Bx$? – Ethan Jan 28 '16 at 23:33
7

I strongly recommend that you visit this page. It perfectly visualizes the concept of eigenvalues and eigenvectors.

Martin Sleziak
  • 50,316
  • 18
  • 169
  • 342
M a m a D
  • 957
  • 9
  • 27
5

Given a linear operator $T:V \to V$, it's natural to try to find a basis $\beta$ of $V$ so that $[T]_{\beta}$, the matrix of $T$ with respect to $\beta$, is as simple as possible. Ideally, $[T]_{\beta}$ would be diagonal. And it's easy to see that if $[T]_{\beta}$ is diagonal, then $\beta$ is a basis of eigenvectors of $T$. This is one way that we might discover the idea of eigenvectors, and recognize their significance.

Here are some details. Suppose that $\beta = (v_1,\ldots,v_n)$ is an ordered basis for $V$, and $[T]_{\beta}$ is diagonal: \begin{equation} [T]_{\beta} = \begin{bmatrix} \lambda_1 & & \\ & \ddots & \\ & & \lambda_n \end{bmatrix}. \end{equation} The first column of $[T]_{\beta}$ is $[T(v_1)]_{\beta}$, the coordinate vector of $T(v_1)$ with respect to $\beta$. This shows that \begin{equation} [T(v_1)]_{\beta} = \begin{bmatrix} \lambda_1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}. \end{equation} And this means that \begin{align} T(v_1) &= \lambda_1 v_1 + 0 \cdot v_2 + \cdots + 0 \cdot v_n \\ &= \lambda_1 v_1. \end{align} We see that $T(v_1)$ is just a scalar multiple of $v_1$. In other words, $v_1$ is an eigenvector of $T$, with eigenvalue $\lambda_1$.

Similarly, $v_2,\ldots,v_n$ are also eigenvectors of $T$.
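
A quick numerical check of this (with a made-up basis and made-up diagonal entries): if $[T]_{\beta}$ is diagonal, then $T$ applied to each basis vector just scales it.

```python
import numpy as np

# Illustrative sketch: build an operator whose matrix is diagonal with respect
# to the (made-up) basis beta = (v1, v2), and check that the basis vectors are
# indeed eigenvectors.
V = np.array([[1.0, 1.0],
              [0.0, 1.0]])          # columns are the basis vectors v1, v2
D = np.diag([2.0, 5.0])             # [T]_beta, diagonal by construction

T = V @ D @ np.linalg.inv(V)        # matrix of T in the standard basis

for i, eigval in enumerate([2.0, 5.0]):
    v = V[:, i]
    print(np.allclose(T @ v, eigval * v))   # True: T(v_i) = lambda_i * v_i
```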

littleO
  • 48,104
  • 8
  • 84
  • 154
1

The determinant tells you by how much the linear transformation associated with a matrix scales up or down the area/volume of shapes. Eigenvalues and eigenvectors provide you with another useful piece of information: they tell you by how much the linear transformation scales the sides of certain parallelograms, namely those whose sides lie along eigenvector directions. This also makes clear why the determinant of a matrix is equal to the product of its eigenvalues: e.g., in two-dimensional space, if the linear transformation doubles the length of one pair of parallel sides of such a parallelogram (one eigenvalue is equal to $2$) and triples the length of the other pair (the other eigenvalue is equal to $3$), then the area of the parallelogram is multiplied by $2 \times 3 = 6$. See https://www.statlect.com/matrix-algebra/eigenvalues-and-eigenvectors for more details.
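
A quick numerical check of the "determinant equals the product of the eigenvalues" claim, using a made-up matrix with eigenvalues $2$ and $3$:

```python
import numpy as np

# Illustrative matrix with eigenvalues 2 and 3 (it scales one eigen-direction
# by 2 and the other by 3, so areas get multiplied by 6).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])

eigvals = np.linalg.eigvals(A)
print(eigvals)                               # [2. 3.]
print(np.linalg.det(A), np.prod(eigvals))    # both 6.0
```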

Chris Tang
  • 365
  • 2
  • 13
user4422
  • 227
  • 1
  • 13