I mean, you can say it's similar to a diagonal matrix, it has $n$ independent eigenvectors, etc., but what's the big deal about diagonalizability? Can I concretely perceive the difference between two linear transformations, one of which is diagonalizable and the other not, either by visualization or by figurative description?

For example, invertibility can be perceived: a non-invertible transformation must compress the space to $0$ in one or more directions, like squashing the space flat.

  • If a matrix is diagonalizable, you have a succinct way to describe how the matrix acts on the space. For instance, let $\{v_1, \dots, v_n\}$ be an eigenvector basis for your space; then $Av = c_1 \lambda_1 v_1 + \dots + c_n \lambda_n v_n$, where $v=c_1 v_1+\dots +c_n v_n$. This is by no means the only reason why we care about diagonalizability, though. – Fernando Martin Sep 10 '12 at 05:13
  • @FernandoMartin: Yeah, I knew that though. You can describe a diagonalizable transformation using merely expansion/shrinking in certain directions. – xzhu Sep 10 '12 at 05:18
  • This doesn't treat perception, so it's just a comment: it's pretty easy to take powers of a diagonalizable matrix $A$, for if $A = BDB^{-1}$ with $D$ diagonal then $A^n = BD^nB^{-1}$, and computing $D^n$ is just computing powers of the diagonal elements. I think in numerical methods (a subject about which I am completely ignorant, which did not stop me from grading for it…), where you want to approximate a solution of a differential equation via some fixed point/contraction business, this is a big win, computationally. – Dylan Moreland Sep 10 '12 at 06:42
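As a small illustration of that last comment (my own sketch, not from the thread), NumPy can be used to verify $A^n = BD^nB^{-1}$ on a concrete diagonalizable matrix:

```python
# Sketch: computing A^n via diagonalization, for a matrix
# chosen here purely for illustration (eigenvalues 5 and 2).
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# Columns of B are eigenvectors; eigvals holds the diagonal of D.
eigvals, B = np.linalg.eig(A)
n = 10

# A^n = B D^n B^{-1}, and D^n is just elementwise powers of the eigenvalues.
An_fast = B @ np.diag(eigvals**n) @ np.linalg.inv(B)

# Check against repeated multiplication.
An_slow = np.linalg.matrix_power(A, n)
assert np.allclose(An_fast, An_slow)
```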

5 Answers


Up to a change of basis, there are only two things a matrix can do.

  1. It can act like a scaling operator where it takes certain key vectors (eigenvectors) and scales them, or
  2. it can act as a shift operator where it takes a first vector, sends it to a second vector, the second vector to a third vector, and so forth, then sends the last vector in a group to zero.

It may be that for some collection of vectors it does scaling whereas for others it does shifting, or it can also do linear combinations of these actions (block scaling and shifting simultaneously). For example, the matrix $$ P \begin{bmatrix} 4 & & & & \\ & 3 & 1 & & \\ & & 3 & 1 &\\ & & & 3 & \\ & & & & 2 \end{bmatrix} P^{-1} = P\left( \begin{bmatrix} 4 & & & & \\ & 3 & & & \\ & & 3 & &\\ & & & 3 & \\ & & & & 2 \end{bmatrix} + \begin{bmatrix} 0& & & & \\ & 0& 1 & & \\ & & 0& 1 &\\ & & & 0& \\ & & & &0 \end{bmatrix}\right)P^{-1} $$ acts as the combination of a scaling operator on all the columns of $P$

  • $p_1 \rightarrow 4 p_1$, $p_2 \rightarrow 3 p_2$, ..., $p_5 \rightarrow 2 p_5$,

    plus a shifting operator on the 2nd, 3rd and 4th columns of $P$:

  • $p_4 \rightarrow p_3 \rightarrow p_2 \rightarrow 0$.

This idea is the main content behind the Jordan normal form.

Being diagonalizable means that it does not do any of the shifting, and only does scaling.

For a more thorough explanation, see this excellent blog post by Terry Tao: http://terrytao.wordpress.com/2007/10/12/the-jordan-normal-form-and-the-euclidean-algorithm/
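The scaling-plus-shifting description of the $5\times 5$ example above can be checked numerically. This is my own sketch, not part of the original answer; `J` is the Jordan-form factor and `p` holds the columns of a (hypothetical, randomly chosen) invertible $P$:

```python
# Sketch: verifying that A = P J P^{-1} scales all columns of P
# and additionally shifts p_4 -> p_3 -> p_2 within the 3-block.
import numpy as np

rng = np.random.default_rng(0)

J = np.array([[4, 0, 0, 0, 0],
              [0, 3, 1, 0, 0],
              [0, 0, 3, 1, 0],
              [0, 0, 0, 3, 0],
              [0, 0, 0, 0, 2]], dtype=float)

P = rng.standard_normal((5, 5))   # almost surely invertible
A = P @ J @ np.linalg.inv(P)

p = [P[:, i] for i in range(5)]   # columns p_1, ..., p_5 (0-indexed here)

# Pure scaling on the first and last columns: A p_1 = 4 p_1, A p_5 = 2 p_5.
assert np.allclose(A @ p[0], 4 * p[0])
assert np.allclose(A @ p[4], 2 * p[4])

# Scaling plus shifting on the middle block: A p_3 = 3 p_3 + p_2, etc.
assert np.allclose(A @ p[2], 3 * p[2] + p[1])
assert np.allclose(A @ p[3], 3 * p[3] + p[2])
```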

Nick Alger
  • Note that this only works over algebraically closed fields such as $\mathbb C$. For example, rotation in the plane $\mathbb R^2$ by 90 degrees does no scaling or shifting in the sense of this answer. – Alex Becker Sep 10 '12 at 17:55
  • Sure, but matrices acting on $\mathbb{R}^2$ can always be reinterpreted as acting on the larger space $\mathbb{C}^2$, and this is what is normally done both theoretically and in engineering practice. If this is done, 2D rotations, for example, have eigenvectors $(1,i)$ and $(1,-i)$. – Nick Alger Sep 10 '12 at 19:08
  • True, and that is what I would do, but it is still not doing either of the two things you list *on $\mathbb R^2$*, which is where one would ideally want to visualize a $2\times 2$ real matrix acting. – Alex Becker Sep 10 '12 at 19:11
  • Applause! This is exactly the answer I want. Yesterday I worked on this for hours and finally concluded that it's the shifting (or, as I call it, "shearing") that plays the critical role in diagonalizability. And I was shocked to find in my textbook that this is just the idea behind a famous theory: the "Jordan canonical form". I'm so excited now. And, thank you! – xzhu Sep 10 '12 at 23:19
  • Also perhaps of note, these ideas are highly related to the intuition behind modern Krylov methods for solving linear systems (eg, conjugate gradient, GMRES, etc). See for example, [The Idea Behind Krylov Methods](http://meyer.math.ncsu.edu/Meyer/PS_Files/Krylov.ps) – Nick Alger Sep 15 '12 at 01:38
  • @NickAlger could you explain a little bit more how the scaling/shift interpretation is related to the Krylov methods? I read the linked paper, but I didn't see the relation. – chaohuang Feb 07 '13 at 23:43

If an $n\times n$ real matrix $M$ is diagonalizable, then it corresponds to stretching $\mathbb R^n$ along $n$ linearly independent lines, the $i$-th by some factor $\lambda_i$. Otherwise this is not the case.

Alex Becker

I'll try an answer in a different (equivalent) direction: what happens when the matrix is not diagonalizable?

First of all, this must mean that some of the matrix's eigenvalues occur more than once. Otherwise the matrix really can't do anything other than simply stretch its eigenvectors by $\lambda_n$. So what if two eigenvalues are equal? Let's write down the simplest non-trivial non-diagonalizable example there is: $$\pmatrix{0 & 1 \cr 0 & 0}.$$ This (as you have correctly observed) crushes the space. But the important point is that it doesn't crush it to zero! Instead it only crushes it to some subspace. In general, if you have a nilpotent matrix (all eigenvalues vanish) there are many subspaces (of varying dimensions) to pick from, and so many different ways to crush the space. The zero operator sends everything to zero immediately, while a nontrivial nilpotent matrix takes some (finite) time, in the sense that for $T$ nilpotent, $T^n = 0$ for some $n$. In general, every nilpotent matrix is similar to a matrix that looks something like this $$\pmatrix{0 & 1 & 0 & 0 \cr 0 & 0 & 1 & 0 \cr 0 & 0 & 0 & 0 \cr 0 & 0 & 0 & 0}$$ with some ones just above the diagonal. The precise position of those ones (if there are any) determines which subspace is being crushed to which, and so on. You are very much encouraged to play with such matrices in $\mathbb R^3$ (where there are not too many possibilities for what a nilpotent matrix can look like, but still enough to show what happens).
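To encourage that experimentation, here is a small NumPy sketch (mine, not from the answer) showing how a $3\times 3$ nilpotent matrix crushes the space in stages rather than all at once:

```python
# Sketch: a nilpotent matrix kills the space one dimension per power,
# rather than immediately like the zero operator.
import numpy as np

N = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]], dtype=float)

# Ranks of N, N^2, N^3: each power crushes one more dimension.
ranks = [int(np.linalg.matrix_rank(np.linalg.matrix_power(N, k)))
         for k in range(1, 4)]
assert ranks == [2, 1, 0]  # N^3 = 0, i.e. T^n = 0 with n = 3
```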

Having said all this, it would be a sin now not to mention the Jordan decomposition. When studying a matrix $A$ you first find its eigenvalues and the corresponding generalized eigenspaces. So pick the generalized eigenspace corresponding to the eigenvalue $\lambda$. Then $A - \lambda$, restricted to this generalized eigenspace, is a nilpotent operator! If this nilpotent operator is zero, then the original matrix $A$ just stretches this eigenspace by $\lambda$. But in general it can perform a lot of nontrivial shuffling (corresponding to the nilpotent part).

Marek
  • I don't follow the beginning of your second paragraph. The identity operator (is there a more trivial one?) has equal eigenvalues and possesses none of the properties you speak of (namely because it *is* diagonalizable). Maybe it is a simple matter of how you've framed your answer, but it could be clarified/reorganized somewhat, I think. – cardinal Sep 10 '12 at 09:35
  • @cardinal: better now? – Marek Sep 10 '12 at 09:40
  • Quote: "First of all, this must mean that some of the matrix's eigenvalues occur more than once. Otherwise the matrix really can't do anything else than simply stretching its eigenvectors by λn." - This is assuming that the underlying field is algebraically closed, or at least that all roots of the characteristic or minimal polynomial are contained in the field. Otherwise there are matrices without any eigenvalues at all. E.g. $\begin{pmatrix}0&-1 \\ 1&0 \end{pmatrix}$ over the reals. – Oliver Braun Sep 10 '12 at 11:36
  • @Oliver: naturally. But it doesn't tell us as much about linear algebra as about the underlying field. I am also assuming that I am working over a field (as opposed to a general ring) of characteristic zero. Should I spell that out too? I thought this was all implicit (much as in the other answers...). – Marek Sep 10 '12 at 11:44

Diagonalization is shearing any source of origin: the x-directional and y-directional matrix elements of rows and columns from a source of, for example, radiation of light rays, into squeezing along the hypotenuse of a right-angled triangle's sides, resulting in an $a+ib$ module of shearing. This happens as a result of stretching. It really bends a light ray by its refractive index along the input and output planes. On this line of thought, this may play the role of light-ray invisibility by the Einstein gravity of bending, typically applicable in quantum mechanics: a sort of squeezing or shearing along the hypotenuse side. When all the elements are diagonally shifted the potential becomes zero, and not so when more than zero above the diagonalization of directional matrix elements. This may also be called a twisting along axial planes; as a function of the twisting angle, ray-transfer matrices pave the way for magnification (as divergence and convergence) as well as for invisible cloaking dynamics using laser beams.


The behavior of linear dynamical systems, both continuous and discrete, can be expressed in terms of the eigenvalues of the relevant matrix, and the expression (and especially the long-term behavior) has some added complications if the matrix is not diagonalizable.

Gerry Myerson
  • Thank you. But can you show me some simple examples in $\mathbb{R}^2$ that demonstrate how non-diagonalizable matrix can sometimes be more difficult to predict? – xzhu Sep 10 '12 at 05:26
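In the spirit of this request, here is a minimal sketch (my own example, not from the answer): two $2\times 2$ matrices with the same eigenvalue behave very differently under powers; the non-diagonalizable Jordan block picks up a polynomial factor $n\lambda^{n-1}$ on top of the pure exponential behavior.

```python
# Sketch: long-term behavior of a diagonalizable vs. a non-diagonalizable
# matrix sharing the eigenvalue lam = 0.9.
import numpy as np

lam = 0.9
D = np.array([[lam, 0.0],
              [0.0, lam]])   # diagonalizable: D^n = lam^n I
J = np.array([[lam, 1.0],
              [0.0, lam]])   # Jordan block: not diagonalizable

n = 50
Dn = np.linalg.matrix_power(D, n)
Jn = np.linalg.matrix_power(J, n)

# Diagonalizable case: entries are exactly lam^n.
assert np.allclose(Dn, lam**n * np.eye(2))

# Jordan block: the off-diagonal entry is n * lam^(n-1), a polynomial
# correction that makes the long-term behavior harder to read off.
assert np.isclose(Jn[0, 1], n * lam**(n - 1))
```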