
Though the title seems clear enough, I'd like to start with a discussion of how I personally came to derive the Jordan Normal Form, because my question is very specific to the details of my derivation.

Notation

To start, let $X$ be a finite dimensional vector space, $L(X)$ be the space of linear operators on $X$, and $A\in L(X)$. Let $\sigma(A) = \{\lambda_1,\ \cdots,\ \lambda_k\}$ be the spectrum of $A$. Now, we define

  • $d(\lambda)$ to be the geometric multiplicity of $\lambda$
  • $m(\lambda)$ to be the algebraic multiplicity of $\lambda$

Next, we denote the $k$th generalized eigenspace of $\lambda$ by $$ \text{N}_k(\lambda) = \text{Ker}(A-\lambda I)^k $$ and finally, we let $$ \text{N}(\lambda) = \text{N}_{n(\lambda)}(\lambda),\qquad n(\lambda)=\min\{k\in\mathbb{N}\ |\ \text{N}_k(\lambda)=\text{N}_{k+1}(\lambda)\}. $$ We note that it can be shown that $n(\lambda) = m(\lambda)$, and so the notation $n(\lambda)$ won't really be used.
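(For concreteness, here is a quick sympy sketch of how these kernels grow and stabilize; the single-block matrix below is just an example I picked for illustration, not part of the derivation:)

```python
import sympy as sp

# Illustrative single-block example: a 3x3 Jordan block with eigenvalue 2.
A = sp.Matrix([[2, 1, 0],
               [0, 2, 1],
               [0, 0, 2]])
lam = 2
n = A.shape[0]

dims = []
for k in range(1, n + 1):
    Bk = (A - lam * sp.eye(n)) ** k
    dims.append(n - Bk.rank())  # dim Ker(A - lam*I)^k = n - rank
print(dims)  # [1, 2, 3]: the kernels stabilize at k = 3, so n(2) = 3 here
```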

We will also let $\sum_\lambda$, $\prod_\lambda$, etc. represent the sum/product/etc. over distinct eigenvalues of $A$.

Fundamentals

First off, it is known that we can decompose $X$ as $$ X = \text{N}(\lambda_1)\oplus\cdots\oplus\text{N}(\lambda_k) $$ Hence $\sum_{\lambda} \dim\ \text{N}(\lambda) = \dim X$. Also, from the characteristic polynomial of $A$, the sum of the algebraic multiplicities of the eigenvalues must equal the degree of the polynomial, which is $\dim X$. Thus $$ \sum_\lambda\dim\ \text{N}(\lambda) = \sum_\lambda m(\lambda) = \dim X $$ Going in a different direction, we present the following theorem:
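(As a sanity check of this dimension count, here's a small sympy sketch on a made-up $4\times 4$ matrix with eigenvalues $1$ and $3$:)

```python
import sympy as sp

A = sp.Matrix([[1, 1, 0, 0],
               [0, 1, 0, 0],
               [0, 0, 3, 0],
               [0, 0, 0, 3]])
n = A.shape[0]

total = 0
for lam, m in A.eigenvals().items():  # eigenvalue -> algebraic multiplicity
    # dim N(lambda): the kernel of (A - lam*I)^n, since k = n always suffices
    dim_N = n - ((A - lam * sp.eye(n)) ** n).rank()
    print(lam, m, dim_N)  # dim N(lambda) agrees with m(lambda)
    total += dim_N
print(total == n)  # True: the generalized eigenspaces together span X
```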

Theorem: If $B\in L(X)$ is nilpotent of order $n$, and $S\subset X\backslash\text{Ker} B^{n-1}$ is linearly independent, then $$ \bigcup_{x\in S}\{x,\ Bx,\ B^2x,\ \cdots,\ B^{n-1}x\} $$ is linearly independent.

Proof: We will show the case $|S|=2$; the general case follows the same format. Suppose $S = \{x_1,\ x_2\}$, and $$ \sum_{k=0}^{n-1} a_k B^kx_1 + \sum_{k=0}^{n-1}b_k B^kx_2 = 0. $$ Applying $B^{n-1}$ to both sides gives $$ B^{n-1}\left(\sum_{k=0}^{n-1}a_kB^kx_1+b_kB^kx_2\right) = a_0B^{n-1}x_1+b_0B^{n-1}x_2 = B^{n-1}(a_0x_1+b_0x_2) = 0, $$ so $a_0x_1 + b_0x_2\in\text{Ker}B^{n-1}$. However, since $\text{Ker}B^{n-1}$ is a subspace of $X$, we can decompose $X$ as $X = \text{Ker}B^{n-1}\oplus Z$ for some vector space $Z$, for which $\{x_1,\ x_2\}\subset Z\backslash\{0\}$. Since $Z$ is a subspace, $a_0x_1+b_0x_2\in Z$. To say that $a_0x_1+b_0x_2\in \text{Ker}B^{n-1}\cap Z$ is equivalent to saying $a_0x_1+b_0x_2 = 0$. By linear independence of $S$, $a_0=b_0=0$. This process can be repeated to get $a_j=b_j=0$ for all $j$. $\blacksquare$

Now, take $x\in \text{N}(\lambda)\backslash \text{N}_{m(\lambda)-1}(\lambda)$. Note that $B_\lambda = (A - \lambda I)|_{\text{N}(\lambda)}$ (that is, $A - \lambda I$ restricted to $\text{N}(\lambda)$) is nilpotent of order $m(\lambda)$. Hence $\{x,\ B_\lambda x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\}$ is linearly independent, and its span is a subspace of $\text{N}(\lambda)$. Therefore $\dim \text{N}(\lambda) \ge m(\lambda)$.
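(A quick sympy check of this chain construction, with an order-$3$ nilpotent matrix chosen purely for illustration:)

```python
import sympy as sp

# B is nilpotent of order 3: B != 0, B^2 != 0, B^3 = 0.
B = sp.Matrix([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])
x = sp.Matrix([0, 0, 1])  # x lies outside Ker B^2, since B^2 x = e_1 != 0

chain = sp.Matrix.hstack(x, B * x, B**2 * x)
print(chain.rank())  # 3: the chain {x, Bx, B^2 x} is linearly independent
```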

If we suppose that $\dim\text{N}(\lambda) > m(\lambda)$ for at least one $\lambda\in\sigma(A)$, then we contradict the fact that $\sum_\lambda\dim\text{N}(\lambda) = \dim X$, and so we conclude that $m(\lambda) = \dim\text{N}(\lambda)$.

Alright, so far so good I hope...

Jordan Normal Form

By the above arguments, we conclude that $\text{Span}\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\} = \text{N}(\lambda)$. Hence, if we let $e_0(\lambda)\in \text{N}(\lambda)\backslash \text{N}_{m(\lambda)-1}(\lambda)$, and $e_k(\lambda)=(A-\lambda I)^k e_0(\lambda)$, then $$ \text{Span}\left(\bigcup_{\lambda}\bigcup_{k=0}^{m(\lambda)-1}\{e_k(\lambda)\}\right) = X $$

Since $X = \text{N}(\lambda_1)\oplus\cdots\oplus\text{N}(\lambda_k)$, and each $\text{N}(\lambda_i)$ is $A$-invariant (that is, $A(\text{N}(\lambda_i))\subseteq \text{N}(\lambda_i)$), it follows that if we have bases for each $\text{N}(\lambda_i)$, then we get the following matrix representation of $A$ wrt the union of these bases: $$ A = \left[\begin{matrix} A|_{\text{N}(\lambda_1)} & O & \cdots & O \\ O & A|_{\text{N}(\lambda_2)} & \cdots & O \\ \vdots & \vdots & \ddots & \vdots \\ O & O & \cdots & A|_{\text{N}(\lambda_k)} \end{matrix}\right] $$ where $A|_{\text{N}(\lambda_i)}$ is the matrix representation of $A$ restricted to $\text{N}(\lambda_i)$ wrt the basis of $\text{N}(\lambda_i)$.

Above, we demonstrated that $\{e_{m(\lambda)-1}(\lambda),\ \cdots,\ e_0(\lambda)\}$ is a basis for $\text{N}(\lambda)$. We can find a matrix representation for $A|_{\text{N}(\lambda)}$ by noting that $$ Ae_k(\lambda) = A(A-\lambda I)^ke_0(\lambda) = (A-\lambda I)^{k+1}e_0(\lambda) + \lambda(A-\lambda I)^ke_0(\lambda) \\ Ae_k(\lambda) = e_{k+1}(\lambda)+\lambda e_k(\lambda) \\ Ae_{m(\lambda)-1}(\lambda) = \lambda e_{m(\lambda)-1}(\lambda) $$ and so, wrt this basis, $$ A|_{\text{N}(\lambda)} = \left[\begin{matrix} \lambda & 1 & 0 & \cdots & 0 \\ 0 & \lambda & 1 & \cdots & 0 \\ 0 & 0 & \lambda & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & 1 \\ 0 & 0 & 0 & \cdots & \lambda \end{matrix}\right] $$
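(To double-check that computation, here's a sympy sketch; the matrix $A$ below is a made-up single-block example with eigenvalue $5$, and writing it in the chain basis $(e_2(\lambda),\ e_1(\lambda),\ e_0(\lambda))$ recovers the Jordan block:)

```python
import sympy as sp

# A made-up matrix that is similar to (but not literally equal to) a
# 3x3 Jordan block with eigenvalue 5.
lam, m = 5, 3
A = sp.Matrix([[5, 1, 0],
               [0, 5, 2],
               [0, 0, 5]])
B = A - lam * sp.eye(m)
e0 = sp.Matrix([0, 0, 1])               # e0 lies outside Ker B^2
chain = [B**k * e0 for k in range(m)]   # e_0, e_1, e_2
P = sp.Matrix.hstack(*reversed(chain))  # basis ordered e_2, e_1, e_0
print(P.inv() * A * P)  # 5 on the diagonal, 1 on the superdiagonal
```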

These $A|_{\text{N}(\lambda)}$ are the Jordan Blocks, and the matrix representation of $A$ above is the Jordan Normal Form.

Main Question

I'm pretty content with this derivation; nothing seems confusing, out of place, contradictory, or nonrigorous, at least at a surface level. I would not be asking this question if I hadn't gone to the Wikipedia page on the Jordan Normal Form and seen this line:

The number of Jordan blocks corresponding to $\lambda$ of size at least $j$ is $\dim \text{Ker}(A - \lambda I)^j - \dim \text{Ker}(A - \lambda I)^{j-1}$.
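(For concreteness, that count is easy to test numerically; here's a sympy sketch on a made-up matrix with two blocks, of sizes $2$ and $1$, for the single eigenvalue $7$:)

```python
import sympy as sp

# Two Jordan blocks for eigenvalue 7: sizes 2 and 1.
A = sp.Matrix([[7, 1, 0],
               [0, 7, 0],
               [0, 0, 7]])
lam, n = 7, 3

def dim_ker(j):
    # dim Ker(A - lam*I)^j, via rank-nullity
    return n - ((A - lam * sp.eye(n)) ** j).rank()

for j in range(1, n + 1):
    print(j, dim_ker(j) - dim_ker(j - 1))
# prints: 1 2, 2 1, 3 0 -- two blocks of size >= 1, one of size >= 2,
# none of size >= 3, i.e. block sizes 2 and 1
```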

My "derivation" doesn't account for the fact that there can be multiple Jordan Blocks corresponding to the same eigenvalue. So, in the broadest sense possible, why? What don't I account for?

My idea was that I "assumed" that $\text{Span}\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\} = \text{N}(\lambda)$. If there are more elements in the basis for $\text{N}(\lambda)$ than this, then there are more Jordan blocks. But if $\dim\text{N}(\lambda) > m(\lambda)$, then the decomposition of $X$ into the direct sum of generalized eigenspaces fails, since the dimensions don't add up. My only other guess is that $\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\}$ can be "broken down" in some sense into the union of smaller bases which then produce more Jordan blocks, but I can't quite see where to go with that.

Any help would be appreciated. Thank you for your time!

  • I suggest you run through your "proof" in the case of the zero transformation (or the identity if you prefer). You'll then see where you've lost the smaller, possibly repeated, blocks. – ancient mathematician Apr 18 '17 at 15:45
  • @ancientmathematician I think I see it: is it because $B_\lambda$ can have a nilpotency order less than $m(\lambda)$, which means $\{x,\ \cdots,\ B_\lambda^{m(\lambda)-1}x\}$ contains $0$ and so can't be linearly independent? – user3002473 Apr 18 '17 at 16:18
  • Yes, you've got the difficulty. I usually do an induction once I get one Jordan block. – ancient mathematician Apr 18 '17 at 17:19

1 Answer


There are several mistakes in what you wrote, but the critical one is the claim that $m(\lambda) = n(\lambda)$. It is not true that the index at which $\ker (A - \lambda I)^k$ stabilizes is the algebraic multiplicity of $\lambda$. For example, consider the nilpotent matrix

$$ A = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}. $$

The characteristic polynomial of $A$ is $x^3$, so $m(0) = 3$, while $A^2 = 0$, so $n(0) = 2$. This is the phenomenon which causes the appearance of several Jordan blocks: if you pick $x \in \mathbb{R}^3 \setminus \ker(A)$ (for example $x = e_2$), then $\{ x, Ax \}$ will be linearly independent, but $A^2x = 0$, so you don't have enough vectors to form a basis and you'll need to adjoin another block.
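(If you want to see this concretely, here's a quick sympy check of the example above; just an illustration:)

```python
import sympy as sp

A = sp.Matrix([[0, 1, 0],
               [0, 0, 0],
               [0, 0, 0]])
x = sp.symbols('x')
print(A.charpoly(x).as_expr())  # x**3, so m(0) = 3
print((A**2).is_zero_matrix)    # True, so n(0) = 2
P, J = A.jordan_form()
print(J)  # eigenvalue 0 with one 2x2 block and one 1x1 block
```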

  • That makes sense, thanks for the response! Care to point out the other mistakes I made? If you have time, of course. – user3002473 Apr 18 '17 at 16:34
  • One thing I noticed is that the theorem you quote is wrong as stated (but a variant of it is true). Take $B$ to be $A$ above and let $S = \{ e_2, e_1 + e_2 \}$ (where $e_i$ are the standard basis vectors). Then $S$ is linearly independent but $\{ e_2, B(e_2), e_1 + e_2, B(e_1 + e_2) \} = \{ e_2, e_1, e_1 + e_2, e_1 \}$ is not linearly independent. It is instructive to see where your proof of the theorem fails for this example. – levap Apr 18 '17 at 16:46
  • Also, when working on the Jordan theorem, it is really helpful to break it into smaller pieces - it significantly reduces the notational clutter and makes it easier to identify mistakes. The proof of the Jordan normal form has two parts which can be completely decoupled: the first part is $X = N(\lambda_1) \oplus \dots \oplus N(\lambda_k)$ in your notation. Once this is done, you can completely forget about it and concentrate on the case where $A$ is a nilpotent operator on $X$. – levap Apr 18 '17 at 16:51