Why are the solutions of polynomial equations so unconstrained over the quaternions?

Question

An $n$th-degree polynomial has at most $n$ distinct zeroes in the complex numbers. But it may have an uncountable set of zeroes in the quaternions. For example, $x^2+1$ has two zeroes in $\mathbb C$, but in $\mathbb H$, ${\bf i}\cos t + {\bf j}\sin t$ is a distinct zero of this polynomial for every $t$ in $[0, 2\pi)$, and obviously there are many other zeroes.

What is it about $\mathbb H$ that makes its behavior in this regard so different from the behavior of $\mathbb R$ and $\mathbb C$? Is it simply because $\mathbb H$ is four-dimensional rather than two-dimensional? Are there any theorems that say when a ring will behave like $\mathbb H$ and when it will behave like $\mathbb C$?

Do all polynomials behave like this in $\mathbb H$? Or is this one unusual?

[Somewhat related question](http://math.stackexchange.com/questions/68905/square-roots-of-1-in-quaternion-ring). — MJD, Mar 21 '12 at 14:46
Although there are infinitely many roots of $x^2+1$ in the quaternions, if you consider roots up to conjugation then you recover a finiteness theorem: all the roots form a single conjugacy class. (Over a field, conjugation is trivial.) In fact, with coefficients on the left in any division ring, a polynomial of degree $n$ has at most $n$ conjugacy classes of roots. — KCd, Mar 22 '12 at 01:29
http://www.math.psu.edu/ballif/assignments/Math%20597%20Graduate%20Seminar/Polynomials_and_Quaternions.pdf summarizes the basic results — Ted, Mar 22 '12 at 02:43
In fact $a{\bf i} + b{\bf j} + c{\bf k}$ is a zero of $x^2+1$ if and only if $a^2+b^2+c^2 = 1$. — MJD, May 23 '13 at 18:41
Some polynomials over the quaternions have no roots. For example, $iq - qi - 1$. — Paul Hurst, Apr 08 '15 at 02:51
@Ted: Unfortunately, your link is broken now. Can you fix it or give a similar reference? — Torsten Schoeneberg, Jan 02 '19 at 21:10
@TorstenSchoeneberg Looks like that content is now at https://mycourses.aalto.fi/pluginfile.php/164783/mod_folder/content/0/MS-C1080_polynomials_over_quaternions.pdf . Also see the paper https://core.ac.uk/download/pdf/82416336.pdf and its references. — Ted, Jan 03 '19 at 04:24
Tessarines are 4-dimensional, but have at most $n^2$ roots, so, this is not because of 4-dimensionality. — Anixx, Apr 28 '21 at 12:17

score 123 · Accepted Answer · edited Apr 13 '17 at 12:20

When I was first learning abstract algebra, the professor gave the usual sequence of results for polynomials over a field: the Division Algorithm, the Remainder Theorem, and the Factor Theorem, followed by the Corollary that if $D$ is an integral domain, and $E$ is any integral domain that contains $D$, then a polynomial of degree $n$ with coefficients in $D$ has at most $n$ distinct roots in $E$.

He then challenged us, as homework, to go over the proof of the Factor Theorem and to point out exactly which axioms of a field are used in the proof, and where and how they are used.

Every single one of us missed the fact that commutativity is used.

Here's the issue: the division algorithm (on either side) does hold in $\mathbb{H}[x]$ (in fact, it holds over any ring, commutative or not, provided the leading coefficient of the divisor is a unit). So given a polynomial $p(x)$ with coefficients in $\mathbb{H}$, and a nonzero $a(x)\in\mathbb{H}[x]$, there exist unique $q(x)$ and $r(x)$ in $\mathbb{H}[x]$ such that $p(x) = q(x)a(x) + r(x)$, and $r(x)=0$ or $\deg(r)\lt\deg(a)$. (There also exist unique $q'(x)$ and $s(x)$ such that $p(x) = a(x)q'(x) + s(x)$ and $s(x)=0$ or $\deg(s)\lt\deg(a)$.)

The usual argument runs as follows: given $a\in\mathbb{H}$ and $p(x)$, divide $p(x)$ by $x-a$ to get $p(x) = q(x)(x-a) + r$, with $r$ constant. Evaluating at $a$ we get $p(a) = q(a)(a-a)+r = r$, so $r=p(a)$. Hence $a$ is a root if and only if $(x-a)$ divides $p(x)$.
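
The division step above can be tried out numerically. What follows is only a minimal sketch, assuming quaternions are stored as `(w, x, y, z)` tuples standing for $w + x{\bf i} + y{\bf j} + z{\bf k}$; the helper names `qadd`, `qmul` and `divide_by_linear` are made up for this illustration. Comparing coefficients in $p(x) = q(x)(x-a) + r$ gives the Horner-style recurrence $q_{k-1} = p_k + q_k a$, with the multiplication by $a$ on the right.

```python
# Quaternions are (w, x, y, z) tuples meaning w + x*i + y*j + z*k (an assumed encoding).
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(s + t for s, t in zip(p, q))


def qmul(p, q):
    """Hamilton product p*q; the order of the factors matters."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def divide_by_linear(coeffs, a):
    """Divide p(x) = sum(coeffs[k] * x**k) by (x - a), divisor on the right.

    Returns (quotient_coeffs, r) with p(x) = q(x)(x - a) + r in H[x].
    Assumes deg p >= 1. Uses the recurrence q[k-1] = coeffs[k] + q[k]*a.
    """
    quotient = [ZERO] * (len(coeffs) - 1)
    carry = coeffs[-1]
    for k in range(len(coeffs) - 1, 0, -1):
        quotient[k - 1] = carry
        carry = qadd(coeffs[k - 1], qmul(carry, a))
    return quotient, carry


x2_plus_1 = [ONE, ZERO, ONE]            # 1 + 0*x + 1*x^2
print(divide_by_linear(x2_plus_1, I))   # quotient x + i, remainder 0
print(divide_by_linear(x2_plus_1, J))   # quotient x + j, remainder 0
```

In this sketch the remainder is the value $\sum_k p_k a^k$, so $x^2+1$ is divisible both by $x-{\bf i}$ and by $x-{\bf j}$.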

If $b$ is a root of $p(x)$, $b\neq a$, then evaluating at $b$ we have $0=p(b) = q(b)(b-a)$; since $b-a\neq 0$, then $q(b)=0$, so $b$ must be a root of $q$; since $\deg(q)=\deg(p)-1$, an inductive hypothesis tells us that $q(x)$ has at most $\deg(p)-1$ distinct roots, so $p$ has at most $\deg(p)$ roots.

And that is where we are using commutativity: to go from $p(x) = q(x)(x-a)$ to $p(b) = q(b)(b-a)$.

Let $R$ be a ring, and let $a\in R$. Then $a$ induces a set-theoretic map "evaluation at $a$", $\varepsilon_a\colon R[x]\to R$, given by $$\varepsilon_a(b_0+b_1x+\cdots + b_nx^n) = b_0 + b_1a + \cdots + b_na^n.$$ This map is a homomorphism of additive groups, and if $a$ is central, also a ring homomorphism; if $a$ is not central, then it is not a ring homomorphism: given $b\in R$ such that $ab\neq ba$, we have $bx = xb$ in $R[x]$, but $\varepsilon_a(x)\varepsilon_a(b) = ab\neq ba = \varepsilon_a(xb)$.
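
The failure of multiplicativity can be checked on the smallest possible case, $a = {\bf i}$ and $b = {\bf j}$. This is a sketch under assumptions: quaternions are `(w, x, y, z)` tuples, polynomials are coefficient lists with the constant term first, and `poly_mul` and `evaluate` are names invented here. The product in $\mathbb{H}[x]$ treats $x$ as commuting with all coefficients, and evaluation keeps coefficients on the left.

```python
# Quaternions are (w, x, y, z) tuples meaning w + x*i + y*j + z*k (an assumed encoding).
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(s + t for s, t in zip(p, q))


def qmul(p, q):
    """Hamilton product p*q; the order of the factors matters."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def poly_mul(f, g):
    """Product in H[x]: x commutes with coefficients, coefficients multiply in order."""
    out = [ZERO] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for n, gn in enumerate(g):
            out[m + n] = qadd(out[m + n], qmul(fm, gn))
    return out


def evaluate(f, a):
    """epsilon_a(f) = sum f[k] * a**k, computed by Horner with a on the right."""
    acc = ZERO
    for c in reversed(f):
        acc = qadd(qmul(acc, a), c)
    return acc


f = [ZERO, ONE]   # the polynomial x
g = [J]           # the constant polynomial j

print(evaluate(poly_mul(f, g), I))            # value of the product xj = jx at i
print(qmul(evaluate(f, I), evaluate(g, I)))   # product of the two values
```

The first line printed is ${\bf j}{\bf i} = -{\bf k}$ and the second is ${\bf i}{\bf j} = {\bf k}$. Replacing `I` by a real quaternion such as `(2, 0, 0, 0)` makes the two agree, as expected for a central $a$.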

The "evaluation" map also induces a set-theoretic map from $R[x]$ to $R^R$, the ring of all $R$-valued functions on $R$, with pointwise addition and multiplication ($(f+g)(a) = f(a)+g(a)$, $(fg)(a) = f(a)g(a)$); the map sends $p(x)$ to the function $\mathfrak{p}\colon R\to R$ given by $\mathfrak{p}(a) = \varepsilon_a(p(x))$. This map is a group homomorphism, but it is not a ring homomorphism unless $R$ is commutative.

This means that from $p(x) = q(x)(x-a) + r(x)$ we cannot in general conclude that $p(c) = q(c)(c-a) +r(c)$ unless $c$ commutes in $R$ with $a$. ~~So the Remainder Theorem may fail to hold (if the coefficients involved do not commute with $a$ in $R$), which in turn means that the Factor Theorem may fail to hold~~ So one has to be careful in the statements (see Marc van Leeuwen's answer). And even when both of them hold for the particular $a$ in question, the inductive argument will fail if $b$ does not commute with $a$, because we cannot go from $p(x) = q(x)(x-a)$ to $p(b)=q(b)(b-a)$.

This is exactly what happens with, say, $p(x) = x^2+1$ in $\mathbb{H}[x]$. We are fine as far as showing that, say, $x-i$ is a factor of $p(x)$, because it so happens that when we divide by $x-i$, all coefficients involved centralize $i$ (we just get $(x+i)(x-i)$). But when we try to argue that any root different from $i$ must be a root of $x+i$, we run into the problem that we cannot guarantee that $b^2+1$ equals $(b+i)(b-i)$ unless we know that $b$ centralizes $i$. As it happens, the centralizer of $i$ in $\mathbb{H}$ is $\mathbb{R}[i]$, so we only conclude that the only other complex root is $-i$. But this leaves the possibility open that there may be some roots of $x^2+1$ that do not centralize $i$, and that is exactly what occurs: $j$, and $k$, and all numbers of the form $ai+bj+ck$ with $a^2+b^2+c^2=1$ are roots, and if either $b$ or $c$ is nonzero, then they don't centralize $i$, so we cannot go from $x^2+1 = (x+i)(x-i)$ to "$(ai+bj+ck)^2+1 = (ai+bj+ck+i)(ai+bj+ck-i)$".
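
The gap between $b^2+1$ and $(b+i)(b-i)$ is easy to see with exact arithmetic. The sketch below assumes quaternions are `(w, x, y, z)` tuples and uses `fractions.Fraction` for a rational point on the unit sphere; `square_plus_one` and `factored_form` are illustrative names, not anything from the answer.

```python
from fractions import Fraction

# Quaternions are (w, x, y, z) tuples meaning w + x*i + y*j + z*k (an assumed encoding).
ZERO, ONE = (0, 0, 0, 0), (1, 0, 0, 0)
I, J = (0, 1, 0, 0), (0, 0, 1, 0)
MINUS_I = (0, -1, 0, 0)


def qadd(p, q):
    """Componentwise sum of two quaternions."""
    return tuple(s + t for s, t in zip(p, q))


def qmul(p, q):
    """Hamilton product p*q; the order of the factors matters."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def square_plus_one(b):
    """The value of x^2 + 1 at b."""
    return qadd(qmul(b, b), ONE)


def factored_form(b):
    """(b + i)(b - i): what the value would be if evaluation respected products."""
    return qmul(qadd(b, I), qadd(b, MINUS_I))


# Pure quaternions a*i + b*j + c*k with a^2 + b^2 + c^2 = 1.
samples = [I, MINUS_I, J, (0, Fraction(2, 3), Fraction(2, 3), Fraction(1, 3))]
for b in samples:
    print(b, square_plus_one(b), factored_form(b))
```

Every sample gives `square_plus_one(b)` equal to zero, but `factored_form(b)` vanishes only for $\pm i$; for $b = j$ it equals $2k$.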

And that is what goes wrong, and that is where commutativity is hiding.

+1 That exercise is a must for people interested in non-commutative division algebras. It is a bit too easy to fall into the trap that the usual commutative argument would work here as well. — Jyrki Lahtonen, Mar 21 '12 at 17:36
Nice elaborate answer. Just some small nitpicks. To conclude $p(c) = q(c)(c-a) +r(c)$ one only needs that $a$ and $c$ commute (the coefficients of $q$ remain at the left, and the coefficient of $r$, which is constant, is not multiplied at all). And the problem is not that the Remainder and Factor Theorems fail without commutativity (they can be salvaged, see my answer), but that the factorization involved does not survive evaluation. — Marc van Leeuwen, Mar 23 '12 at 10:46

score 32 · Answer 2 · answered Mar 21 '12 at 15:12

The finiteness of the number of roots of a polynomial $f(x)\in K[x]$ where $K$ is a field depends on two interlaced facts:

- $K[x]$ is a Unique Factorization Domain: every polynomial $f(x)$ factors in an essentially unique way as a product of irreducibles;
- if $f(\alpha)=0$ then $f(x)=(x-\alpha)g(x)$ where $\deg g(x)=(\deg f(x))-1$.

The combination of these two facts (the first one in particular) does not hold anymore if you think of the polynomial $f(x)$ as a polynomial with coefficients in the ring $\Bbb H$ of Hamilton quaternions. This is because the latter is not commutative.

You may also ponder this fact: in a commutative environment the transformation $a\mapsto\phi_h(a)=hah^{-1}$ (conjugation) is always trivial. Not so in $\Bbb H$, again as a side effect of non-commutativity. The point is that if an element $a$ satisfies a certain algebraic relation with real coefficients (such as $a^2=1$), so will all its conjugates $\phi_h(a)$.
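
As an illustration of the conjugation remark, one can conjugate $i$ by a few quaternions $h$ and check that each conjugate still satisfies the real relation $x^2 = -1$. This is a sketch under assumptions: quaternions are `(w, x, y, z)` tuples, the chosen `h` values have integer components so that `fractions.Fraction` keeps everything exact, and `qinv` and `conjugate_by` are names introduced only here.

```python
from fractions import Fraction

# Quaternions are (w, x, y, z) tuples meaning w + x*i + y*j + z*k (an assumed encoding).
I = (0, 1, 0, 0)
MINUS_ONE = (-1, 0, 0, 0)


def qmul(p, q):
    """Hamilton product p*q; the order of the factors matters."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qinv(h):
    """Inverse of a nonzero quaternion with integer components: conj(h) / |h|^2."""
    w, x, y, z = h
    n = w * w + x * x + y * y + z * z
    return (Fraction(w, n), Fraction(-x, n), Fraction(-y, n), Fraction(-z, n))


def conjugate_by(h, a):
    """phi_h(a) = h * a * h^{-1}."""
    return qmul(qmul(h, a), qinv(h))


hs = [(1, 1, 1, 1), (1, 2, 0, 0), (2, 0, 1, -3)]
for h in hs:
    b = conjugate_by(h, I)
    print(h, b, qmul(b, b) == MINUS_ONE)
```

Here $h = 1+2i$ commutes with $i$ and fixes it, while the other two choices move $i$ to a different root of $x^2+1$.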

Many true observations. I have just felt, like Arturo, that the key difference between commutative and non-commutative cases is that the evaluation map $ev_\alpha:f\mapsto f(\alpha)$ is no longer a ring homomorphism from $K[x]$ to $K$. As you explain, this shows also in the non-unique factorization, but the more fundamental problem is that factorizations cannot be used to determine zeros, because the value of a product of polynomials is **not** the product of values of the polynomial factors. — Jyrki Lahtonen, Mar 21 '12 at 17:42
+1: the point about conjugation above is _extremely_ important. — Qiaochu Yuan, Mar 21 '12 at 18:05

Qiaochu Yuan · Answer 3 · 2012-03-21T18:02:15.490

I would like to emphasize a point which is made in Arturo Magidin's answer but perhaps in different words: if $D$ is a noncommutative division ring, then the ring $D[x]$ of polynomials over $D$ does not do what you want it to do.

If $F$ is a field, then one reason you might care about working with polynomials $F[x]$ is that they describe all the expressions you could potentially get from some unknown $x \in F$ (or perhaps $x \in \bar{F}$ or perhaps something even more general than this) via addition and multiplication.

Why does this break down when you replace $F$ with a noncommutative division ring $D$? The problem is that if you work with some unknown $x \in D$ (or in some ring containing $D$) then $x$, by assumption, doesn't necessarily commute with every element in $D$, so starting from $x$ and adding and multiplying you get not only expressions like $$a_0 + a_1 x + a_2 x^2 + ...$$

but more complicated expressions like $$a_0 + a_{1,0} x + x a_{1,1} + a_{1, 2} x a_{1, 3} + a_{2,0} x^2 + x a_{2,1} x + x^2 a_{2,2} + a_{2, 3} x^2 a_{2,4} + a_{2, 5} x a_{2, 6} x a_{2,7} + ... $$

The resulting algebraic structure is quite a bit more complicated than $D[x]$. Already you can't in general combine expressions of the form $axb$ and $cxd$, so even to describe the expressions you can get by using $x$ once I should've actually written $$a_0 + a_{1,0} x a_{1,1} + a_{1,2} x a_{1,3} + a_{1,4} x a_{1,5} + ....$$
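
A small numerical check may make this concrete. The expression $ix - xi$ uses $x$ only once, yet it cannot be rewritten as $cx$ for a single coefficient $c$: it vanishes at $x = 1$, which would force $c = 0$, but it does not vanish at $x = j$. The sketch assumes quaternions are `(w, x, y, z)` tuples, and `commutator_with_i` is a name invented for the example.

```python
# Quaternions are (w, x, y, z) tuples meaning w + x*i + y*j + z*k (an assumed encoding).
ONE = (1, 0, 0, 0)
I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def qsub(p, q):
    """Componentwise difference of two quaternions."""
    return tuple(s - t for s, t in zip(p, q))


def qmul(p, q):
    """Hamilton product p*q; the order of the factors matters."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def commutator_with_i(x):
    """The one-variable expression i*x - x*i, which is not of the form c*x."""
    return qsub(qmul(I, x), qmul(x, I))


print(commutator_with_i(ONE))   # zero, so a single coefficient c would have to be 0
print(commutator_with_i(J))     # equals 2k, so the expression is not identically 0

# The real part is 0 for every sample tried here.
samples = [ONE, I, J, K, (3, -1, 4, 2), (0, 5, -2, 7)]
print([commutator_with_i(x)[0] for x in samples])
```

Since $ix$ and $xi$ always have the same real part, $ix - xi$ has real part $0$, so an expression such as $ix - xi - 1$ never takes the value $0$.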

That is, the free object is no longer $D[x]$, but $D\langle x\rangle$, the ring of polynomials in a noncommuting $x$. +1 — Arturo Magidin, Mar 21 '12 at 18:01
Just one remark: whether polynomials in $R[x]$ do what you want when you use $x$ to stand for some unknown value (maybe outside the ring $R$) depends not so much on whether the elements of $R$ commute _among each other_ as on whether _elements of $R$ commute with what you want $x$ to stand for_ (because in $R[x]$ they commute by definition). So for instance it is fine to use polynomials in $\mathbf R[x]$ with $x$ standing for some unknown quaternion (no worse than standing for a matrix), but it's _not possible_ to do so for polynomials in $\mathbf C[x]$, even though $\mathbf C$ is commutative! — Marc van Leeuwen, Mar 15 '13 at 12:33

Marc van Leeuwen · Answer 4 · 2012-05-07T06:46:33.150

I'd like to give a complement to the answers already given, since some of them suggest a relation with more advanced arithmetic topics like Unique Factorization, while this is really based on elementary ring theory only. Notably one has the following

Theorem. Let $R$ be a commutative domain, and $P\in R[X]$ a nonzero polynomial of degree $d$. Then $P$ has at most $d$ roots in $R$.

Normally a commutative domain is called an integral domain (note the curious meaning of "integral"), but I've used "commutative" here to stress the two key properties assumed: commutativity and the absence of zero divisors. (I do assume rings to have an element $1$, by the way.) In the absence of commutativity, even introducing the notion of roots of $P$ is problematic—unless $P$ has its coefficients in the center of $R$ (as is the case in your quaternion example)—as Qiaochu Yuan points out. Indeed for evaluation of $P$ in $a\in R$ one must decide whether to write the powers of $a$ to the right or the left of the coefficients of $P$, giving rise to distinct notions of right- and left-evaluation, and hence of right- and left-roots (and neither form of evaluation is a ring morphism). But even for the case that $P$ has its coefficients in the center $Z(R)$ of $R$, so that right- and left-evaluation in $a$ coincide and define a ring morphism $Z(R)[X]\to R$, the conclusion of the theorem is not valid, as this question illustrates.

The proof of the theorem is based on the following simple

Lemma. Let $R$ be a commutative domain, $P\in R[X]$, and $r\in R$ a root of $P$. Then there exists a (unique) $Q\in R[X]$ with $P=(X-r)Q$, and every root of $P$ other than $r$ is a root of $Q$.

The existence and uniqueness of $Q$ do not depend on $R$ being commutative or a domain: for any ring $R$ one has $P=(X-r)Q$ if and only if $Q$ is the quotient of $P$ by euclidean left-division by $X-r$ and the remainder is $0$, and the latter happens if and only if $r$ is a left-root of $P$ (so properly stated the Factor Theorem does hold for general rings!). But the final part of the lemma does use both commutativity and the absence of zero divisors: one uses that evaluation of $(X-r)Q$ in some root $r'\neq r$ of $P$ can be done separately in the two factors (this requires commutativity: without it evaluation is not a ring morphism), and then one needs to conclude that one of the factors (necessarily the second) becomes $0$, which requires the absence of zero divisors. Note that for noncommutative $R$, even if $P$ should be in $Z(R)[X]$, the first of these steps fails, since evaluation is only a morphism $Z(R)[X]\to R$, and the factors $X-r$ and $Q$ need not lie in $Z(R)[X]$.

The lemma of course implies the theorem by a straightforward induction on $\deg P$.

One final remark unrelated to the question: since the morphism property of evaluation $Z(R)[X]\to R$ does not help us here, one might wonder what is the point of considering evaluation at all in the absence of commutativity. However note that in linear algebra we teach our students to fearlessly substitute a matrix into polynomials, and to (implicitly) use the morphism property of such evaluation maps $K[X]\to M_n(K)$ where $K$ is a (commutative!) field. This works precisely because $K$ can be identified with $Z(M_n(K))$ (the subring of homotheties, that is, scalar matrices).
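
The matrix case can be checked by hand or with a few lines of code. The sketch below is only an illustration under assumptions: polynomials over the integers are coefficient lists with the constant term first, matrices are nested lists, and `mat_mul`, `poly_mul` and `evaluate` are hypothetical helper names. It substitutes a $2\times 2$ matrix $A$ with $A^2 = -I$ into $f = X-1$, $g = X-2$ and their product.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[r][t] * B[t][c] for t in range(n)) for c in range(n)]
            for r in range(n)]


def mat_add(A, B):
    """Entrywise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]


def scalar(c, n):
    """The scalar matrix c * I_n, i.e. the image of c in the center of M_n(K)."""
    return [[c if r == s else 0 for s in range(n)] for r in range(n)]


def poly_mul(f, g):
    """Product of two polynomials with commuting scalar coefficients."""
    out = [0] * (len(f) + len(g) - 1)
    for m, fm in enumerate(f):
        for k, gk in enumerate(g):
            out[m + k] += fm * gk
    return out


def evaluate(f, A):
    """Substitute the matrix A into f by Horner's rule."""
    n = len(A)
    acc = scalar(0, n)
    for c in reversed(f):
        acc = mat_add(mat_mul(acc, A), scalar(c, n))
    return acc


A = [[0, 1], [-1, 0]]   # satisfies A^2 = -I
f = [-1, 1]             # X - 1
g = [-2, 1]             # X - 2

print(evaluate(poly_mul(f, g), A))               # (fg)(A)
print(mat_mul(evaluate(f, A), evaluate(g, A)))   # f(A) g(A)
```

Both lines print the same matrix, $I - 3A$, because the scalar coefficients commute with $A$.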

score 2 · Answer 5 · answered Mar 21 '12 at 14:48

This has to do with the fact that $\mathbb H$ is not a field, since the number of zeroes of a polynomial over a field is always bounded by the degree of the polynomial.

Johannes Kloos

score 1 · Answer 6 · edited Aug 15 '21 at 16:12

If we have the equation $x^2+1=0$, then it is enough to consider the subspace $x=(x_1,x_2,0,0)$, since we know that there is a solution in $\mathbb{C} \subset \mathbb{H}$. The additional structure, i.e. when $x_3, x_4 \neq 0$, creates infinitely many solutions.

And speaking about additional structure: the conjugate of a real number is the number itself. Thus, the equation $x\overline{x}-1=0$ has only two solutions in $\mathbb{R} \subset \mathbb{C}$, but there are infinitely many in $\mathbb{C}$.
