
It's commonly stated that the roots of a polynomial are a continuous function of the coefficients. How is this statement formalized? I would assume it's by restricting to polynomials of a fixed degree $n$ (maybe monic? it seems like that shouldn't matter), and considering the collection of roots as a point in $F^n/\sim$, where $F$ is the field and $\sim$ is permutation of coordinates, but is there something I'm missing? More to the point, where would I find a proof?

At least, I've seen this stated for $\mathbb C$ (and hence $\mathbb R$); is this even true in general -- say, for an algebraically closed valued field (and hence for a complete non-Archimedean field, since the valuation extends uniquely to its algebraic closure)? I've seen it implied that it's not always true in the non-Archimedean case; is this correct? What's a counterexample? (If this is wrong and it is true in this generality, is it true in any greater generality?)

Harry Altman
  • It's not actually true for $\mathbb R$: the roots of $x^2+a$ disappear in a dramatically discontinuous manner when $a$ passes through $0$ from below. (It should be true if you restrict attention to polynomials without multiple roots, though.) – hmakholm left over Monica Sep 09 '11 at 21:08
  • Well, I'm assuming that if the field isn't algebraically closed, but the algebraic closure has a natural value/topology extending it (as is the case with **R** or a complete non-Archimedean field), then we look at the roots there; of course then if it's true for the larger field, it's true for the smaller one! So in that sense I'm only asking about the algebraically closed case. – Harry Altman Sep 09 '11 at 21:22
  • [Wilkinson's polynomial](http://en.wikipedia.org/wiki/Wilkinson%27s_polynomial) suggests that continuity may not prevent instability/ill-conditioning. – Henry Sep 10 '11 at 17:23
  • A proof for $\mathbb{C}$ can be found here: http://www.ams.org/journals/proc/1987-100-02/S0002-9939-1987-0884486-8/S0002-9939-1987-0884486-8.pdf – Jyotirmoy Bhattacharya Jan 11 '14 at 16:08

8 Answers

76

Here is a version of continuity of the roots.
Consider the monic complex polynomial $f(z)=z^n+c_1z^{n-1}+...+c_n\in \mathbb C[z]$ and factor it as $$f(z)=(z-a_1)...(z-a_n) \quad (a_k\in \mathbb C)$$ where the roots $a_k$ are arranged in some order, and of course needn't be distinct.
Then for every $\epsilon \gt 0$, there exists $\delta \gt 0$ such that every polynomial $ g(z) =z^n+d_1z^{n-1}+...+d_n\in \mathbb C[z]$ satisfying $|d_k-c_k|\lt \delta \quad (k=1,...,n)$ can be written $$g(z)=(z-b_1)...(z-b_n) \quad (b_k\in \mathbb C)$$
with $|b_k-a_k|\lt \epsilon \quad (k=1,...,n)$.
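This $\epsilon$-$\delta$ statement can be illustrated numerically (a sanity check using NumPy, not part of the proof; the example polynomial $z^3-1$ and the size of the perturbation are my choices):

```python
import numpy as np

rng = np.random.default_rng(0)

c = np.array([1.0, 0.0, 0.0, -1.0])   # f(z) = z^3 - 1, i.e. (c_1, c_2, c_3) = (0, 0, -1)
a = np.roots(c)                       # the roots a_1, ..., a_n (here the cube roots of 1)

delta = 1e-8
d = c + np.concatenate(([0.0], delta * rng.standard_normal(3)))  # |d_k - c_k| on the order of delta
b = np.roots(d)

# Each perturbed root b_k should lie within a small epsilon of some original root a_k.
eps = max(min(abs(bk - ak) for ak in a) for bk in b)
print(eps)
assert eps < 1e-6
```

Since the roots of $z^3-1$ are simple, a coefficient perturbation of size $10^{-8}$ moves each root by an amount of the same order; near a multiple root the dependence is only Hölder-continuous, not Lipschitz.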

A more geometric version is to consider the Viète map $v:\mathbb C^n \to \mathbb C^n $ sending, in the notation above, $(a_1,...,a_n)$ to $(c_1,...,c_n)$ (identified with $z^n+c_1z^{n-1}+...+c_n=(z-a_1)...(z-a_n)$ ).
It is a polynomial map (and so certainly continuous!) since $c_k=(-1)^{k} s_k( a_1,...,a_n)$, where $s_k$ is the $k$-th symmetric polynomial in $n$ variables.
There is an obvious action of the symmetric group $S_n$ on $\mathbb C^n$ and the theorem of continuity of the roots states that the Viète map descends to a homeomorphism $w: \mathbb C^n / S_n \to \mathbb C^n$. It is trivial (by the definition of quotient topology) that $w$ is a bijective continuous mapping, but continuity of the inverse is the difficult part.
The difficulty is concentrated at those points $(c_1,...,c_n)$ corresponding to polynomials $z^n+c_1z^{n-1}+...+c_n$ having multiple roots.
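As a quick check of the formula $c_k=(-1)^{k} s_k(a_1,...,a_n)$, here is a numerical sketch (using NumPy's `np.poly`, which expands $(z-a_1)\cdots(z-a_n)$; the roots $2,-1,3$ are an arbitrary example):

```python
import numpy as np
from itertools import combinations
from math import prod

a = [2.0, -1.0, 3.0]                      # chosen roots a_1, a_2, a_3
coeffs = np.poly(a)                       # [1, c_1, c_2, c_3]: expand (z-a_1)(z-a_2)(z-a_3)

# c_k = (-1)^k s_k(a_1, ..., a_n), with s_k the k-th elementary symmetric polynomial
s = [sum(prod(t) for t in combinations(a, k)) for k in range(1, 4)]
expected = [(-1) ** k * s[k - 1] for k in range(1, 4)]

print(coeffs[1:])                         # equals expected = [-4, 1, 6]
assert np.allclose(coeffs[1:], expected)
```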

This, and much more, is proved in Whitney's Complex Analytic Varieties (see App. V.4, pp. 363 ff).

**Algebraic geometry point of view.** Since you are interested in general algebraically closed fields $k$, here is an interpretation for that case.
The symmetric group $S_n$ acts on $\mathbb A_k^n$ and the problem is whether the quotient set $\mathbb A_k^n /S_n$ has a reasonable algebraic structure. The answer is yes and the Viète map again descends to an isomorphism of algebraic varieties $\mathbb A_k^n /S_n \stackrel {\sim }{\to} \mathbb A_k^n $.
This is the geometric interpretation of the fundamental theorem on symmetric polynomials.
The crucial point is that the symmetric polynomials are a finitely generated $k$-algebra.

Hilbert's 14th problem was whether more generally the invariants of a polynomial ring under the action of a linear group form a finitely generated algebra. Emmy Noether proved in 1926 that the answer is yes for a finite group (in any characteristic), as illustrated by $S_n$.
However, Nagata announced counterexamples (in all characteristics) to Hilbert's 14th problem at the International Congress of Mathematicians in 1958 and published them in 1959.

Georges Elencwajg
  • A nice reference on this subject is Morris Marden's book *The Geometry of the Zeros of a Polynomial in a Complex Variable*. – Mariano Suárez-Álvarez Sep 09 '11 at 22:12
  • I didn't know that book: thanks for the reference, Mariano. – Georges Elencwajg Sep 09 '11 at 22:15
  • OK, upvoted, the proof the book gives is quite simple. But it assumes local compactness, which means (assuming algebraic closure) that it'll only work for C. (OK the general statement it gives will be true for other locally compact fields, but that statement only deals with polynomials that split over the given field.) I suppose I'll accept this, and if I really want to know about other cases I'll ask a new question... – Harry Altman Sep 14 '11 at 01:02
39

I think there might be a proof of your statement using the following complex-analysis trick (I don't know whether a similar idea could work in $\mathbb{C}_p$): if $U$ is an open subset with smooth boundary $\partial U$, consider

$$N_{U}(p) = \frac{1}{2i \pi} \oint_{\partial U} \frac{p'(z)}{p(z)}dz$$

When it's defined, $N_U(p)$ is the number of zeros of $p$ in $U$ counted with multiplicity. Then fix a polynomial $p_0$ of degree $n$, and pick $U$ a neighborhood of its zeros. Then the map $p \mapsto N_U(p)$ is well defined and continuous in a neighborhood of $p_0$, but since it can only take integer values, it's constant and equal to $n$. So if $p$ in that neighborhood has degree $n$, all its roots are in $U$.
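The integral formula itself can be checked numerically (a sketch, not the proof: the circle, its radius, and the polynomial $z^2-1$ are my choices, and the contour integral is approximated by a Riemann sum over a uniform parametrization):

```python
import numpy as np

p  = np.array([1.0, 0.0, -1.0])           # p(z) = z^2 - 1, zeros at +1 and -1
dp = np.polyder(p)                        # p'(z) = 2z

# Parametrize the circle |z| = 2, which encloses both zeros.
t  = np.linspace(0.0, 2 * np.pi, 4000, endpoint=False)
z  = 2.0 * np.exp(1j * t)
dz = 2j * np.exp(1j * t) * (2 * np.pi / t.size)   # z'(t) dt

N = np.sum(np.polyval(dp, z) / np.polyval(p, z) * dz) / (2j * np.pi)
print(N.real)                             # approximately 2, the number of enclosed zeros
assert abs(N - 2) < 1e-6
```

Because the integrand is smooth and periodic in $t$, the uniform Riemann sum converges very fast, so the count comes out essentially exactly integer.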

Joel Cohen
  • I know this is an old answer, and I like the general idea, but I have a concern. Doesn't the statement that $p\mapsto N_U(p)$ is continuous somehow uses what we want to prove? I mean if the roots are not continuous, it could be that suddenly one jumps onto $\partial U$ or out of $U$. How can we be sure that this does not happen? How have your established continuity of $N_U(p)$ in the first place? – M. Winter Mar 19 '18 at 11:06
  • @M.Winter : The continuity of $N_U(p)$ with regards to $p$ comes from the integral formula. The only potential problem would be if $p$ had a zero on $\partial U$. To prove that it can't be the case, let's denote $ \|p\|_{\infty, \partial U} = \max_{\partial U} |p|$. Since $p_0$ is continuous and non zero on the compact set $\partial U$, we have $\|p_0\|_{\infty, \partial U} > 0$. And because $p \mapsto \|p\|_{\infty, \partial U}$ is continuous, there's a neighborhood of $p_0$ where this quantity is still positive. – Joel Cohen Mar 25 '18 at 16:11
  • @JoelCohen Maybe I am understanding something wrong, but why should $\|p\|_{\infty,\partial U}>0$ imply that $p$ has no zero on $\partial U$? – M. Winter Mar 25 '18 at 23:27
  • @M.Winter : You're right ! My mistake, it should have been the minimum instead of the maximum (so the notation does not fit), let's say $m(p) = \min_{\partial U} |p|$. We have $m(p_0) > 0$ and still have $m(p) >0$ in some neighborhood of $p_0$. – Joel Cohen Mar 26 '18 at 01:02
  • What argument shows that $N_U(p)$ is the number of zeros of $p$ in $U$ counted with multiplicity? – Zbigniew May 25 '18 at 07:50
  • @Zbigniew You can compute $N_U(p)$ using the Residue Theorem and the fact that if $a$ is a root of $p$ of order $n$, then the residue of $\frac{p'}{p}$ at $a$ is $n$. Indeed, we can write $p(z) = (z-a)^n q(z)$ with $q(a) \ne 0$. And we get $p'(z) = (z-a)^{n-1} (n q(z) + (z-a) q'(z))$, so $$\frac{p'(z)}{p(z)} = \frac{1}{z-a} \frac{n q(z) + (z-a) q'(z)}{q(z)}$$ whose residue at $a$ is $$\frac{n q(a) + 0}{q(a)} = n$$ – Joel Cohen May 29 '18 at 09:05
19

I posed this as a problem in a course on local fields I taught a little while ago. One of my students, David Krumm, solved it and wrote it up here. The context of David's solution is that $K$ is an arbitrary normed field, with some chosen extension of the norm to the algebraic closure of $K$. (If $K$ is complete or even Henselian, the norm extends uniquely; in general it does not.) Then he shows that for every $\epsilon > 0$ there exists some $\delta > 0$ so that if you perturb each of the coefficients of your polynomial $f$ by at most $\delta$, every root of $f_{\delta}$ is within $\epsilon$ of some root of $f$ and vice versa. (I didn't think of this until just now, but I guess this is equivalent to saying that the sets of roots are within $\epsilon$ of each other for the Hausdorff metric.) He also shows that if $f$ itself has distinct roots, then for sufficiently small $\delta$ so does $f_{\delta}$ and then you can match up the roots in a canonical way.
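Over $\mathbb C$ the Hausdorff-metric formulation is easy to illustrate numerically (a hedged sketch: the polynomial $(z-1)(z-2)(z-3)$ and the perturbation size are my choices, and the note itself works over a general normed field, not just $\mathbb C$):

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets in the complex plane."""
    d = np.abs(np.asarray(A)[:, None] - np.asarray(B)[None, :])
    return max(d.min(axis=1).max(), d.min(axis=0).max())

f       = np.array([1.0, -6.0, 11.0, -6.0])           # f(z) = (z-1)(z-2)(z-3)
f_delta = f + 1e-9 * np.array([0.0, 1.0, -1.0, 1.0])  # perturb each coefficient by at most delta

dist = hausdorff(np.roots(f), np.roots(f_delta))
print(dist)                                           # tiny: the two root sets are Hausdorff-close
assert dist < 1e-6
```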

After he solved this problem I looked into the literature and found a dozen papers or more on various refinements of it, including some very recent ones. At the moment these papers seem to be hiding from me, but if/when I find them I'll give some references.

Pete L. Clark
  • On the Hausdorff metric: Huh, so it is. However it seems weaker than saying that some permutation of the roots of $f_\delta$ is within $\varepsilon$ of some permutation of the roots of $f$. This still may well be enough for what I originally needed, though. :) – Harry Altman Sep 12 '11 at 03:06
11

Suppose $P_a(z)=\sum\limits_{k=0}^na_kz^k$. Taking the partial derivative of $P_a(z)=0$ with respect to $a_k$, we get $$ 0=P_a^{\;\prime}(z)\frac{\partial z}{\partial a_k} + z^k $$ Thus, we get that $$ \frac{\partial z}{\partial a_k}=-\frac{z^k}{P_a^{\;\prime}(z)} $$ The existence of these partial derivatives is guaranteed by the Inverse Function Theorem.

Thus, as long as $P_a^{\;\prime}(z)\ne0$ when $P_a(z)=0$, $\frac{\partial z}{\partial a_k}$ will exist and be finite. Therefore, if $P_a$ has no repeated roots, $\frac{\partial z}{\partial a_k}$ is finite.

This shows that unless $P_a$ has repeated roots, each root is a smooth function of the coefficients.
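The formula can be sanity-checked with a finite difference (a numerical sketch, not part of the argument; the cubic $z^3-2z+1$, which has a simple root at $z=1$, is an arbitrary example):

```python
import numpy as np

P  = np.array([1.0, 0.0, -2.0, 1.0])  # P_a(z) = z^3 - 2z + 1, simple root at z = 1
z0 = 1.0
dP = np.polyval(np.polyder(P), z0)    # P_a'(1) = 1, nonzero since the root is simple

h = 1e-7
Q = P.copy()
Q[-1] += h                            # perturb the constant coefficient a_0 by h
z1 = min(np.roots(Q), key=lambda r: abs(r - z0))  # the root that stays near z0

numeric   = (z1 - z0) / h             # finite-difference approximation of dz/da_0
predicted = -z0 ** 0 / dP             # -z^k / P_a'(z) with k = 0
print(numeric, predicted)             # both approximately -1
assert abs(numeric - predicted) < 1e-4
```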

robjohn
  • Are you making the assumption that $z$ is a function of the coefficients ? – Joel Cohen Sep 09 '11 at 23:45
  • I am making the assumption that $z$ is a root of $P_a$, i.e. $P_a(z)=0$. As the vector of coefficients, $a$, varies, the root $z$ is also going to vary. So in that sense, yes, $z$ is a function of the coefficients. – robjohn Sep 10 '11 at 00:11
  • I think you proved the following: if $\frac{\partial z}{\partial a_k}$ does exist then it is equal to ... – vesszabo Sep 25 '12 at 20:11
  • @vesszabo: since $P_a^\prime(z)$ exists and $z^k$ exists, if $P_a^\prime(z)\not=0$, the first equation guarantees the existence of $\frac{\partial z}{\partial a_k}$. So I qualify my conclusion by saying the $P_a$ has no repeated roots, which guarantees that if $P_a(z)=0$, then $P_a^\prime(z)\not=0$. – robjohn Sep 25 '12 at 21:00
  • Maybe I misunderstand something. I give an example where the situation is similar. Let $f:(a,b)\to\mathbf{R}$ be invertible differentiable function, its inverse is $f^{(-1)}$. Then $f\circ f^{(-1)}=id$. Taking the differentiation and applying the chain rule we obtain that $f'\circ f^{(-1)}\cdot \left(f^{(-1)}\right)'=1$ from it we have $\left(f^{(-1)}\right)'=...$. However this proof is wrong, because we don't know $\left(f^{(-1)}\right)'$ does exist at all. So my problem is not the repeated roots. – vesszabo Sep 26 '12 at 10:05
  • @vesszabo: if $f\circ g(x)=x$, and $f^\prime(g(x))$ exists and isn't $0$, then $g^\prime(x)$ exists and equals $1/f^\prime(g(x))$. We know that the derivative of a polynomial exists everywhere, so the only thing we need to check is that the derivative is not $0$ at the roots of the polynomial. That is guaranteed when there are no repeated roots. – robjohn Sep 26 '12 at 14:23
  • Would the downvoter care to comment? – robjohn Jun 01 '16 at 12:25
  • @robjohn I know this is an *old* answer, but I think I have to emphasize vesszabo's concern again. In your first line, assume the existence of a function $z(a)$ with $P_a(z(a))=0$, which is okay by the fundamental theorem of algebra. But in the next line you assume that $z(a)$ has partial derivatives, which assumes (partial) continuity, the thing we initially were trying to prove. – M. Winter Mar 18 '18 at 14:59
  • @M.Winter: The existence of these partial derivatives are guaranteed by the [Inverse Function Theorem](https://en.wikipedia.org/wiki/Inverse_function_theorem). I have amended my answer to mention this. – robjohn Mar 18 '18 at 16:29
8

Here is my favorite proof. Let $f:\mathbb{C}^n\to\mathbb{C}^n$ be the map taking $(a_1,\dots,a_n)\in\mathbb{C}^n$ to the coefficients of the monic polynomial $\prod_{i=1}^n(x-a_i)$. This map is clearly continuous and, since it is invariant under permuting the $a_i$, it descends to a continuous map $g:\mathbb{C}^n/S_n\to\mathbb{C}^n$ where $S_n$ acts on $\mathbb{C}^n$ by permuting the coordinates. The claim is then that this map $g$ is a homeomorphism.

It is clear that $g$ is a bijection, since every monic polynomial of degree $n$ factors as a product $\prod_{i=1}^n(x-a_i)$, uniquely up to permuting the factors. So, the hard part is to prove $g^{-1}$ is continuous.

The trick for this is to homogenize the polynomials to extend the maps to projective space so that compactness gives you continuity of the inverse for free. Let us consider $\mathbb{C}$ as a subspace of $\mathbb{CP}^1$ and $\mathbb{C}^n$ as a subspace of $\mathbb{CP}^n$ in the usual way. Then $f$ extends to a map $F:(\mathbb{CP}^1)^n\to\mathbb{CP}^n$ as follows. Identify $\mathbb{CP}^1$ with the projectivization of the space of homogeneous linear polynomials in two variables, and identify $\mathbb{CP}^n$ with the projectivization of the space of homogeneous degree $n$ polynomials in two variables. Then $F$ is just the map which takes $n$ linear homogeneous polynomials and multiplies them together to get a degree $n$ homogeneous polynomial. (To see that this extends $f$, identify $a_i\in\mathbb{C}$ with the homogeneous polynomial $x-a_iy$, so then $F$ maps $(a_1,\dots,a_n)$ to $\prod_{i=1}^n(x-a_iy)$ whose coefficients are the same as those of $\prod_{i=1}^n(x-a_i)$.)

Just like $f$, this extension $F$ is invariant under permuting the inputs, so it descends to a continuous map $G:(\mathbb{CP}^1)^n/S_n\to\mathbb{CP}^n$ which extends $g$. Just like $g$, this map $G$ is easily seen to be a bijection. But now for the magic: since $(\mathbb{CP}^1)^n/S_n$ is compact and $\mathbb{CP}^n$ is Hausdorff, any continuous bijection between them is automatically a homeomorphism! Thus $G$ is a homeomorphism.

There's one detail left to check: we now know that $g$ is a homeomorphism when you consider its domain as a subspace of $(\mathbb{CP}^1)^n/S_n$, but is that subspace topology the same as the quotient topology on $\mathbb{C}^n/S_n$? The answer is yes, because $\mathbb{C}^n$ is open in $(\mathbb{CP}^1)^n$ and invariant under the action of $S_n$, so an $S_n$-invariant open subset of $\mathbb{C}^n$ is the same thing as an $S_n$-invariant open subset of $(\mathbb{CP}^1)^n$ that happens to be contained in $\mathbb{C}^n$.

Eric Wofsey
8

In the complex case, if we ignore or forbid multiple roots and fix the degree: We can assume without loss of generality that the leading coefficient is always 1 -- normalizing the coefficients is a continuous transformation of coefficient space.

Now, then, the coefficients are a continuous injective function of the roots -- we can find them by multiplying linear polynomials with the given roots together. On the other hand, with the leading coefficient fixed to 1, both the space of possible coefficients and $\mathbb C^n/\sim$ minus multiple-root points are locally just copies of $\mathbb C^n$, so the inverse mapping from coefficients back to roots also has to be continuous.

This argument ought to work in any algebraically closed topological field (or would it? I'm not actually sure how wild a topological field is allowed to be). I'm not quite sure about how well it generalizes to situations involving multiple roots, though. The best arguments for that I can imagine right away are somewhat specific to $\mathbb C$.

hmakholm left over Monica
  • I think you are first assuming the polynomial to have $n$ roots. Am I right? (The question does not explicitly say $F$ is algebraically closed.) – Srivatsan Sep 09 '11 at 21:50
  • Yes, you're right. I started to write the answer down for $\mathbb C$ and then generalized a bit too quickly. Will edit. – hmakholm left over Monica Sep 09 '11 at 21:56
  • Well, as stated in the comments, I essentially am assuming F is algebraically closed, so that's not problematic. – Harry Altman Sep 09 '11 at 22:26
  • The coefficients of a monic polynomial are simply symmetric polynomials of the roots, and therefore continuous. The function from the roots to the coefficients as a map $\mathbb{C}^n\to\mathbb{C}^n$ is definitely injective since the roots are a function of the coefficients. (In other words, I agree with you, but this seems simpler to me.) – robjohn Sep 10 '11 at 01:34
7

I think this problem can be dealt with using Rouché's theorem. Recall that Rouché's theorem in complex analysis says:

(Rouché's theorem) If $f(z)$ and $g(z)$ are analytic interior to a simple closed Jordan curve $C$ and if they are continuous on $C$ and $$ |g(z)|\lt |f(z)|,\quad z\in C,$$ then the function $f(z)+g(z)$ has the same number of zeros (counted with multiplicity) interior to $C$ as does $f(z)$.

Consider the polynomial $$ f(z)=a_0+a_1z+\cdots+a_nz^n,\quad a_n\neq 0.$$ Let $\zeta$ be a root of $f(z)$ and $\varepsilon>0$. To prove the continuity, we need to show that there exists a $\delta$ such that the perturbed polynomial $$ g(z)=a_0+\delta_0+(a_1+\delta_1)z+\cdots+(a_n+\delta_n)z^n$$ where $|\delta_i|\leq \delta$, has the same number of zeros as $f(z)$ inside the circle $C(\zeta,\varepsilon)$ with center $\zeta$ and radius $\varepsilon$.

We may suppose that $\varepsilon$ is smaller than the distances from $\zeta$ to the other zeros of $f(z)$, so that $f(z)$ is nonzero on $C(\zeta,\varepsilon)$. Since the circle is compact, $|f(z)|$ attains its minimum $m>0$ on it.

Let $h(z)=f(z)-g(z)$. Then on $C(\zeta,\varepsilon)$, we have $$|h(z)|=|\delta_0+\delta_1z+\cdots+\delta_n z^n|\leq \delta \sum_{j=0}^{n}(|\zeta|+\varepsilon)^j. $$ Thus if we choose $\delta< \frac{m}{\sum_{j=0}^n(|\zeta|+\varepsilon)^j}$, then $|h(z)|<|f(z)|$ on $C(\zeta,\varepsilon)$. Hence by Rouché's theorem, we conclude that $f(z)$ and $g(z)$ have the same number of zeros inside $C(\zeta,\varepsilon)$.
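The bound can be exercised numerically (a sketch: the polynomial $z^2-1$ with $\zeta=1$ and $\varepsilon=0.5$ is an arbitrary example, and the minimum $m$ is approximated by sampling the circle):

```python
import numpy as np

f = np.array([1.0, 0.0, -1.0])        # f(z) = z^2 - 1; take the root zeta = 1
zeta, eps, n = 1.0, 0.5, 2
t = np.linspace(0.0, 2 * np.pi, 2000, endpoint=False)
circle = zeta + eps * np.exp(1j * t)

m     = np.abs(np.polyval(f, circle)).min()         # min of |f| on C(zeta, eps)
bound = m / sum((abs(zeta) + eps) ** j for j in range(n + 1))

delta = 0.9 * bound                                 # any delta below the bound works
g = f + delta * np.array([1.0, -1.0, 1.0])          # coefficient perturbations of size <= delta

in_disk = lambda p: sum(abs(r - zeta) < eps for r in np.roots(p))
print(in_disk(f), in_disk(g))         # both 1: same zero count inside C(zeta, eps)
assert in_disk(f) == in_disk(g) == 1
```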

Xiang Yu
6

Here is a link to a short note establishing that the roots of a polynomial are $C^\infty$ functions of the coefficients using the implicit function theorem.

Potato
  • I think that note only deals with simple zeros (degenerate ones prevent the use of the implicit function theorem). (Following your link I discovered [this monograph](http://www.springer.com/us/book/9781461200598), which is quite interesting and useful for me, thank you.) – Giuseppe Negro Jan 08 '16 at 23:14