60

When I came across the Cauchy-Schwarz inequality the other day, I found it really weird that this was its own thing, and it had lines upon lines of proof.

I've always thought the geometric definition of dot multiplication: $$|{\bf a }||{\bf b }|\cos \theta$$ is equivalent to the other, algebraic definition: $$a_1\cdot b_1+a_2\cdot b_2+\cdots+a_n\cdot b_n$$ And since the inequality is directly implied by the geometric definition (the fact that $|\cos(\theta)| \le 1$, with equality only when $\bf a$ and $\bf b$ are collinear), then shouldn't the Cauchy-Schwarz inequality be the world's most obvious and almost-no-proof-needed thing?

Can someone correct me on where my thought process went wrong?

Second Wind
  • 23
    The definition of $\cos \theta$ in multidimensional setting is sometimes that, however that does **not prove** it is in $[-1, 1]$ without Cauchy-Schwarz (or its equivalent steps). – Macavity Mar 03 '16 at 10:25
  • Cauchy-Schwarz is actually just a weaker version of that equivalence. Suppose I asked you to prove that equivalence of definitions, how would you do it? – RKD Mar 03 '16 at 10:26
  • 27
    The Cauchy-Schwarz inequality is valid in the much more general context of inner product spaces (possibly infinite-dimensional, such as the function spaces $L^2$ or $\ell^2$). – Bernard Mar 03 '16 at 10:32
  • 6
    Cauchy-Schwarz is not a geometric identity, it is a metric one. – Masacroso Mar 03 '16 at 10:46
  • 10
    Because someone named it? Anything can be named. – PyRulez Mar 03 '16 at 11:32
  • 14
    In addition to that: any result that just keeps popping up in different contexts deserves to have a name attached to it. – J. M. ain't a mathematician Mar 03 '16 at 12:25
  • 2
    Paraphrasing a famous quote: *In death, each member of Fight Club has a name: his name is Robert Paulson*. Now, being killed in battle is for men what being proven is for theorems, so, *In death, each theorem gets a name: its name is Cauchy-Bunyakovsky-Schwarz*. – Lucian Mar 04 '16 at 06:07
  • 3
    @PyRulez Though sadly all my attempts to rename 0 to Owen's Constant have failed. – Owen Mar 04 '16 at 12:02
  • @Lucian, that should be "After proof, each theorem...", methinks. ;) – J. M. ain't a mathematician Mar 04 '16 at 14:29
  • 1
    Even if your thought process is correct, keep in mind that your thought process is infected with the communication to you of many, many thought processes preceding you. Someone, somewhere, thought these things up first, and when they thought it up, they did not have the advantage of communications from the past that you have. They thought it up *themselves*. – Lee Mosher Mar 04 '16 at 19:33
  • 3
    This question reminds me of my experience with the first, second, and third isomorphism theorems in abstract algebra (e.g., the first one, for groups, says if $f \colon G \rightarrow H$ is a group homomorphism with kernel $K$ then $f$ induces an isomorphism $G/K \rightarrow {\rm im}(f)$). I learned these as results without labels and was surprised later when I learned they were called "first," "second," and "third" since up to that point I never had any need to call them anything. It took teaching for me to figure out which was which. To me they were just "obvious isomorphism theorems." – KCd Mar 05 '16 at 01:27
  • The definition of $\cos \theta$ is a consequence of the Cauchy Schwarz inequality. Not the other way around. – Steven Alexis Gregory Mar 06 '16 at 03:12
  • @StevenGregory Depends on how you look at it. There are definitions of $\cos\theta$ which don't need the C(B)S inequality. But they do coincide, of course. – 5xum Mar 10 '16 at 13:40
  • @5xum : Of course. I should have said "sometimes". – Steven Alexis Gregory Mar 10 '16 at 15:00
  • The Cauchy-Schwarz inequality is trivial once you have defined the inner product and the underlying Hilbert space. Otherwise, it's not. – reuns Nov 23 '16 at 09:20

6 Answers

118

Side note: it's actually the Cauchy-Schwarz-Bunyakovsky inequality, and don't let anyone tell you otherwise.

The problem with using the geometric definition is that you have to define what an angle is. Sure, in three-dimensional space, you have pretty clear ideas about what an angle is, but what do you take as $\theta$ in your equation when $\bf a$ and $\bf b$ are $10$-dimensional vectors? Or infinite-dimensional vectors? What if $\bf a$ and $\bf b$ are polynomials?

The Cauchy-Schwarz inequality tells you that anytime you have a vector space and an inner product defined on it, you can be sure that for any two vectors $u,v$ in your space, it is true that $\left|\langle u,v\rangle\right| \leq \|u\|\|v\|$.

Not all vector spaces are simple $\mathbb R^n$ businesses, either. You have the vector space of all continuous functions on $[0,1]$, for example. You can define the inner product as

$$\langle f,g\rangle=\int_0^1 f(x)g(x)dx$$

and use Cauchy-Schwarz to prove that for any pair $f,g$, you have $$\left|\int_{0}^1f(x)g(x)dx\right| \leq \sqrt{\int_0^1 f^2(x)dx\int_0^1g^2(x)dx}$$

which is not a trivial inequality.
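
As a rough numerical illustration (not part of the original answer), the integral inequality can be checked for a specific pair of functions. The sketch below assumes $f(x)=e^x$ and $g(x)=\sin(3x)$, chosen arbitrarily, and approximates the integrals with a midpoint rule; the helper names `integrate` and `cs_sides` are made up for this example.

```python
import math

def integrate(h, a=0.0, b=1.0, n=10_000):
    """Approximate the integral of h over [a, b] with the midpoint rule."""
    dx = (b - a) / n
    return sum(h(a + (k + 0.5) * dx) for k in range(n)) * dx

def cs_sides(f, g):
    """Return (|<f,g>|, ||f|| * ||g||) for <f,g> = integral of f*g over [0, 1]."""
    lhs = abs(integrate(lambda x: f(x) * g(x)))
    rhs = math.sqrt(integrate(lambda x: f(x) ** 2) * integrate(lambda x: g(x) ** 2))
    return lhs, rhs

# Arbitrary example functions (an assumption of this sketch).
f = math.exp
g = lambda x: math.sin(3 * x)

lhs, rhs = cs_sides(f, g)
print(lhs, rhs, lhs <= rhs)
```

A check like this only samples one pair of functions, of course; the point of the theorem is that the bound holds for every pair, with equality exactly when one function is a scalar multiple of the other.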

RGS
5xum
  • 34
    I thought it was the Cauchy-Bunyakovsky-Schwarz Inequality...I appeared to have been mistaken. – S.C.B. Mar 03 '16 at 11:22
  • Aren't absolute values missing on the inner product? – Eff Mar 03 '16 at 11:32
  • 11
    I would add that this inequality actually *is* the definition of an angle: the angle between $u$ and $v$ is the only $\theta \in [0,\pi]$ such that $\cos(\theta) = \frac{\langle u,v\rangle}{\|u\|\cdot \|v\|}$, the existence being equivalent to the above inequality. – Captain Lama Mar 03 '16 at 11:32
  • 1
    @Eff yeah, thanks – 5xum Mar 03 '16 at 12:08
  • 2
    @CaptainLama Well, it's certainly *one of* the equivalent definitions of an angle. There's also the geometric definition, and luckily, the two match. – 5xum Mar 03 '16 at 12:08
  • What is the geometric definition (in infinite-dimensional spaces for instance)? – Captain Lama Mar 03 '16 at 12:22
  • @CaptainLama Not in infinite dimensions, of course. I forgot to write that. So you could say either that the inner product definition is a *generalization* of the standard definition of angles (arclength), or that it is **the** definition of angles, which has an equivalent definition in finite dimensions. – 5xum Mar 03 '16 at 12:24
  • 3
    Actually, now that I think of it, you can just say that $u$ and $v$ span a plane, and you can pick your favorite definition of (non oriented, of course) angle in this plane. – Captain Lama Mar 03 '16 at 12:27
  • 3
    Looking at Buniakovsky's paper, in hindsight at least, it does not look like he added anything that Cauchy didn't know. The limit of a discrete inequality, filled in the obvious way with values of a function on an interval, gives the analogue for integrals instead of sums. Maybe the addition of B's name to the inequality is a Russian thing that carried over to the Eastern European countries? – zyx Mar 03 '16 at 22:31
  • @zyx As far as I see it, Cauchy proved the inequality only for finite-dimensional spaces, Bunyakovsky proved it for the integral case, and then Schwarz sealed the deal by proving it in general. I think Bunyakovsky's step was an important one in moving the thought away from standard $\mathbb R^n$ vectors into more abstract spaces. – 5xum Mar 03 '16 at 22:42
  • I added a question on the attribution of the inequality. http://math.stackexchange.com/questions/1682180/who-attached-buniakovskys-name-to-the-cauchy-schwarz-inequality – zyx Mar 03 '16 at 23:25
  • 3
    @MXY On the contrary - C-B-S is the correct, chronological sequence that corresponds to successive generalizations and more in line with its less abstract use when the Schwarz part is dropped. C-B-S is the name used in Russian literature when the S part is included. – A.S. Mar 04 '16 at 01:11
  • 2
    Thank you so much, I feel so stupid now (which is the best feeling in the world btw) – Second Wind Mar 05 '16 at 18:43
35
  1. The inequality is ubiquitous, so some name is needed.

  2. As there is no cosine in the statement of the inequality, it cannot be called "cosine inequality" or anything like that.

  3. The geometric interpretation with cosines only works for finite-dimensional real Euclidean space, but the inequality holds and is used more generally than that. That is Schwarz' contribution.

  4. Schwarz founded the field of functional analysis (infinite-dimensional metrized linear algebra) with his proof of the inequality. That is important enough to warrant a name. In terms of consequences per line of proof it is one of the greatest arguments of all time.

  5. The Schwarz proof was part of the historical realization that Euclidean geometry, with its mysterious angle measure that seems to depend on notions of arc-length from calculus, is the theory of a vector space equipped with a quadratic form. That is a major shift in viewpoint.

  6. Stating the inequality in terms of cosines assumes that the inner product restricts to the standard Euclidean one on the 2 (or fewer) dimensional subspace spanned by the two vectors, and that you have proved that the inequality holds for standard Euclidean space of 2 dimensions or less. How do you know those things are correct without a much longer argument? That argument will, probably, include somewhere a proof of the Cauchy-Schwarz inequality, maybe written for 2 dimensions but working for the whole $n$-dimensional space, so it might as well be stated as a direct proof for $n$ dimensions. Which is what Cauchy and Schwarz did.

zyx
  • 3
    In my opinion, $1.$ is one of the most important aspects, which the other answers barely mention. After all, we have a name for the empty set, for the trivial topology, Euler's formula etc. – Aloizio Macedo Mar 04 '16 at 00:36
  • 1
    @AloizioMacedo "why is it ubiquitous" is an important question. e.g. (3), (4), (5). – djechlin Mar 04 '16 at 01:45
  • 1
    I don't get (3); in fact, I was under the impression that the CS inequality allows us to define the angle between any two elements of an inner product space. If so, then I'd say that the geometric interpretation works just fine! – goblin GONE Mar 04 '16 at 02:54
  • 6
    @goblin Yes, you can use CS to define an angle. But the point zyx is making is that you cannot start from a geometric interpretation to derive CS *in general*, since in a general setting you don't have a geometrical interpretation to start with. That comes only *after* you prove CS. – bartgol Mar 04 '16 at 06:36
  • 4
    @goblin: in my personal experience (20 years as a mathematician working in functional analysis) the CS inequality is a daily thing, while I have never used the notion of angle in a vector space. – Martin Argerami Mar 04 '16 at 12:12
  • @Martin that's because funcan is divorced from the concrete. As soon as you define correlation between 2 random variables (functions), geometric "angle" interpretation becomes important for intuition of how $\rho_{XZ}$ depends on $\rho_{XY}$ and $\rho_{YZ}$. – A.S. Mar 04 '16 at 19:50
  • If you don't use rotations through an angle, or addition of angles, they are not playing the role of angles as in geometry, and are only a measure of separation. A role for which many other quantities can be used. I think this is what happens in statistics. Maybe rotations are used in the theory of OLS regression but I cannot think of an example. @A.S. – zyx Mar 04 '16 at 19:57
  • @zyx It's as simple as $\theta_{XZ}\in[|\theta_{XY}-\theta_{YZ}|,\theta_{XY}+\theta_{YZ}]$. – A.S. Mar 04 '16 at 20:01
  • 2
    @A.S.: divorced from the concrete? Like in quantum mechanics, which is the most basic description of reality that humans have come up with? Like Sobolev spaces, which are a basic staple of **applied** analysis? Like the finite element method (computational fluid dynamics)? Like partial differential equations (the language of classical physics)? – Martin Argerami Mar 04 '16 at 20:57
  • @Martin Those are areas of application - some of questionable concreteness ("nobody understands quantum mechanics") - not functional analysis itself. Maybe you can illuminate to me the *concrete* meaning of inner product in funcan or any of the applications you listed. In probability, for example, inner product (properly scaled) is called correlation and describes linear dependence between variables, which is a somewhat concrete, tangible, though sometimes very misleading quantity. Thinking of it as an angle is useful. I don't know such examples in other areas so help me out. – A.S. Mar 04 '16 at 22:12
  • @A.S. that says $|\theta_{XY}|$ is a distance metric ("the angles ... are only a measure of separation"). There are many such quantities for random variables and no special reason to use correlation angle as one of them except that it is easily computed. Although it's an "angle" in the sense that inverse cosine function appears in the formula, statistics does not make much use of the "rotations" that leave the inner product Cov(.,.) invariant. You can add the angles as real numbers, but the addition has no statistical meaning as composition of rotations. – zyx Mar 04 '16 at 22:17
17

Cauchy-Schwarz is not just that. The result that you stated is just a special case of Cauchy-Schwarz in Euclidean spaces. But it's still valid in any inner product space, equipped with any inner product. The proof is still easy, but nobody said that a proof had to be long and difficult for a result to be given a name. The fact is that the Cauchy-Schwarz inequality is very useful in many applications, from geometry to probability theory, and that's why it's worth having its own name.
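
To show how short the argument can be, here is a sketch of one standard proof, assuming a real inner product space (the complex case needs a small modification). The symbols $u$, $v$ and the real parameter $t$ are chosen here for illustration. If $v=0$ both sides are zero; otherwise, for every real $t$,

$$0 \le \|u+tv\|^2 = \|u\|^2 + 2t\,\langle u,v\rangle + t^2\|v\|^2 .$$

A quadratic in $t$ that is never negative cannot have two distinct real roots, so its discriminant is at most zero:

$$4\langle u,v\rangle^2 - 4\|u\|^2\|v\|^2 \le 0 \quad\Longrightarrow\quad \left|\langle u,v\rangle\right| \le \|u\|\,\|v\| .$$

Nothing in this argument refers to angles or to the dimension of the space; it uses only bilinearity, symmetry, and positivity of the inner product.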

Augustin
10

In short, it deserves a name, because it is important enough to devote a full book to this inequality: The Cauchy-Schwarz Master Class. An Introduction to the Art of Mathematical Inequalities, 2004, J. M. Steele.

First, and historically, the inequality progressively emerged in three bodies of works, one involving finite sums, the others with integral formulas, in one and two dimensions, where the notion of cosine might be less evident (back then). On page 10 of this book, a glimpse of the story:

Augustin-Louis Cauchy (1789–1857) published his famous inequality in 1821 in the second of two notes on the theory of inequalities that formed the final part of his book Cours d’Analyse Algébrique

This bound [in the form of integrals] first appeared in print in a Mémoire by Victor Yacovlevich Bunyakovsky which was published by the Imperial Academy of Sciences of St. Petersburg in 1859.

In particular, it does not seem to have been known in Göttingen in 1885 when Hermann Amandus Schwarz (1843–1921) was engaged in his fundamental work on the theory of minimal surfaces [with a] need for a two-dimensional integral analog of Cauchy’s inequality.

Often, objects are named afterward, as a recognition of the previous works.

I discovered the book recently, and I believe it deserves attention because of the many implications of this inequality, the interesting tricks and the subtle reasoning. For instance, the book offers an inductive proof in finite dimensions, which the author deems novel. There are a few books on "inequalities", but not so many on only one of them, especially one considered basic. That is because this inequality is paradigmatic. The text:

is designed to coach readers toward mastery of the most fundamental mathematical inequalities.

Cauchy-Bunyakovsky-Schwarz is used in a systematic way to open to the geometry of squares, convexity, the power means ladder, majorization, Schur convexity, exponential sums, and the inequalities of Hölder, Hilbert, and Hardy...

Laurent Duval
  • 7
    Isn't this a bit circular? You're just shifting the question to "Why does the C-S inequality even have a book written about it?" – Najib Idrissi Mar 03 '16 at 16:46
  • 2
    There was a little bit of irony in my answer, indeed. But often, many objects are named afterward, as a recognition of the previous works. I have discovered the book recently, and I believe it deserves attention, because of its many implications and subtle reasoning. There are a few books on "inequalities", not so many on only one of them, especially when considered basic. – Laurent Duval Mar 03 '16 at 17:26
  • 2
    @NajibIdrissi noting the existence of a few hundred pages of answer and linking to it isn't circular, though. – djechlin Mar 04 '16 at 01:46
  • @djechlin If the answer requires a full book to be answered, then the question was too broad. If it doesn't require the full book, then key points (other than why the inequality is named after these people...) could at least have been mentioned in the answer; link-only answers are discouraged. And this answer is not phrased as you suggest: it literally says "It deserves a name, because it is important enough to devote a full book to [it]", not "read this book to understand why". – Najib Idrissi Mar 04 '16 at 12:22
  • 3
    @NajibIdrissi okay, just downvote the answer or something. – djechlin Mar 04 '16 at 18:14
  • It is misleading to suggest that Steele's book is devoted fully to the CS inequality. It is a beautiful book, but while it starts with Cauchy-Schwarz it covers many, many other inequalities. – KCd Dec 27 '18 at 04:54
9

The Cauchy-Schwarz inequality can be stated and proven as a more general algebraic result (i.e. independent of vector spaces) which can then be applied to the components of vectors in inner product spaces.

It says that given two finite sequences of $n$ numbers $(a_i)_{i=1}^{n}$ and $(b_i)_{i=1}^{n}$, then $$\left|\sum_{i=1}^{n} a_i b_i\right| \le \left(\sum_{i=1}^{n} |a_i|^2\right)^{1/2}\left(\sum_{i=1}^{n} |b_i|^2\right)^{1/2}$$

See https://www.math.ucdavis.edu/~hunter/intro_analysis_pdf/ch13.pdf (page 293) for a proof for real numbers, which generalises very easily to complex numbers.
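
A quick sanity check of the algebraic statement (a sketch added for illustration, not a proof): the snippet below draws random complex sequences and confirms that the right-hand side is never smaller than the left, up to floating-point rounding. The helper names `cs_gap` and `rand_complex`, the seed, and the tolerance are all arbitrary choices made for this example.

```python
import math
import random

def cs_gap(a, b):
    """Return rhs - lhs for |sum a_i*b_i| <= sqrt(sum |a_i|^2) * sqrt(sum |b_i|^2)."""
    lhs = abs(sum(x * y for x, y in zip(a, b)))
    rhs = math.sqrt(sum(abs(x) ** 2 for x in a)) * math.sqrt(sum(abs(y) ** 2 for y in b))
    return rhs - lhs

def rand_complex():
    """A random complex number with real and imaginary parts in [-1, 1]."""
    return complex(random.uniform(-1, 1), random.uniform(-1, 1))

random.seed(0)  # fixed seed so the run is reproducible
gaps = []
for _ in range(1000):
    n = random.randint(1, 10)
    a = [rand_complex() for _ in range(n)]
    b = [rand_complex() for _ in range(n)]
    gaps.append(cs_gap(a, b))

# The gap should be non-negative, allowing a tiny tolerance for rounding.
print(min(gaps) >= -1e-12)
```

For proportional real sequences such as `[1, 2, 3]` and `[2, 4, 6]` the gap is zero up to rounding, which matches the equality case of the inequality.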

Tom Collinge
-1

Flip your argument around. The Cauchy-Schwarz inequality makes thinking about angles seem outmoded, vestigial, obsolete. You don't need protractors to do geometric analysis anymore. You just need algebra: just one quadratic inequality. So much of what your proof using $\cos \theta$ has been doing only requires a simple algebraic inequality, and neither $\cos$ nor $\theta$.

Why algebra is important is outside the scope of this answer.
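
One possible illustration (a sketch, assuming a real inner product space with vectors $u$ and $v$): the triangle inequality, which in plane geometry is often argued from the law of cosines, follows from Cauchy-Schwarz with no mention of an angle.

$$\|u+v\|^2 = \|u\|^2 + 2\langle u,v\rangle + \|v\|^2 \le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = \left(\|u\|+\|v\|\right)^2 ,$$

so $\|u+v\| \le \|u\| + \|v\|$. The only non-trivial step is $\langle u,v\rangle \le \|u\|\,\|v\|$, which is the role $\cos\theta \le 1$ would play in the angle-based argument.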

djechlin
  • You should back up your claims here by e.g. giving an example or two of how to use the CS inequality to do things that we once needed angles to do. – goblin GONE Mar 04 '16 at 02:51
  • Also, I think you should be more specific. For example, if you're saying that we should replace the notion "angle between $x$ and $y$" with the value $$\frac{\langle x,y \rangle}{\|x\| \|y\|},$$ then you should make this explicit. – goblin GONE Mar 04 '16 at 02:52