As a former physics major, I did a lot of (seemingly sloppy) calculus using the notion of infinitesimals. Recently I heard that there is a branch of math called non-standard analysis that provides a formal foundation for this type of calculus.

So, do you guys think it is a subject worth learning? Is it a branch that is growing and becoming interconnected with other branches of math? Does it make calculus any easier?

Mikhail Katz
  • My own impression is that analysis of infinitesimals is formal enough, though the non-standard approach sometimes yields faster solutions to some problems. I haven't heard of striking results obtained specifically with this technique, so maybe others will provide them in answers. – Ilya Jul 14 '11 at 17:00
  • @Scott Vasquez: Important to know about, with *some* (but not necessarily much) detail. But really *learning*, as in being able to use in research? For almost everyone, no. – André Nicolas Jul 14 '11 at 21:08
  • All mathematics is worth learning ;-). – JT_NL Jul 27 '11 at 18:01
  • There is another alternative based on nilpotent infinitesimals. It is usually referred to as smooth infinitesimal analysis. It also offers a much more intuitive and geometric approach to differential geometry than the standard ones. – Andrey Sokolov Oct 15 '13 at 03:36

7 Answers


I think this (interesting!) question is yet another instance of the occasional mismatch between science (and human perception) and formal mathematics. For example, the arguments used by the questioner, and common throughout science and engineering, were also those used by Euler and other mathematicians for 150 years prior to Abel, Cauchy, Weierstrass, and others' "rigorization" of calculus. The point is that the extreme usefulness and power of calculus and differential equations was demonstrated prior to epsilon-delta proofs!

Similarly, c. 1900 Heaviside's (and others') use of derivatives of non-differentiable functions, of the "Dirac delta" function and its derivatives, and so on, brought considerable ridicule on him from the mathematics establishment, but his mathematics worked well enough to build the transatlantic telegraph cable. "Justification" was only provided by the work of Sobolev (1930s) and Schwartz (1940s).

And I think there are still severe problems with Feynman diagrams, even tho' he and others could immediately use them to obtain correct answers to previously-thorny quantum computations.

One conclusion I have reached from such stories is that we have less obligation to fret, if we have a reasonable physical intuition, than undergrad textbooks would make us believe.

But, back to the actual question: depending on one's tastes, non-standard analysis can be pretty entertaining to study, especially if one does not worry about the "theoretical" underpinnings. However, to be "rigorous" in use of non-standard analysis requires considerable effort, perhaps more than that required by other approaches. For example, the requisite model theory itself, while quite interesting if one finds such things interesting, is non-trivial. In the early 1970s, some results in functional analysis were obtained first by non-standard analysis, raising the possibility that such methods would, indeed, provide means otherwise unavailable. However, people found "standard" proofs soon after, and nothing comparable seems to have happened since, with regard to non-standard analysis.

With regard to model theory itself, the recent high-profile proof of the "fundamental lemma" in Langlands' program did make essential use of serious model theory... and there is no immediate prospect of replacing it. That's a much more complicated situation, tho'.

With regard to "intuitive analysis", my current impression is that learning an informal version of L. Schwartz' theory of distributions is more helpful. True, there are still issues of underlying technicalities, but, for someone in physics or mechanics or PDE... those underlying technicalities themselves have considerable physical sense, as opposed to purely mathematical content.

Strichartz' nice little book "A guide to distribution theory and Fourier transforms" captures the positive aspect of this, to my taste, altho' the Sobolev aspect is equally interesting. And, of course, beware the ponderous, lugubrious sources that'll make you sorry you ever asked... :) That is, anything can be turned into an ugly, technical mess in the hands of someone who craves it! :)

So, in summary, I think some parts of "modern analysis" (done lightly) more effectively fulfill one's intuition about "infinitesimals" than does non-standard analysis.

paul garrett
  • Strichartz's book is indeed a lot of fun to learn from. – Mark Jul 14 '11 at 18:36
  • +1, but *lugubrious* sources? I've seen mathematicians be depressed when talking about life, say, but never in a textbook! – ShreevatsaR Jul 14 '11 at 20:20
  • Surely we've seen sources whose goal seems to be to convince the reader that it's all just tooo hard, tooo technical, and not good for anything but a vehicle for suffering? :) – paul garrett Jul 14 '11 at 22:47
  • I largely agree with this: "nonstandard methods" are becoming increasingly important in mathematics, but not so much in the areas of analysis where they were first applied. As to whether NSA helps to reconcile the way infinitesimals are used in physics with rigorous mathematics...I think it mostly depends on how much you enjoy saying words like "infinitesimal". Most of the physical arguments that I have seen are just as easy to formalize using $\epsilon$ and $\delta$ as NSA. – Pete L. Clark Jul 15 '11 at 03:19
  • @paul: Perhaps, but I still wouldn't describe those sources as "lugubrious" — they are depressing, not depressed. :-) – ShreevatsaR Jul 15 '11 at 07:11
  • A small comment -- the knowledge required for "rigorous use" of something is very different from the knowledge required for "constructing something in ZFC and deriving its foundations". The only logic required for the "rigorous use" of non-standard analysis (beyond that needed for standard analysis) is to be able to discern between "internal" and "external", so as to be able to apply transfer properly. – Mar 17 '12 at 11:59
  • @Pete This, to me, is really the problem with nonstandard analysis (and admittedly, this is coming from a novice with relatively little knowledge of it): I don't see where a basic "intuitive" study of nonstandard analysis produces results that are EASIER TO COMPREHEND and more insightful than the usual classical epsilons and deltas. Indeed, even if teaching this to students is successful, then what? There's a reason it hasn't overthrown the traditional model, and for those students to progress in their studies, they'll have to relearn everything the standard way anyway! So what was achieved here? – Mathemagician1234 Jul 26 '13 at 06:24
  • @Math: I have not yet mastered NSA myself, but I know something about it. (Since you like books, let me recommend Martin Davis's Dover text on the subject.) There are definitely times where NSA yields some proofs which are either technically simpler or conceptually slicker. The problem is that these rewards do not seem great enough to overcome the standard approach: as you say, every student of analysis will need to use $\epsilon$'s and $\delta$'s *as well*.... – Pete L. Clark Jul 26 '13 at 07:02
  • To learn either "SA" or NSA takes time and effort. For most ordinary mortals it also requires competent teachers and a clear sense from the community that one ought to learn these subjects. NSA is lacking on the latter two points. **What was achieved** by NSA is more clear on the research level: it has helped leading researchers prove important theorems. So I think a research mathematician needs to try to evaluate whether learning NSA will be a useful weapon in her arsenal, taking cues from recent work in her field. – Pete L. Clark Jul 26 '13 at 07:07

It can be shown that non-standard analysis (in the framework of Nelson's internal set theory) is a conservative extension of the usual ZFC. In other words, every theorem provable using non-standard analysis which can be stated using only "standard" terminology can also be proved without using non-standard analysis. A well-known example is the Invariant Subspace Problem. Bernstein and Robinson were able to resolve a special case using non-standard analysis techniques; upon reading the pre-print, Halmos was able to re-prove the same thing using standard techniques. (The proofs are roughly the same length; B&R's paper is longer because they have to introduce terminology and include an introduction.)

So in terms of practical uses of non-standard analysis, it probably will not give you any fundamentally new insight. But it will give you a language in which certain analytical manipulations are more easily stated. It is more than just keeping track of $\epsilon$s: if you unravel a lot of theorem statements in standard analysis, you get a bunch of statements consisting of $\forall$s and $\exists$s mixed up in some order, and the ordering is extremely important (think of the difference between uniform continuity and continuity). These types of statements often become simpler to state in non-standard analysis. It generally will not make it easier to come up with a proof; but it has the advantage of allowing you to spend less time worrying about $\epsilon$s and more time worrying about the bigger picture. In other words, it can give you a tool that makes rigorously realising a heuristic easier, but it won't help much in coming up with the heuristics to start with. Doing so comes at a sacrifice: because you are no longer explicitly keeping track of the $\epsilon$-$\delta$ arguments, you easily lose control of quantitative bounds (though you can retain control of qualitative statements). For a lot of mathematical analysis that is not a big deal, but there are times when the quantitative bounds are useful.
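For concreteness, compare the two standard definitions, which differ only in quantifier order: $f$ is continuous on $A$ iff

$$\forall x \in A \ \forall \epsilon > 0 \ \exists \delta > 0 \ \forall y \in A : \ |x - y| < \delta \implies |f(x) - f(y)| < \epsilon,$$

while $f$ is uniformly continuous on $A$ iff the $\forall x$ moves inside the $\exists \delta$, so that $\delta$ may not depend on $x$:

$$\forall \epsilon > 0 \ \exists \delta > 0 \ \forall x \in A \ \forall y \in A : \ |x - y| < \delta \implies |f(x) - f(y)| < \epsilon.$$

In the non-standard formulation the alternation collapses: $f$ is continuous at each standard $x$ iff $y \simeq x$ implies ${}^*f(y) \simeq {}^*f(x)$, and uniformly continuous iff that same implication holds at every hyperreal point $x$ of $^*A$.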

A propos whether it makes calculus any easier: that is for you to decide. From my point of view, because of the need to rigorously introduce the hyperreals and the notion of the standard part of an expression, for an absolute beginner it is just passing the buck. You either learn how to write $\epsilon$-$\delta$ arguments, or you accept a less-intuitive number system and learn all the rules about what you can and cannot do in this system. I think that a real appreciation of non-standard analysis can only come after one has developed some fundamentals of standard analysis. In fact, for me, one of the most striking things about non-standard analysis is that it provides an a posteriori justification of why the (by modern standards) sloppy notations of Newton, Leibniz, Cauchy, etc. did not result in a theory that collapses under more careful scrutiny. But that's just my opinion.

To sum up: it is a useful tool. Learn it if you are interested. But don't think that it is the magic bullet that will solve all your problems.

Willie Wong
  • Are most of those who study in the field of analysis familiar with the language of non-standard analysis? If one discovers a result using non-standard analysis, is it necessary for a standard proof to be discovered for the result to be widely accepted? – Jul 14 '11 at 18:39
  • As far as *I* know, no. Non-standard analysis is not very popular. (My knowledge of it is also rather shallow.) But it is established enough that a peer-reviewed proof using non-standard analysis would be accepted as true; but a standard proof would give more "impact". A lot of analysis is not so much about the final result, but about how to get there (which is why we have so many famous Lemmas). – Willie Wong Jul 14 '11 at 21:44
  • Is there a textbook on nonstandard analysis that includes a proof that it's a conservative extension of standard methods (other than in the trivial sense that the proofs that nonstandard analysis "works" are "standard" proofs)? – Michael Hardy Jul 28 '11 at 00:45
  • @Michael: I suspect you'd have to look in a textbook in Model Theory for that. But I believe a proof that Internal Set Theory is a conservative extension of ZFC is in Nelson's 1977 paper. – Willie Wong Jul 28 '11 at 02:35
  • @Willie: Thank you. One frequently hears this assertion about its being a conservative extension. It's crossed my mind to wonder whether it's something everybody thinks _someone_ else has proved _somewhere_. Nelson is a somewhat interesting character. – Michael Hardy Jul 28 '11 at 03:35
  • "It generally will not make it easier to come up with a proof; but it has the advantage of allowing you to spend less time worrying about \epsilons and more time worrying about the bigger picture." The part after the semicolon sounds precisely like something that could make it easier to come up with a proof! I guess you mean "fundamentally easier" or something like that, but in my experience a lot of people get weirdly fundamentalist when it comes to NSA. – Pete L. Clark Jul 26 '13 at 07:14
  • (E.g. some people seem to think that the "conservativeness" of NSA is a knock against its value as a proof technique. But in 99.99% of research mathematics, "new proof technique" does not mean "new axioms independent of ZFC", and searching for the latter when trying to prove a theorem would be extremely weird.) – Pete L. Clark Jul 26 '13 at 07:16
  • @PeteL.Clark: what I meant (written badly) is trying to capture the difference between intuition and a detailed proof. Your interpretation is exactly what I had in mind: another poor way to phrase it (unfortunately I have no good ways at the moment) is that NSA usually doesn't help with the initial step of figuring out "why" something is true, but it certainly makes justifying the proof a lot easier in many cases. But of course, how each person works on a proof is different; so you may be right in disagreeing with me that the "hard" part is the big picture. – Willie Wong Jul 29 '13 at 08:24
  • Nice post, Willie, but Leibniz's notation wasn't "sloppy". We published some papers on this misconception recently. – Mikhail Katz Aug 06 '14 at 15:13
  • @user72694: I found a long list of publications on your site, is it possible for you to be a little bit more specific which paper I should look at? Thanks. – Willie Wong Aug 19 '14 at 08:13
  • David Sherry and I have two papers arguing that Leibniz's notation was not inconsistent. One is in Notices AMS ('12) and the other in Erkenntnis ('13). You may also be interested in the thread http://mathoverflow.net/questions/178267/salvaging-leibnizian-formalism – Mikhail Katz Aug 19 '14 at 10:15
  • @user72694 thank you for the references. – Willie Wong Aug 20 '14 at 07:41

Terence Tao has written several interesting blog posts on how nonstandard analysis can be useful in pure mathematics. My impression of his basic point is this: nonstandard analysis allows you to carry out certain intuitive arguments that don't work in standard analysis, and it shortens the length of certain types of proofs that would otherwise require you to juggle many epsilons.

Nonstandard analysis is also an application of model theory, which has recently gotten the attention of many mathematicians: model theory, as it turns out, is an effective tool for proving things of interest to non-logicians.

Qiaochu Yuan

I cannot see how Willie Wong's example of the Bernstein-Robinson result supports his conclusion. It seems to me to do the opposite, and I am not alone here. Halmos admits as much in his autobiography: "The Bernstein-Robinson proof uses non-standard models of higher order predicate languages, and when Abby [Robinson] sent me his preprint I really had to sweat to pinpoint and translate its mathematical insight." Halmos did sweat because, as all of his comments and actions regarding NSA indicate, he was against it for philosophical or personal reasons. He was eager to downplay this result precisely because it seemed like support for using NSA, which, at least in Robinson's approach, is nonconstructive owing to its reliance on the existence of nonprincipal ultrafilters (the compactness theorem likewise relies on a fragment of the axiom of choice).

Also, the fact that a formal proof of some formula exists (which is precisely what it means to be a theorem) is only trivially relevant to the question of whether a theory might help you find a proof. Besides, who other than automated theorem-provers actually thinks in terms of formal proofs? In my experience, the concepts and tools of a theory, the objects that it lets you talk about, and the ideas that it lets you express are what make a theory useful for proving things.

One thing that the OP might find attractive about NSA is that saying "x is infinitely close to y" is perfectly fine and meaningful -- and it probably means what you already think it means: two numbers are infinitely close iff their difference is infinitely small, i.e., an infinitesimal. You also get things like halos (all numbers infinitely close to some number) and shadows (the standard number infinitely close to some number), which can be fun and intuitive concepts to think with.

For example, here is how the limit of a (hyperreal) sequence is defined. First, sequences are no longer indexed by the natural numbers $\mathbb{N}$. Rather, sequences are indexed by the hypernaturals $^*\mathbb{N}$, which include numbers larger than any standard natural. Such numbers are called infinite (or unlimited). (Warning: this is not the same concept as "infinity" in "as x goes to infinity"; infinite naturals are smaller than (positive) infinity, when it makes sense to compare them.) Now, a hyperreal L is the limit of a sequence $\langle s_n \rangle$ (indexed by $^*\mathbb{N}$!) iff L is infinitely close to $s_n$ for all infinite n.
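Restating that last sentence as a display:

$$\lim_{n} s_n = L \quad\iff\quad s_n \simeq L \ \text{ for every unlimited } n \in {}^*\mathbb{N}.$$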

For another example, consider proofs using "sloppy" reasoning where you end up with some infinitesimal term and so just ignore it or drop it from an equation (provoking derisive comments about "ghosts of departed quantities"). In NSA, rather than ignoring the term, you can actually say that it's infinitesimal and end up with a result that is infinitely close to the result of your sloppy alternative. E.g., let (the hyperfunction) $f(x) = x^2$ and consider the (I presume familiar) formula for the derivative, where we will let h be a nonzero infinitesimal:

$$\begin{align} \frac{f(x+h) - f(x)}{h} &= \frac{(x+h)^2 - x^2}{h} \\ &= \frac{x^2 + 2xh + h^2 - x^2}{h} \\ &= 2x + h \\ &\simeq 2x \end{align}$$

The symbol $\simeq$ denotes the relation "infinitely close". This derivation works because, when $h$ is an infinitesimal, $a + h$ is infinitely close to $a$ for any hyperreal $a$. Under sensible restrictions on $f$ and $x$, this derivation shows that $2x$ is the standard derivative of $x^2$, as every schoolgirl knows.
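The "drop the infinitesimal term" step can even be mechanized. Here is a minimal sketch, using *dual numbers*, a nilpotent infinitesimal with $\epsilon^2 = 0$ (closer to the smooth infinitesimal analysis mentioned in a comment above than to Robinson's hyperreals); all names are illustrative:

```python
class Dual:
    """A number re + inf*eps, where eps is an infinitesimal with eps**2 == 0."""

    def __init__(self, re, inf=0.0):
        self.re = re    # standard part
        self.inf = inf  # coefficient of the infinitesimal eps

    def _coerce(self, other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = self._coerce(other)
        return Dual(self.re + other.re, self.inf + other.inf)
    __radd__ = __add__

    def __mul__(self, other):
        # (a + b*eps)(c + d*eps) = ac + (ad + bc)*eps, since eps**2 = 0.
        other = self._coerce(other)
        return Dual(self.re * other.re,
                    self.re * other.inf + self.inf * other.re)
    __rmul__ = __mul__

def derivative(f, x):
    """Evaluate f at x + eps and read off the eps-coefficient, which is f'(x).
    The higher-order infinitesimal is discarded by nilpotency, automatically."""
    return f(Dual(x, 1.0)).inf

print(derivative(lambda x: x * x, 3.0))      # 6.0, i.e. 2x at x = 3
print(derivative(lambda x: x * x * x, 2.0))  # 12.0, i.e. 3x^2 at x = 2
```

Note one difference from the hyperreal derivation above: there is no $\simeq$ step, because $h^2 = 0$ holds exactly rather than up to an infinitesimal.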

A cost-benefit analysis for learning NSA should probably include (i) for a benefit, how interesting or valuable you find the nonstandard concepts and (ii) for a cost, how much work you'll have to do to learn it. The latter will depend on what text or approach you choose. If you are willing to take some things for granted and just use the resulting tools, you can get away with bypassing a good chunk of the model-theoretic machinery (compactness, ultrafilters, elementary extensions, transfer, formal languages). If you understand the ultrapower construction, which builds the hyperreals as equivalence classes of infinite sequences of real numbers (similar to the construction of the reals from the rationals using Cauchy sequences), then the resulting system behaves like you would expect -- relations and operations are defined componentwise. This part is relatively easy. Alternatively, you can get away with not understanding the construction very well if you are willing to internalize the definitions of the relations and operations on the hyperreals as axiomatic.
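The componentwise flavor of the ultrapower construction is easy to sketch in code, with one loud caveat: a hyperreal is really an equivalence class of sequences *modulo a nonprincipal ultrafilter*, and such an ultrafilter is nonconstructive, so it is deliberately omitted here. This toy (all names are mine, not standard library code) just represents a sequence of reals as a function `n -> r_n`:

```python
def const(r):
    """A real r embeds as the constant sequence (r, r, r, ...)."""
    return lambda n: r

def add(a, b):
    """Componentwise sum of two sequences."""
    return lambda n: a(n) + b(n)

def mul(a, b):
    """Componentwise product of two sequences."""
    return lambda n: a(n) * b(n)

# A positive infinitesimal: (1, 1/2, 1/3, ...) is eventually below any
# positive constant sequence, so "almost everywhere" it is smaller than
# every positive real.
eps = lambda n: 1.0 / (n + 1)

# An unlimited hypernatural: (1, 2, 3, ...) eventually exceeds any constant.
omega = lambda n: float(n + 1)

# Arithmetic works as intuition suggests: eps * omega is the constant
# sequence (1, 1, 1, ...), i.e. the real number 1.
prod = mul(eps, omega)
print([prod(n) for n in range(4)])  # [1.0, 1.0, 1.0, 1.0]
```

What the omitted ultrafilter buys you is a verdict on *every* comparison (e.g. whether $(1, 0, 1, 0, \ldots)$ equals $0$ or $1$), which is exactly the part no program can exhibit.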

If you want to look into NSA, I would recommend either (a) Goldblatt's Lectures on the Hyperreals if you don't have a strong background or interest in mathematical logic or (b) Hurd and Loeb's Introduction to Nonstandard Real Analysis otherwise. The latter is out of print and sadly about $100 if you want to buy it, but check libraries. It's very thoughtful and well-written. Also, if you are excited about the model-theoretic aspects, look them up in Chang and Keisler's Model Theory book as you go along. Hodges' model theory book is also very good but doesn't cover this material as extensively.

Cheers, Rachel

  • The ultrafilter lemma and the general compactness theorem are equivalent. Both are weaker than the axiom of choice. – Michael Greinecker Mar 18 '12 at 02:23
  • @MichaelGreinecker, I recall seeing that the ultrafilter lemma (UL) is weaker than AC, but every proof that I've seen of compactness uses some equivalent of AC. Can you point me to some source for the equivalence of UL and compactness or for compactness being weaker than AC? Or maybe this is a simple matter that I should just think about? Thanks! – Rachel Mar 18 '12 at 04:02
  • @Rachel: if you prove the Compactness Theorem using ultraproducts, for instance, then you can see that the only set-theoretic ingredient is the existence of ultrafilters (UL). UL is well-known to be equivalent to the Boolean Prime Ideal Theorem (BPIT), which is in turn well-known to be strictly weaker than the Axiom of Choice. (See for instance the book *Equivalents of the Axiom of Choice*.) This shows that Compactness must be weaker than AC. – Pete L. Clark Mar 18 '12 at 05:32
  • For the other direction...at least morally I think of the Compactness Theorem as being equivalent to the compactness of profinite spaces, and it is not hard to see that the latter is equivalent to UL. I too would prefer a more explicit discussion of this: I taught a short course on model theory a couple of years ago and was frustrated that the topological underpinnings of "compactness" was not made more explicit in the standard texts. – Pete L. Clark Mar 18 '12 at 05:34
  • @PeteL.Clark, yes, this makes sense. I guess whenever I have seen compactness proved via ultraproducts, Zorn's lemma was used nearby to prove UL (extending any filter to a maximal one), so I made the mental note of the dependence. I never made the remaining connections to see that ZL must not have been necessary. Interestingly, the same happens when proving compactness via completeness or Lindenbaum's theorem. You use AC or Tukey's lemma or ZL somewhere along the way. I have the book you mention, so I will look into this. Thanks. – Rachel Mar 18 '12 at 09:55
  • By the way, somehow I forgot to mention that I agree with your reaction to Willie's answer: the fact that NSA is a conservative extension of ZFC *is a good thing*. 99.9% of working mathematicians are looking for tools that can make finding / writing proofs of theorems easier, not for axioms which expand the domain of provable theorems! – Pete L. Clark May 16 '12 at 21:57
  • Also, if I hadn't already given your answer a +1, I would have just for "as every schoolgirl knows". – Pete L. Clark May 16 '12 at 21:59
  • The x^2 example is nice, but does it generalize? Can you easily prove the chain rule with infinitesimals? – lalala Dec 27 '19 at 19:59

I love NSA: but I love it because I find the mathematics fascinating and not because it allows me to prove more analysis theorems.

There is a sense in which certain calculus arguments become more natural when written in the language of NSA. However, once you have enough mathematical maturity to really understand how NSA works you probably won't have too much difficulty translating intuitive arguments that have a natural expression in the language of NSA into $\epsilon$-$\delta$ arguments.

I do recommend learning some non-standard mathematics though, because it's extremely cool. There are various approaches: Nelson-style IST (see Robert's book Nonstandard Analysis or many others), using ultrapowers (there's a book by Goldblatt), or with smooth infinitesimal analysis (A Primer of Infinitesimal Analysis by Bell -- in this theory you drop the law of the excluded middle, so not not A no longer implies A. The infinitesimals are the numbers $x$ for which it is not the case that $x \neq 0$ !!). Whichever way you take you'll meet some amazing stuff -- ultrafilters and Łoś's theorem, toposes, all sorts of fun things. But don't imagine it's a way of making hard analysis problems easy: the law of conservation of difficulty is not easy to get round.

Matthew Towers

Non-standard analysis is a beautiful subject that relates to a lot of mathematical fields. It does make some calculus arguments marginally easier, but that is not a good reason to learn non-standard analysis. Calculus is not that complicated, there is no reason to learn sophisticated methods to prove things you already know how to prove. Well, it is fun, but there are many fascinating things to learn and time is scarce.

It is safe to say that there is no big mathematical field in which knowledge of non-standard analysis is indispensable. But there are areas where non-standard analysis has shown itself to be a very powerful tool. The main "consumer" of non-standard analysis is probability theory and related areas in which stochastics matter. The classic contributions in the area are a paper by Peter A. Loeb that showed how one can use non-standard methods to construct standard measure spaces with nice properties, now known as Loeb spaces, and a paper by Robert A. Anderson in which stochastic process theory and stochastic integration are done by carrying out the intuitive idea that Brownian motion is a random walk with infinitesimal step size. A great introduction to these topics, which requires no prior knowledge of NSA, is given by Peter Loeb's article in the Handbook of Measure Theory. A gentle introduction to NSA that has been mentioned already is the book by Goldblatt.

Even if one works in probability theory and has heard about all the wonderful things one can do with Loeb spaces, we know much more about them now and know that many of their nice properties hold for a much larger class of measure spaces. A great introduction to these topics for someone with a basic knowledge of probability theory and NSA is given in the small book Model Theory of Stochastic Processes by Fajardo and Keisler. This is not basic calculus.

Michael Greinecker
  • It sounds like major analysis for advanced graduate students, Michael. I think it's probably (no pun intended) most appropriate for those interested in probability theory and its cutting-edge applications. I think mathematicians in other fields would only find it marginally interesting. – Mathemagician1234 Jul 28 '13 at 20:43

As a student of calculus who had the perennial trouble with epsilon-delta proofs and manipulations before trying my hand at non-standard calculus, perhaps I can provide a bit of perspective:

I still have trouble with epsilon-delta proofs for derivative formulae. I think that arises from a lack of familiarity with the rigorous handling of intervals, neighborhoods, sequences and the like, and I imagine that a more in-depth treatment of the subject would remedy this.

Infinitesimal calculus (as I have seen it so far) seems like a straightforward extension of the properties of "regular" numbers to "really small" numbers (infinitesimals) and "really big" numbers (infinites). The principles used for working with hyperreal numbers pretty much guarantee that an infinitesimal will behave in the same way you'd expect a "really small" real number to behave, while infinites behave as you'd expect "really big" real numbers to behave. This makes the proofs of the basic differentiation formulae much easier both to understand and to read.

I also disagree with the complaint that one needs a good grasp of set theory and mathematical logic to be able to work with hyperreals. In this sense I think of hyperreals the same way I think of imaginary numbers, or sausages. While it can be illuminating to know how the sausage is made, one does not need to know that to find the sausage tasty. As a tutor of lower-level mathematics I can say that many, if not most of the students can handle imaginary numbers just fine without knowing how to construct the complex number system.

In short, I find non-standard calculus worth learning because it makes the arguments more understandable, easier to follow and more intuitive. However, if you are already comfortable with epsilon-delta calculus I don't know if you'd find it of much use.

Jorge Medina