I am looking for examples of problems that are easier in the infinite case than in the finite case. I really can't think of any good ones for now, but I'll be sure to add some when I do.

Martin Sleziak
  • 50,316
  • 18
  • 169
  • 342
  • 102,465
  • 18
  • 125
  • 258
  • What counts as an `infinite case`? Is for example a Taylor series `infinite`, and its partial sums `finite`? – dxiv Aug 09 '16 at 02:30
  • 5
    It’s been a while since I’ve thought of this, but I think that “any two bases of a vector space have the same cardinality” qualifies. – Lubin Aug 09 '16 at 02:44
  • 2
    People can compute cohomology/homology/homotopy stuff of stable objects easier than they can compute such invariants of their unstable/finitary versions. – Pedro Aug 09 '16 at 03:53
  • 3
    In general, it is sometimes easier to answer an "infinite" question first (like "are there infinitely many prime numbers?"), the "infinite" question being both easier and more natural, and then it turns out that the "finite" case, which at first was a secondary consideration, is much more difficult. The only example I can think up right now is that of prime numbers (which it turns out is not as good an example as I first thought, since there are combinatorial ways of computing values of $\pi(n)$), but I'm sure more exist. – Will R Aug 09 '16 at 08:09
  • 1
    My first thought was the classification of simple finite groups versus the classification of simple Lie groups, though that's not exactly finite vs. infinite as Lie groups are just a small subset of the infinite groups. – pregunton Aug 09 '16 at 08:23
  • Although it seems irrelevant, but this question reminded me of Richard K. Guy's [famous statement](https://en.wikipedia.org/wiki/Strong_Law_of_Small_Numbers): _There aren't enough small numbers to meet the many demands made of them._ – polfosol Aug 09 '16 at 09:43
  • @WillR Please do turn this into an answer. – John Dvorak Aug 09 '16 at 09:45
  • @JanDvorak: I would love to, and maybe I will if nobody else does and I have enough time to think about it, but I don't think I have enough experience to really give an insightful answer. For some reason it strikes me as something that Timothy Gowers or Terence Tao might have written about on one of their blogs (it's a topic that lends itself well to that kind of writing, I think), but I don't recall reading anything on the topic before. – Will R Aug 09 '16 at 14:08
  • 2
    Many finite Ramsey style theorems are harder to prove than some of the infinite ones... – Stefan Mesken Aug 09 '16 at 15:28
  • Also, even though it's probably not exactly what OP had in mind: Using the Gale-Stewart theorem, which deals explicitly with infinite games, it's easy to see that many finite games have a winning strategy. On the other hand, it's not at all clear how to prove this without referring to infinite games. This leads to some weird phaenomena. For example, we know that chess (it may actually be some slight variation of chess) admits a winning strategy for white or black, but we don't know which one it is. – Stefan Mesken Aug 09 '16 at 15:34
  • The question is a big-list, and very broad, but a *useful* big list. Voting to reopen. – 6005 Aug 09 '16 at 22:59
  • these sorts of questions are all over the place in economics – MichaelChirico Aug 10 '16 at 14:02
  • Basic arithmetic on a finite set $M= \{ 0, 1, 2, ... Max\}$ where the successor of $Max$ is undefined is pretty much impossible. Even in basic arithmetic it is really convenient to have an infinite supply of numbers -- the ultimate bookkeeping device! – Dan Christensen Aug 11 '16 at 03:03
  • I must say, I find the amount of responses (their detail, the work put into it etc.) quite fascinating with respect to the short and quite unspecific question. At least it triggered a lot of thought processes. ;) – AnoE Aug 12 '16 at 14:54
  • It 's easier to construct an infinite example of a 2D network w/Dirichlet-to-Neumann map squared equal minus Laplacian then a finite one: https://en.wikibooks.org/wiki/On_2D_Inverse_Problems/An_infinite_example – DVD Aug 13 '16 at 02:11
  • Maybe one more example can be building classical geometry using infinite lines and infinite space – Pavel Jan 12 '17 at 22:14
  • estimating ``1/n`` is easier for infinite ``n`` than for finite ``n``. Does this qualify? – PatrickT Nov 05 '17 at 06:40

26 Answers26


One can compute the value of $$\int_{0}^{\infty}e^{-t^{2}}\,\mathrm{d}t$$ exactly. This is known as the Gaussian integral, and it has its own Wikipedia page. The answer turns out to be $\frac{1}{2}\sqrt{\pi}.$

But one cannot do the same with $$\int_{0}^{x}e^{-t^{2}}\,\mathrm{d}t$$ because the antiderivative of the integrand is not an elementary function. This is why we gave a name to the error function $$\mathrm{erf}(x) = \frac{2}{\sqrt{\pi}}\int_{0}^{x}e^{-t^{2}}\,\mathrm{d}t,$$ which also has its own Wikipedia page.

In that sense, the infinite case is easier than the finite case.

Addendum: The same phenomenon occurs for variants of this integral, in particular we can transform the integrand to evaluate $$\int_{-\infty}^{\infty}ae^{-(t-b)^{2}/(2c^{2})}\,\mathrm{d}t = \sqrt{2}a\lvert c\rvert \sqrt{\pi}$$ as detailed here on Wikipedia.

Will R
  • 8,668
  • 4
  • 18
  • 36
  • I don't think this is what the questioner meant. – jwg Aug 09 '16 at 08:31
  • 33
    @jwg: Thanks for your feedback. What do you think the questioner meant? What criteria does this answer fail to meet? – Will R Aug 09 '16 at 08:32
  • 11
    Put another way, if this answer the question that the OP intended, then it was an uninteresting question. – jwg Aug 09 '16 at 12:55
  • 18
    @jwg: You haven't answered my questions. How am I supposed to improve this answer, or give a better answer, if I am interpreting the question incorrectly? If indeed I am interpreting the question incorrectly, it would help if you (or anyone else) could give me some advice on where I am going wrong. That being said, I feel an impulse to point out that you have not said the same about any of the other answers so far: what makes this answer different to the others? – Will R Aug 09 '16 at 13:42
  • 14
    @jwg, put another way, what meaning would make the question interesting? – Paul Draper Aug 09 '16 at 13:46
  • 5
    @WillR I assume the question is looking for examples of theorems which are easy to prove about infinite sets/rings/fields/groups/etc., but harder to prove for finite objects. – Tavian Barnes Aug 10 '16 at 13:32
  • 3
    @TavianBarnes: That's a fair enough interpretation. I took the word "problems" in the question a little more broadly than you, as, apparently, so did some others. Probably the question as currently phrased is a little too broad; but it's obviously the kind of question to incite some interesting answers, so I haven't voted to close. I'll admit my answer is somewhat mundane since almost everybody knows about it, but it strikes me as the canonical example of a "problem" which is "easier" in the "infinite case" than in the "finite case." – Will R Aug 10 '16 at 14:30
  • 4
    Nothing wrong with Will R's interpretation of "problems". The theme of integration over an infinite versus a finite domain brings to mind the problem of finding the electric field produced by an infinite plane of uniform surface charge; the fact that the plane is infinite means there is symmetry which allows for Gauss's law to be used. On the other hand, finding a closed-form expression for the electric field produced by a finite plane of charge is a horrible problem. –  Aug 10 '16 at 16:00

First thought would be series vs. partial sums (or improper vs. definite integrals) e.g.

$$ \sum_{k=0}^{\infty} \frac{1}{k!} \;=\; e $$

$$ \sum_{k=0}^{n} \frac{1}{k!} \;=\; \frac{e \cdot \Gamma(n+1,1)}{n!} $$

  • 68,892
  • 6
  • 58
  • 112
  • I'm not really so sure, calculating $e$ is impossible. – Asinomás Aug 09 '16 at 03:22
  • 6
    @CarryonSmiling Calculating the incomplete gamma function is even *more* impossible than calculating `e` itself ;-) It's not entirely clear what you meant by that `easier` in the title, which I guess is part of the game. – dxiv Aug 09 '16 at 03:27
  • 6
    @CarryonSmiling: when you say calculating $e$ is impossible, you have a limited view of calculation. The first line here is either the definition of $e$ or a theorem depending on the route you take. In either case it defines a specific real number. You are saying, correctly, that in finite time we can only calculate finitely many decimal places of $e$, but defining that constant is very useful even if we cannot find its decimal (or binary) representation. – Ross Millikan Aug 09 '16 at 03:32
  • 3
    We name certain reals, like $e$ and $\pi$ because they are useful. It doesn't matter that we don't have the whole decimal expansion. What matters is they are the same real that shows up lots of places, so we can use what we have learned about them in one place to help with another place. – Ross Millikan Aug 09 '16 at 03:37
  • 15
    This is misleading: you can define e as such as sum. In that sense, there is no evaluation there. – Pedro Aug 09 '16 at 03:52
  • 3
    @dxiv what do you mean by the second argument in the gamma function, i.e. what is $\Gamma(\cdot,m)$? – Therkel Aug 09 '16 at 06:31
  • 2
    @Therkel The "upper" [Incomplete Gamma Function](http://mathworld.wolfram.com/IncompleteGammaFunction.html). – dxiv Aug 09 '16 at 06:37
  • 2
    @CarryonSmiling : Define "calculating $e$"? Decimal expansions are a thing had by numbers, but numbers are not decimal expansions.Octal expansions are a thing had by numbers, but numbers are not octal expansions. Otherwise $10_{10}$ and $12_8$ are *different* numbers (which is false). – Eric Towers Aug 09 '16 at 14:50
  • 1
    I don't understand how this example answers the question. The first number is irrational and can't be calculated exactly. But the second number is rational and pretty easy to write as a fraction – Jay Aug 10 '16 at 01:25
  • @Jay Please show us the `pretty easy to write` fraction for $n = 10^6$ ;-) The question was *not* about exact calculations, and my answer was *not* about exact calculations. Rather, it's an example of something that's simpler in the infinite case, and could count as *easier* in the context of OP's question. – dxiv Aug 10 '16 at 01:37
  • 2
    I mean, for any $n$ the fraction is $\displaystyle \frac{\sum_{k=0}^n \prod_{i=k+1}^n i}{n!}$ which is a ratio of two integers. I still think this is an easier problem than the infinite case, which I thought was the point of the question. But that's just my opinion – Jay Aug 10 '16 at 01:49

$\min c^Tx$

subject to $x \in P$

where $P$ is a bounded polyhedral can be solved easily by linear programming method. Ellipsoid method shows that the problem is in $P$ class.

$\min c^Tx$

subject to $x \in P \cap \mathbb{Z}^d$

that is imposing integer constraint increase the difficulty of the question.

Here $P$ is an infinite set in the sense that it contains infinitely many points but $P \cap \mathbb{Z}^d$ contains only finitely many points. Linear programming problem is easier than integer programming problem.

Siong Thye Goh
  • 144,233
  • 20
  • 83
  • 146
  • 24
    Isn't it an open problem (P = NP) whether integer programming is actually harder than linear programming? – templatetypedef Aug 09 '16 at 20:17
  • 2
    true. good point. – Siong Thye Goh Aug 09 '16 at 23:49
  • 7
    @templatetypedef But is computational hardness the same thing as hardness to people? – Jacob Wakem Aug 10 '16 at 02:41
  • @Alephnull It's unclear, but it's entirely possible that this problem is actually easier than the continuous case. We just don't know. – templatetypedef Aug 10 '16 at 03:21
  • 6
    @templatetypedef I don't think anyone actually believes that though – Asinomás Aug 10 '16 at 03:25
  • @CarryonSmiling: Donald Knuth thinks P = NP... – user541686 Aug 11 '16 at 02:41
  • @6005: Just to be clear, he quite literally [says](http://www.informit.com/articles/article.aspx?p=2213858) *"I've come to believe that P = N P"*, so that's not my interpretation of anything -- that's exactly what he said and believes. But yes, he doesn't think it would be of practical value. But given that the father of complexity theory already believes something that goes against the mainstream opinion, it's not that unlikely that some people are willing to go the extra step and believe the practical hardness is similar. – user541686 Aug 11 '16 at 05:34
  • @Mehrdad Thanks for referring the quote. I seem to remember a follow-up question where he said that wasn't his main point and I still get the sense he's partially saying it just to be a dissenting voice. It doesn't matter though. I've retracted my previous statement and this is off-topic regardless. – 6005 Aug 11 '16 at 05:38
  • 20
    But the difference in difficulty here is due to "discrete vs. continuous", not "infinite vs finite" – BlueRaja - Danny Pflughoeft Aug 11 '16 at 22:00

Circle packing. Packing in a plane (infinite case) vs. packing in arbitrarily given bounded area (finite case).

Kamil Maciorowski
  • 2,790
  • 1
  • 14
  • 23
  • 2
    This is an *excellent* example, and extremely understandable. Well done. – Wildcard Aug 13 '16 at 02:12
  • Even the packing of square tiles inside a simple (even with integer sides) right triangle is quite challenging – G Cab Aug 19 '16 at 22:44
  • 1
    The infinite case isn't too easy either, though ... In fact there's the issue of even defining what the problem is. What it means to be "the" best packing is apparently unresolved, although you can of course prove a packing has the maximum possible density. – 6005 Sep 02 '16 at 18:21

Proving that a polynomial is zero

Suppose you have an unknown polynomial $f$ of degree $d$, but you are able to say things about the values of $f$.

If you are working over an infinite field, an easy test is to check if $f(a) = 0$ for $d+1$ distinct values of $a$.

If you are working over a finite field of $d$ elements or less, you can have the unenviable problem that $f(a) = 0$ for every $a$, even when the polynomial $f(x)$ is nonzero, so this approach won't work in general.

(example: $f(x) = x^q - x$ when working over the field of $q$ elements)

  • 1
    More explicitly maybe, over finite fields there aren't $d+1$ distinct values to check. Though, when would you have a situation like this? Where can one use this argument? I've never seen it before and I think it would be neat to see. – MCT Aug 09 '16 at 14:06
  • 5
    @MichaelTong It's hard coming up with interesting examples on the spot. But here's one: suppose you have a complex polynomial with the property that $f(a)$ is a real number whenever $a$ is a real number. Let $\bar{f}$ be the polynomial you get by conjugating its coefficients. Then $\bar{f}(a) = f(a)$ whenever $a$ is a real number, and $f - \bar{f}$ as infinitely many roots! Consequently, the coefficients of $f$ must be real numbers. –  Aug 09 '16 at 14:14
  • 6
    @MichaelTong A maybe less interesting but more important application is that one method of doing arithmetic with polynomials is to record its *values* rather than its *coefficients*. In particular, this makes multiplying polynomials easy. If you use $n$ different values, then there is at most one polynomial of degree less than $n$ attaining those values; so if you do arithmetic this way and then later find the coefficients of a polynomial with those values, you can be sure you got the right answer if the degree you were looking for is less than $n$. Keyword: "evaluate-interpolate". –  Aug 09 '16 at 14:19
  • Testing if a polynomial function evaluates to $0$ everywhere requires *fewer* point evaluations for a finite field when $d \geq |F|$, and the same number of tests otherwise. Testing whether an element of $F[X]$ is $0$ requires the same number of the same field operations whether $F$ is finite or infinite. As long as the meaning of zeroness-testing is kept consistent between finite and infinite fields, the infinite case is no easier. – zyx Aug 10 '16 at 03:17

In digital sample analysis (signal, image processing, data science), data is generally discrete and finite. "Discrete" generally means "regularly sampled", and indexed by integer indices. "Finite" here means having a compact support (not talking about the field in which the values are taken, except as a side note). This answer is about the difference in analysing continuous data $x(t)$, $t\in \mathbb{R}$, or discrete data $x[n]$, $n\in [1,\ldots,N]$. Similar problems arise in higher dimensions with $t\in \mathbb{R}^d$ and $n\in [1,\ldots,N]^d$, I will consider only $d\in\{1,2,3\}$.

Tools from harmonic analysis have been used for a while in signal (and image) processing : all sorts of integral transformations, such as Fourier, Radon, Hilbert transforms, and wavelet analysis. Such representations are generally invertible when considered on functions of continuous parameters (like a time $t$). However, the discretization of harmonic analysis transformations with a finite number of samples, which is mandatory for computer operations, while retaining nice properties (orthogonality, invertibility, symmetry) is often complicated.

A standard example is the theory of continuous wavelet transforms. A 1D signal $x(t)$ can be transformed to a 2D domain with a location ($b$) and scale ($a$) variable:

$$ X_{w}(a,b)={\frac {1}{|a|^{1/2}}}\int_{-\infty }^{\infty }x(t)\overline {\psi}\left({\frac {t-b}{a}}\right)\,dt$$ which admits an inversion $$x(t)=C_{\psi }^{{-1}}\int _{{-\infty }}^{{\infty }}\int _{{-\infty }}^{{\infty }}X_{w}(a,b){\frac {1}{|a|^{{1/2}}}}{\tilde \psi }\left({\frac {t-b}{a}}\right)\,db\ {\frac {da}{a^{2}}}$$ with a quite mild admissibility condition (SE.math) on $ \psi$ and $\tilde{\psi}$, which I do not detail here. In plain words, any couple of such functions (avoid pathological ones, still) vanishing in the Fourier domain at $0$, and decaying fast enough at $\infty$, will do the job. So one can use many smooth wiggling functions, with analytical formulas, like Gaussian derivatives, for instance the Mexican hat:

mexican hat wavelet

However, the building of discrete wavelets was difficult, apart from trivial examples in early works by A. Haar in 1910 and P. Franklin in 1928. It took some time before J.-O. Strömberg, Y. Meyer and I. Daubechies produced discrete orthogonal wavelet bases in the eighties. Daubechies wavelets of low order are notoriously not so regular, without analytical formulas, and asymmetric:

Daubechies 4 wavelet

This is for the transformation. But the signal is quantized on bits. It often takes values in $[0,\ldots,2^B-1]$ If we restrict the output to by quantified too, the problem is even more difficult.

Aside, dealing with (and defining) discrete objects on a digital grid is often more complicated than with their continuous counterparts. Think for instance about discrete arcs of circles (in 2D), drawn in "pixels", that do not scale as well as standard circles:

Discrete circle

Those topics are addressed in digital geometry, for which algorithms are developed to draw discrete lines (Bresenham algorithm), discrete planes (in 3D, see below, from 3D Noisy Discrete Objects: Segmentation and Application to Smoothing, 2009, Provot et al.), with less or more thickness, or conics for instance.

Fuzzy discrete plane and bounding planes

Laurent Duval
  • 6,164
  • 1
  • 18
  • 47
  • 1
    It's not obvious to me that your discrete cases are also finite. Are they? – user541686 Aug 16 '16 at 05:34
  • @Mehrdad As for discrete signals, I would be positive. As for digital geometry, positive for bounded objects like the cercle. Do you believe I should add a mention for discrete lines and planes? – Laurent Duval Aug 16 '16 at 06:02
  • Ah sorry, I just meant the signals. It'd be nice if you could plot what signal's wavelet transform you were taking, so that we could see that it's discrete and finite. – user541686 Aug 16 '16 at 06:32
  • @Mehrdad Discrete finite support wavelets boil down to the convolution of two finite support series. As I cannot produce a continuous image, would a finite signal, with a discrete wavelet plot suit your needs? – Laurent Duval Sep 02 '16 at 19:14

Geometric series finite vs. infinite case for $|q|<1$ \begin{align*} \sum_{j=0}^Nq^j=\frac{1-q^{N+1}}{1-q}\qquad\text{vs.}\qquad\sum_{j=0}^\infty q^j=\frac{1}{1-q} \end{align*}

  • 94,265
  • 6
  • 88
  • 219
  • 1
    What's the underlying "problem" here? Is it using the formula to compute the answer, or is it deriving the formula? The former doesn't seem like a "problem" (as I understand the word) and I'm not convinced that the latter is easier in the infinite case. (+1, though, because this is the first thing that came to my mind, too...) – Benjamin Dickman Aug 11 '16 at 19:28
  • @BenjaminDickman: Thanks! It was also my first idea. :-) Underlying problem: Find a closed expression. ... and at least the outcome is simpler. – epi163sqrt Aug 11 '16 at 19:38
  • 2
    The "outcome is simpler" -- yeah, I was just thinking that to "find a closed expression" -- at least the way I have trad'ly done it -- involves first working with the finite case, and then getting to the infinite case by evaluating a limit (i.e., the finite case along with an extra step, hence arguably not "easier"). – Benjamin Dickman Aug 11 '16 at 19:53

The linear-quadratic control problem, that is, finding a function $u(t)$ that minimizes $\int_0^{t_1} x^TQx + u^TRu\,\mathrm{d} t$ under the constraint $\dot{x} = Ax + Bu$.

In the infinite-horizon case $t_1=\infty$, the solution is obtained by solving an algebraic Riccati equation (ARE). In the finite-horizon case $0<t_1 \in \mathbb{R}$, it is obtained by solving a differential Riccati equation (DRE), which is more difficult to solve; for instance, some numerical methods for DREs involve solving an ARE at each time step.

Federico Poloni
  • 3,186
  • 19
  • 31

Let's take a look at Goodstein sequences $G_k(n)$ and their interesting behaviour. We associate to each natural $n$ a sequence $(G_k(n))_{k\geq 0}$.

  • Step $0$: We start by representing $n$ in base $2$ notation and also all powers and powers of powers, so that $n$ is written without using any number greater than $2$.

    Taking $n=266$ we obtain \begin{align*} G_0(266)=2^8+2^3+2^1=2^{2^{2+1}}+2^{2+1}+2^1 \end{align*}

  • Step $k\geq 1$: Replace each occurrence of base $k+1$ in $G_{k-1}(n)$ by $k+2$, represent it in base $k+2$ notation without using any number greater than $k+2$ and subtract $1$

    We obtain \begin{align*} G_1(266)&=3^{3^{3+1}}+3^{3+1}+3^1-1=3^{3^{3+1}}+3^{3+1}+2\simeq 10^{38}\\ G_2(266)&=4^{4^{4+1}}+4^{4+1}+1\simeq 10^{616}\\ G_3(266)&=5^{5^{5+1}}+5^{5+1}\simeq 10^{10921}\\ G_4(266)&=6^{6^{6+1}}+6^{6+1}-1\\ &=6^{6^{6+1}} +5\cdot6^{6}+5\cdot6^{5}+5\cdot6^{4}+5\cdot6^{3}+5\cdot6^{2}+5\cdot6^{1}+5\simeq 10^{217832}\\ G_5(266)&=7^{7^{7+1}} +5\cdot7^{7}+5\cdot7^{5}+5\cdot7^{4}+5\cdot7^{3}+5\cdot7^{2}+5\cdot7^{1}+4\simeq 10^{4871822}\\ \end{align*}

Although this sequence is enormously increasing at the beginning, subtracting $1$ in each step is sufficient that this sequence and in fact each Goodstein sequence will eventually terminate with $0$ (admittedly with $k$ often rather large).

The interesting thing is to prove this behaviour can be done by going from finite to infinite. It is done by using transfinite ordinals and arguing with the well-ordering property. See this nice introductory presentation for more information.

  • 94,265
  • 6
  • 88
  • 219
  • 1
    I can't shake the feeling that "using transfinite ordinals" in this case is easier because we happen to already know a lot of things about transfinite ordinals, enough to establish the mapping between the two. So it qualifies. But out of interest, to someone who isn't already familiar with ordinals, is the proof any easier to follow than a direct proof that hereditary base-2 notation is well-ordered and that each step in the sequence results in a smaller value per this order? I suppose one can argue that since PA doesn't prove the result, in some formal sense the "finite case" is *impossible*! – Steve Jessop Aug 09 '16 at 10:02
  • @SteveJessop: It seems that in this case a proof via transfinite ordinals is in fact *simpler*, since it's easy to associate a *strictly decreasing* sequence with the help of transfinite ordinals as it is indicated in the referenced page. – epi163sqrt Aug 09 '16 at 10:08
  • But just staring at your example, ignore the actual $n$ in the hereditary base-n notation, it's easy to see that the "biggest term" never gets any bigger in our chosen well-order (meaning more complex), only bigger in the usual ordering of the integers, while the -1 repeatedly chips away at the smallest term. This observation might be giving me false hope of a "high-school" proof. I've never attempted it, so I don't know at what point down that road you hit the "tricky bit" that PA doesn't prove, and have to pull in the Axiom of Foundation or whatever. – Steve Jessop Aug 09 '16 at 10:16
  • Since the "sequence" variable is actually the $k$, don't you want to write $G_n(k)$ instead of $G_k(n)$ (in which case the sequence is $G_n$ instead of $(G_k(n))_{k\ge 0}$)? In your example you are listing the values of $G_{266}$. – Mario Carneiro Aug 09 '16 at 17:52
  • @MarioCarneiro: As long as the representation is consistent both notations are fine. I simply took it from this *[Wolfram page](http://mathworld.wolfram.com/GoodsteinSequence.html)*. – epi163sqrt Aug 09 '16 at 18:11
  • @Markus Ah, okay. I wasn't sure where you got your notation; wikipedia seems to prefer $G(n)(k)$ (which also admits the partial application notation). I checked the original [Kirby & Paris paper](http://www.cs.tau.ac.il/~nachumd/term/Kirbyparis.pdf), and they use $G_n(m)$ but for a different function (the base-replacement function); the sequence itself is just denoted $n_k$, i.e. $266_0$, $266_1$, etc., defined by $n_k=G_{k+1}(n_{k-1})$. – Mario Carneiro Aug 09 '16 at 18:22
  • @MarioCarneiro: Nice ref. Thanks! – epi163sqrt Aug 09 '16 at 18:32
  • 4
    It has been shown that the convergence of all Goodstein sequences to $0$ cannot be proven in PA (Peano Arithmetic) unless $\neg Con (PA)$ because it implies $con [PA].$ So without a stronger axiom (e.g. Infinity) we are stuck. – DanielWainfleet Aug 10 '16 at 01:38
  • @ Steve Jessop. See my previous comment about PA and Goodstein. Sorry I can't provide a reference. – DanielWainfleet Aug 10 '16 at 02:34
  • I gave some further introductory details on Goodstein sequences in [this answer.](http://math.stackexchange.com/a/625404/242) – Bill Dubuque Aug 10 '16 at 02:46
  • @BillDubuque: Very nice answer! (+1) Thanks for pointing to it! – epi163sqrt Aug 10 '16 at 06:57

Kolmogorov's Zero-One Law

Informally, in certain probabilities spaces, events which do not depend on any finite amount of information must have probability zero or one.

For example, suppose you flip a coin over and over again infinitely many times. Any event $E$ which is independent from the outcome of any finite number of coin flips must have probability $0$ or $1$. Examples include

  • $E =$ the event that 100 consecutive heads are flipped infinitely times (prob $=1$)
  • $E =$ the event that $\limsup_{n \to \infty} H_n - T_n = \infty$, where $H_n$ ($T_n$ resp.) is the number of heads (tails resp.) in first $n$ tosses. (prob $=1$)
  • $E =$ the event that $\lim_{n \to \infty} H_n - T_n = \infty$. (prob $=0$)
  • $E =$ the event that $\sum_{k=1}^{\infty} \frac{\sigma(k)}{k}$ converges, where $\sigma(k) = 1$ if $k$'th toss is heads and $\sigma(k) = -1$ if $k$'th toss is tails. (prob $=1$)
D Poole
  • 2,682
  • 9
  • 21

The Game of Chomp:

Although a winning strategy is guaranteed to exist, finding a winning strategy for a $a \times b$ is an unsolved problem. However, finding a winning strategy for a $a \times \omega$ ($a\geq3$) board is not that hard.

First note that you lose for a $2 \times \omega$ board, because the only losing position is a board of the form of $a+1$ chocolates and then $a$ chocolates. Your opponent can always make sure that you get such board. Now note that you can always reduce a $a \times \omega$ ($a\geq3$) board to a $2 \times \omega$ board.

  • 24,314
  • 6
  • 55
  • 112
  • 4
    You may want to link to, or explain the Game of Chomp. I certainly didn't know it and I think I'm not alone in that. – SQB Aug 12 '16 at 08:58
  • This result is cute. And it also solves $\infty \times \infty$ board when $a = \infty$ :) – 6005 Sep 02 '16 at 18:26
  • To be picky, but think it should be $a \times \omega$ instead of $a \times \infty$. It has to be a one-directional infinity; the board is implicitly well-ordered in each direction. – 6005 Sep 02 '16 at 18:30
  • @6005 You are certainly right, I tried to explain it too many times to people who didn't know ordinal numbers. – wythagoras Sep 02 '16 at 18:56

The asymptotic distribution of primes: nth prime ≈ n ln n as n → ∞

Try finding a formula for the nth prime in the non-asymptotic case!

  • 12,494
  • 15
  • 48
  • 93
  • 1
    I considered posting an answer along similar lines to this, but I was only thinking in terms of *counting* primes, i.e., evaluating $\pi(n).$ The point is that we know what $\lim_{n\to\infty}\pi(n)$ is by very elementary arguments. But as it turns out, there are practical combinatorial ways of computing $\pi(n)$ even for large values of $n.$ Kudos for finding the interpretation I missed. – Will R Aug 11 '16 at 03:22
  • @WillR: Thanks! :) – user541686 Aug 11 '16 at 05:35

Almost the entirety of calculus and all applications thereof

This needs a bit of explaining: The notion of infinity is an artificial one. There is nothing in reality that we know to be truly infinite¹. So why do we bother with infinity? Because it’s easier to deal with than “very big numbers”! And working with infinity is what calculus (née infinitesimal calculus) is about. Take any word problem in a high-school calculus text book and try to formulate and then solve it without any notion of infinity, continuity, and so on – it will be a pain in the arse.

To be even more concrete, let’s look at the classical setup of exponential growth (e.g., for populations, interest, and so on): You have a quantity $x$ whose growth is proportionally to its current amount and want to describe its long term behaviour. Employing the notion of infinity, you can describe and solve this with the differential equation $\dot{x} = cx$. Without calculus you have to find some finite time $\mathrm{d}t$ and then can state that $x(t+\mathrm{d}t) = (1+c)·x(t)$. (In some cases, e.g., interest applied in daily steps, this may even be more realistic.) So far, there is no big difference in terms of easiness.

But now try to quickly calculate a good approximation of $x(1000 · \mathrm{d}t)$. Without employing the concept of infinity, you probably end up with performing thousands of multiplications. With infinity, you can approximate the behaviour with the exponential function, whose value you can approximate in more efficient ways. Of course, you can formulate some equivalent or almost equivalent statements for obtaining this value without infinity. But neither would the statement itself nor deriving it would be what I would consider easy.

Some of the existing answers provide further examples for this: 1, 2, 3, 4.

¹ or truly infinitesimally small, continuous, etc.

  • 5,210
  • 1
  • 22
  • 43

Another example of an improper integral being "easier" (in this case, by having an elementary result) than a proper one would be integrating along the entire real axis by the residue theorem and Jordan's lemma or estimation lemma

For example, consider: $$f = \frac{e^{i x}}{x^2 + 1}$$ $$\int_{-a}^a f \:\mathrm{d}x$$

For the infinite case ($a \to \infty$), we consider a semicircle with its base along the real axis and curved section in the upper positive imaginary axis, and then take the limit of its radius approaching infinity.

With the residue theorem and Jordan's lemma, we have: $$ \begin{align*} \int_{-\infty}^\infty f \:\mathrm{d}x &= 2 \pi i \operatorname{Res}(f, i) \\ &= 2 \pi i \frac{e^{-1}}{2 i} \\ &= \frac{\pi}{e} \end{align*} $$

For the finite case ($a \in \mathbb{R}^+$), there is no elementary solution, so we have to solve in terms of the exponential integral: $$ \begin{align*} \int f \:\mathrm{d}x &= \frac{i}{2} \int \frac{e^{i x}}{x + i} - \frac{e^{i x}}{x + i} \:\mathrm{d}x \\ &= \frac{i}{2} \int e \frac{e^{i (x + i)}}{x + i} - e^{-1} \frac{e^{i (x - i)}}{x} \:\mathrm{d}x \\ &= \frac{i}{2 e} \left(e^2 \operatorname{Ei}(i x - 1) - \operatorname{Ei}(i x + 1)\right) \end{align*} $$

  • 355
  • 1
  • 3
  • 10

Notation: $$(i). \text { $|A|$ is the cardinal of the set $A.$ }$$ $$(ii). [A]^2 =\{ \;\{x,y\}\subset A: x\ne y\}.$$ $$(iii). f''A=\{f(x):x\in A\}$$ for any function $f$ and any set $A\subset dom(f).$ $$(iv). \text { $\omega$ is the first infinite cardinal.}$$

For finite or infinite cardinals $a, b$ the arrow notation $$(a)\to (b)^2_2$$ means that if $|A|=a$ then for any function $f:[A]^2\to \{0,1\}$ there exists $B\subset A$ with $|B|=b$ and $|f''[B]^2|=1.$.... Observe that if $a'\geq a$ and $b'\leq b,$ then $(a)\to (b)^2_2$ implies $(a')\to (b')^2_2.$

$$\text {Theorem 1. For any $m\in \mathbb N$ there exists $n\in \mathbb N$ such that $(n)\to (m)^2_2.$}$$ Theorem 1 implies that for all sufficiently large $n ,$ any two-coloring of the edges of the complete graph on $n$ points will have a complete sub-graph on $m $ points with edges all the same color.

$$\text {Theorem 2. $(\omega)\to (\omega)^2_2.$}$$ This is fairly easy to prove. Now there is a result in Model Theory: If a theory has arbitrarily large finite models then it has an infinite model. We can use this, and Theorem 2, to prove Theorem 1 by contradiction: If for some $m\in \mathbb N$ we have $\neg [(n)\to (m)^2_2]$ for all $n \in \mathbb N$ then there exists infinite $a$ such that $\neg [(a)\to (m)^2_2],$ contradicting Theorem 2.

Prof. William Weiss once told me that there is a long finitistic proof of Theorem 1, but that the infinite method "uses some technology."

  • 53,442
  • 4
  • 26
  • 68
  • 2
    The finitistic proof of (1) that I know isn't very long, although it gives a bad upper bound. Let $e(1)=1$, $e(n+1)=2^{e(n)}+1$, and suppose $f$ is a 2-coloring of $\{1, 2, . . . , e(2n)\}$. We define a sequence of pairs of finite sets $p_i$ as follows: $p_0=(\emptyset, \{1, 2, 3, . . . , e(2n)\})$, and if $p_i=(\{x_1, . . . , x_i\}, F_i)$ then $p_{i+1}=(\{x_1, . . . , x_i, y\}, G)$, where $y$ is the least element of $F_i$ and $G$ is either the set of points in $F_i$ which get colored $0$ with $y$, or the complement in $F_i$ of that set, whichever's larger. (cont'd) – Noah Schweber Aug 11 '16 at 00:05
  • 2
    Now by choice of $e(2n)$, we can define $p_i$ for every $i\le n$. Let $C$ be the left coordinate of $p_n$, and note by induction that for any distinct $a, b, c\in C$ we have $f(\{a, b\})=f(\{a, c\})$. Let $F(a)=f(\{a, b\})$ for $a\in C$ (note that it doesn't matter what we use for $b$ here). We can now thin $C$ to get a homogeneous set of size $\ge n$, by pigeonhole: either $F^{-1}(0)$ or $F^{-1}(1)$ has size $\ge n$, and each is homogeneous. – Noah Schweber Aug 11 '16 at 00:07
  • @NoahSchweber. In retrospect I don't think that W. Weiss said "long". – DanielWainfleet Aug 17 '18 at 17:30

Without the axiom of infinity, it's very hard (impossible, I think) to prove that the TREE function is well-defined, even though it is a theorem about finite objects. This is taking a fairly liberal view of what "the finite case" means!

Patrick Stevens
  • 34,379
  • 5
  • 38
  • 88

I am not sure if this qualifies, but I just want to share it here in case you're interested. I got it from a blog post by Mr. Greg Muller : The Axiom of Choice is Wrong

Let me write it down here the main idea of the post.

$100$ prisoners are placed in a line, facing forward so they can see everyone in front of them in line. The warden will place either a black or white hat on each prisoner’s head, and then starting from the back of the line, he will ask each prisoner what the color of his own hat is (ie, he first asks the person who can see all other prisoners). Any prisoner who is correct may go free. Every prisoner can hear everyone else’s guesses and whether or not they were right. If all the prisoners can agree on a strategy beforehand, what is the best strategy?

There is a strategy that ensures at least $99$ prisoners to go free. Let's consider a variance to that problem:

A countable infinite number of prisoners are placed on the natural numbers, facing in the positive direction (ie, everyone can see an infinite number of prisoners). Hats will be placed and each prisoner will be asked what his hat color is. However, to complicate things, prisoners cannot hear previous guesses or whether they were correct. In this new situation, what is the best strategy?

This sounds hopeless, right? However, there is a strategy that guarantee all but finitely many prisoners to go free! The paradoxical result stems from the fact that there're non-measurable sets (but some would also attribute it to the sigma-additive nature of probability).

Follows the link if you are interested. In the comment section there are also explanation/discussion of this perplexing phenomenon by many mathematician, Prof. Tao being one of them.

  • 13,919
  • 3
  • 26
  • 70
  • Sorry for being rude, but that article is bs. The author only says that there is this magical solution but doesn't tell what the solution really is. And to put it simply: if the prisoners cannot hear each other and cannot know if the previous ones were correct, then, for any natural numbers `n` and `m`, the `n`-th prisoner is precisely in the exact same situation as the `m`-th prisoner, which is: I see an infinite number of hats and have absolutely no other information whatsoever. Therefore, it naturally follows that is it impossible that after a finite number of prisoners, **all** will guess. – Bogdan Alexandru Aug 11 '16 at 11:15
  • 2
    @BogdanAlexandru Well, actually you must have missed something, because the article specifically says precisely what the strategy is... I suggest that you re-read it again, more carefully this time. You can come back here once you find it and if you still want to discuss about it. – BigbearZzz Aug 11 '16 at 13:52
  • 1
    @BogdanAlexandru By the way, you need to know what an equivalent class is and what the axiom of choice do in order to understand the solution to the problem. Yes, the solution is indeed paradoxical (or even problematic) but not for the reason you had given. – BigbearZzz Aug 11 '16 at 13:54
  • 1
    I did. So, aren't there an infinite number of equivalence classes then? Would a prisoner have to memorize an infinity of infinite sequences? And even if we go over this logistics problem, then how do you solve the paradox posed by my question above? My argument is still valid: for every `m` and `n`, there is absolutely no difference between the `m`-th and the `n`-th prisoner => they have the exact chance of guessing ==> they all have the exact same chance => they all have 50% chance (becaue the first one has 50%). – Bogdan Alexandru Aug 11 '16 at 20:37
  • "Would a prisoner have to memorize an infinity of infinite sequences?" Yes, but do note that mathematically speaking, the difficulty of memorizing infinitely many infinite sequences and memorizing just one such sequence is equal. To address your second point, indeed there is no different between and two prisoners if they guess randomly. However, the main idea of the strategy is that you __don't__ guess randomly (that's why we call it _a strategy_ in the first place). – BigbearZzz Aug 11 '16 at 20:52
  • Ok, I understand now that they aren't really guessing in the traditional sense. So what happens is this: I see an infinite sequence S(n), S(n+1), S(n+2)... Then I check will all my memorized sequences and I determine that this sequence is in class C, which contains all sequences that eventually converge to the sequence I see. At this point, I simply look at the sequence we have all chosen in class C, let's call that A, with the property A(k)=S(k) for k>=n. And now I look at A(n-1). That is my answer, correct? – Bogdan Alexandru Aug 11 '16 at 21:06
  • Yeah, that's very close, except some few points. Firstly, you don't know what the $n$ actually is, you only know that it exists. This is why you can't simply just look at $A(n-1)$. – BigbearZzz Aug 11 '16 at 21:12
  • One last comment: I still don't find a satisfactory counter-argument to this: a choice has been made when we all chose a sequence from the equivalence class I have just identified. However, whether or not the sequence I am currently in (and which is equivalent to that choice) matches the value at my current position in the sequence is at chance: 50% chance actually, determined by the choice we've made. So if all prisoners are equivalent and they all apply this logic, it would mean that each of them is still under the incidence of the 50% chance of having commonly chosen a sequence [...] – Bogdan Alexandru Aug 11 '16 at 21:18
  • [...] that matches the value at the current position with the sequence that they are actually in. – Bogdan Alexandru Aug 11 '16 at 21:19
  • And I keep wondering where did you get your figure $50\%$ from. When the result is either true or false, it does not follow that both result are equally likely. If you throw a die, you either get a $3$ or you don't but you don't believe that the chances are $50-50$ right? – BigbearZzz Aug 11 '16 at 21:27
  • My logic is this (giving an example here): if there are an infinity of sequences that eventually become all zeros, i.e. they turn into [,,,]`x`1000000..., then `x` can be 0 or 1 with 50%. – Bogdan Alexandru Aug 11 '16 at 21:58
  • @BogdanAlexandru You are basically assuming that of all the sequences possible, there is an agent (maybe he's God or some kind of an alien or a judge) who randomly select one sequence in which you are now standing in. This agent is unbiased in the sense that every sequence is equally likely. There are 2 main problem with this assumption. Firstly, the sequence is not randomly chosen by any agent, it is predetermined so element of randomness does not lies in the sequence, only on the choices of the prisoners. – BigbearZzz Aug 11 '16 at 22:35
  • @BogdanAlexandru (continued) Secondly, even if we allowed that the sequence is to be chosen randomly by a fair agent, there is no uniform distribution on the countably infinite set in which we are dealing with (the sequence). What does that mean? It simply means that the condition of fairness that you want to impose, seemingly very natural and shouldn't be hard to do, is actually impossible to satisfy. Any reasonable distribution will fail to meet your criteria. – BigbearZzz Aug 11 '16 at 22:37
  • (continued) For a simple thought experiment, you are invited to try and write a program or invent an algorithm that would randomly give me a positive integer. The only criterion that I want is that every numbers must be equally likely to be chosen. Think about this for a while an you'll gain some (deep?) insight into the problem and to why you "logic" that actually sounds reasonable actually fails to meet the rigorous standard of mathematics. – BigbearZzz Aug 11 '16 at 22:38
  • I understand that the randomness is on the prisoners choice, i.e., when they select a sequence from each class. But the rest of your argument seems too abstract and I fail to see how my argument gets refuted. Sorry, I'm not a matematician, I'm an engineer :) I guess I will investigate more on this, maybe read some more math, then I can get back to this discussion. Cheers! – Bogdan Alexandru Aug 12 '16 at 06:14
  • I just wanted to add that I kind of "see" where you're going, that is, I intuitively understand how this abstract solution works, but then every time I get back to my argument "they are all equal and independent so regardless of what they do they must have the same probability of matching their response with the actual value", I fail to prove it wrong. – Bogdan Alexandru Aug 12 '16 at 06:17
  • @BogdanAlexandru I'm glad that you're interested in math. I strongly advice that you to try to solve my exercise about randomly chosen an arbitrary integer. I'd really help you to understand all this mess. :) Cheers! – BigbearZzz Aug 12 '16 at 06:54
  • By the way, don't get discouraged if you don't understand this. I specifically chose this result because of its unintuitiveness. The explanation of the phenomenon lies at the bottom of the foundation of probability and, if you read the comments section, you'll find that some mathematicians actually came with explanations of why the result cannot be intuitively understood. – BigbearZzz Aug 12 '16 at 07:00
  • 1
    Regarding that thought experiment, what inputs am I allowed to use? E.g., can I say I randomly generate x in interval (0-1], then I round 1/x to the nearest integer? This solution is not good, I'm only asking for what do I have available. – Bogdan Alexandru Aug 12 '16 at 11:32
  • 1
    @BogdanAlexandru You are allowed to use any thing. I just want that every number is equally likely to be chosen. Your example fails because it gives me number $1$ if $\frac 23< x\le 1$ while the output is $2$ if $\frac 25 – BigbearZzz Aug 12 '16 at 12:43
  • I don't see a solution then... the probability function must be an horizontal line infinitely close to the x axis. It's derivative would be 0 so the function must be constant, but then the integral needs to be 1. So what is the lesson learned from here? – Bogdan Alexandru Aug 16 '16 at 11:20
  • @BogdanAlexandru The lesson from that is it doesn't exist! The implication is that while it is intuitively plausible to assert that the occurrence of any two sequence is equally likely, it actually leads to a contradiction! And that is precisely how your reasonable reasoning cannot be made rigorous, because if you try to do so, it will lead to a contradiction just like that one. – BigbearZzz Aug 16 '16 at 12:18
  • 2
    Cool! This actually gets me thinking. Thanks for the interesting chat and for the patience to prove me wrong on my naive assumption that "this article is bs", which I need to take back now :) – Bogdan Alexandru Aug 16 '16 at 12:26

Every single Machine Learning problem ever.

Optimal statistical inference is generally much easier in the case of infinite data than finite data.

  • 12,494
  • 15
  • 48
  • 93

I know this is trivial, but here's one: $x$ is finite and unknown. Determine whether $x < n$.

Easy when $n$ is infinite!

  • 334
  • 1
  • 3
  • 11

A Markov chain is characterized by a discrete probability distribution of the initial state of a system, 0, and a transition matrix P such that pij is the probability of going from state i to state j after any one transition. Under certain assumptions, the probability distribution of states after the first transition is 1 = 0P, after two transitions is 2 = 1P = 0P2, after n transitions, n = 0Pn.

By constrast, the steady-state (as n -> ∞) probability distribution of states, , must satisfy = P, and = 1, where is a column vector of 1's. This is an over-determined (by one) system of equations, but the upshot is that the steady-state probability distribution of states is obtained by inverting a matrix.

See, for example, the Markov chain Wiki for details.


Stability analysis of finite difference stencils

GKS stability requires that the two semi-infinite problems defined on $[a,\infty]$ and $[-\infty,b]$ must be stable in order for the finite problem defined on $[a,b]$ to be stable. Thus, you can't do the finite case without at least doing the infinite ones first.

  • 2,113
  • 8
  • 22

Burnside problem

The solution for Burnside problem in the following formulation:

For which positive integers $m$, $n$ is the free Burnside group $B(m, n)$ finite?

is not known for any $m$, $n$, but there are some results for infinite cases:

  • $B(m,n)$ is infinite for all odd $n > 665$
  • $B(m,n)$ is infinite for even $n > 2^{48}$ and $n$ divisible by $2^9$
  • $B(m,3)$, $B(m,4)$ and $B(m,6)$ are finite for all $m$

And the example for the question is $B(2,5)$, for which it is not known whether it is finite. $B(2,5)$ also has the smallest $m$ and $n$ for which the problem remains open.

  • 173
  • 1
  • 5

Finding Nash equilibria for $N$ players vs $N \to \infty$ players; see mean field games.

  • 3,887
  • 15
  • 28

The Strategy-stealing argument in game theory would seem to fit here. Thus we know some games (e.g hex) are a first player win, but that knowledge doesn't help determine the necessary strategy.

  • 766
  • 6
  • 12

In the spirit of D Poole's comment on the Zero-One Law, long-term probabilistic behavior is often a lot easier to deal with than short-term. A lot of elementary theorems in probability tell us about the asymptotic behavior of a sequence $(X_k)_{k \in \mathbb{N}}$ of random variables under suitable conditions. For instance, suppose I have a six-sided die, and I want to know the probability that in the first $600$ rolls, $1/3$ of the rolls were $4$. The formula for this value might be quite ugly when we want to look at this event after some specified number of rolls, but the strong law of large numbers tells us the probability of this happening over the long run (i.e. that $1/3$ of the rolls will be $4$) is $0$ (and further we can fairly easily provide more specific descriptions of the "size" of this outcome, e.g. Hausdorff dimension). So in general, though descriptions of the behavior of random variables might be ugly after finite time, we're often able to provide nicer descriptions of the long-term ("infinite time") behavior.

  • 7,909
  • 2
  • 15
  • 39

I don't believe nobody mentioned applications of the Central Limit Theorem (CLT).

For example calculating this: \begin{align} \int^{n+\sqrt[]{n}}_0 \frac{x^{n-1}}{(n-1)!}e^{-t}\,dt \end{align} is very easy when $n\to \infty$ using the CLT.

And yet another application of CLT itself is Wilk's Theorem very strong statement about the infinity case.

  • 8,438
  • 1
  • 11
  • 37