Could you provide a geometric explanation of the Taylor series expansion?

1It would be good if you could be more precise. What kind of geometric explanation? What is not intuitive that you would like cleared up? – Qiaochu Yuan Nov 08 '10 at 13:32

2See the graph @ http://en.wikipedia.org/wiki/Taylor_series. Also see http://en.wikipedia.org/wiki/Taylor_polynomial. Taylor's theorem is derived from the Mean Value Theorem – SandeepJ Nov 08 '10 at 14:21

It does not get much better than this: [3Blue1Brown's Taylor series video](https://www.youtube.com/watch?v=3d6DsjIBzJ4). – durdi Jun 16 '20 at 20:35

One can also get an intuition from the proof: https://math.stackexchange.com/a/492165/445105. We can approximate $f(x)$ by $f(a)$, which results in the error $\int_a^x f'(y)\,dy$, which we can bound by $\sup_{y\in [a,x]} f'(y)\,(x-a)$. If we instead approximate the $f'(y)$ inside the integral again, we get a better approximation and an $O((x-a)^2)$ error for bounded second derivatives, because integrating $(x-a)$ adds powers. – Felix B. Jul 13 '21 at 11:19
6 Answers
We know that the higher the degree of an equation, the more "turning points" it may have. For example, a parabola has one "turning point."
(A parabola has an equation of the form $y=ax^2 + bx +c$.)
A cubic of the form $y=ax^3 + bx^2 +cx +d$ can have up to two "turning points," though it may have fewer. In general, an equation of degree $n$ may have up to $n-1$ turning points.
(Here is the polynomial $f(x) = 2x^4 - x^3 - 3x^2 + 7x - 13$. It is degree 4 and it has the maximum number of turning points, $4-1=3$. But, keep in mind, some degree 4 polynomials have only one or two turning points. The degree gives us the MAXIMUM number: $n-1$.)
This is important because, if you want to use a polynomial to approximate a function, you will want to use a polynomial of high enough degree to match the "features" of the function. The Taylor series lets you do this with functions that are "infinitely differentiable," since it uses the derivatives of the function to approximate the function's behavior.
Here are Taylor polynomials of increasing degree and the sine curve. Notice how they are "wrapping around" the sine curve, giving an approximation that fits better and better over more of the curve as the degree of the Taylor polynomial increases.
(Source for this image: http://202.38.126.65/navigate/math/history/Mathematicians/Taylor.html)
Since the sine curve has so many turning points it is easy to see that to match all of the features of the sine curve we will need to take the limit of the $n^{th}$ degree Taylor polynomial as $n \rightarrow \infty$.*
That's the intuition behind the Taylor series. The higher the degree, the better the "fit." Why? Because higher-degree curves have more "turning points," so they can better match the shape of things like the sine function. (As long as the function we are approximating is differentiable.)
*Side note: A function may have only a few turning points and still need infinitely many terms of its Taylor series. Take the catenary, for example, which has only one turning point since it looks like a parabola. The Taylor series for the catenary never terminates: the derivatives of the catenary are hyperbolic sine and cosine functions, which never become identically zero.
But, even with the catenary, higher degree polynomials give a better approximation.
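The "wrapping around" of the Taylor polynomials can be checked numerically. This sketch (my own illustration, not from the answer) evaluates Taylor polynomials of $\sin$ about $0$ at a fixed point and shows the error shrinking as the degree grows:

```python
import math

def taylor_sin(x, degree):
    """Taylor polynomial of sin about 0, including terms up to the given degree."""
    total = 0.0
    k = 0
    while 2 * k + 1 <= degree:
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        k += 1
    return total

# Error at x = 2.0 for increasing degree: each extra term lets the
# polynomial "wrap around" more of the sine curve.
for degree in (1, 3, 5, 7, 9):
    err = abs(taylor_sin(2.0, degree) - math.sin(2.0))
    print(degree, err)
```

The errors decrease rapidly, which is the "better and better fit over more of the curve" in the plot the answer describes.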
there is only one clear geometric answer for this question ...and this is it...kudo – Sedumjoy Aug 08 '18 at 14:05
I'll give it a try: if you want to know where you will be after a time $x$ of driving a car, you can find out by separating the different components: position at the moment, speed, acceleration, jolt, and so on, and adding them all together.
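The car analogy translates directly into the Taylor terms of position as a function of time. A small sketch, with made-up numbers for the current position, speed, acceleration, and jolt:

```python
# Hypothetical drive: position 0 m, speed 20 m/s, acceleration 2 m/s^2,
# jolt 0.5 m/s^3.  Each term is (n-th derivative of position) * t^n / n!.
def predict_position(t, position=0.0, speed=20.0, acceleration=2.0, jolt=0.5):
    """Predicted position after time t, summing the kinematic Taylor terms."""
    return (position
            + speed * t                  # linear term: f'(0) * t
            + acceleration * t ** 2 / 2  # quadratic term: f''(0) * t^2 / 2!
            + jolt * t ** 3 / 6)         # cubic term: f'''(0) * t^3 / 3!

print(predict_position(3.0))  # 71.25
```

Each added component refines the prediction, exactly as each added Taylor term does.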
Think of a Taylor series not as one entity but as a sequence of approximations.
The first term gives a constant approximation: $f(x + h)$ is approximately $f(x).$
The first two terms give a linear approximation: $f(x + h)$ is approximately $f(x)$ plus a trend term, $h f'(x).$
The first three terms include a constant approximation, a linear trend, and a curvature term to account for the change in the linear trend: $f(x + h)$ is approximately $f(x) + h f'(x) + h^2 \dfrac{f''(x)}{2}.$
Next you add a term to account for the change in the curvature, etc.
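The sequence of approximations above can be checked numerically. Here is a sketch using $f = \exp$ at $x = 1$ with $h = 0.1$ (convenient because every derivative of $\exp$ is $\exp$, so each term is just $e^1 \cdot h^n/n!$):

```python
import math

x, h = 1.0, 0.1
fx = math.exp(x)  # exp is its own derivative, so f(x) = f'(x) = f''(x)

constant  = fx                            # f(x)
linear    = fx + h * fx                   # + h f'(x)
quadratic = fx + h * fx + h**2 * fx / 2   # + h^2 f''(x) / 2

exact = math.exp(x + h)
for approx in (constant, linear, quadratic):
    print(abs(approx - exact))
```

Each extra term shrinks the error, illustrating the "sequence of approximations" view.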
Predict global while computing local!
A Taylor expansion of a function $f$ around some value $x_0$ amounts to predicting the function at a neighboring value $x$ while knowing progressively more about the variation of $f$ at the point $x_0$.
First step, the easiest prediction: nothing changes, that is, $f(x) \approx f(x_0)$.
Second step: we know the first derivative, so we predict that the function is linear between $x_0$ and $x$: $f(x) \approx f(x_0) + (x-x_0)f'(x_0)$. See, everything is still local, as the derivative is evaluated at $x_0$.
The next steps generalize this prediction to higher derivatives. The different forms of the remainder give bounds on the error, or more knowledge about the residual.
Performing an $n$th-order Taylor expansion can be thought of as making the approximation that the function's $n$th derivative is constant.
Try it yourself: let $f$ be an $n$ times differentiable function whose $n$th derivative is constant, and suppose you know the values of $f^{(i)}(0)$ for $0\leq i\leq n$. By integrating repeatedly, you'll find that this uniquely determines $f$ and produces the formula for the Taylor expansion.
Intuitively, I find it plausible that neglecting higher order derivatives of a function shouldn't cause too large of an error. Taylor's theorem confirms this intuition.
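The "integrate repeatedly" exercise can be carried out mechanically. A sketch, with hypothetical initial values: start from a constant fourth derivative, integrate down to $f$ (fixing each antiderivative's value at $0$), and the resulting coefficients are exactly $f^{(i)}(0)/i!$:

```python
import math

# Polynomials are lists of coefficients, lowest power first.
def integrate(poly, value_at_zero):
    """Antiderivative of poly whose value at 0 is value_at_zero."""
    return [value_at_zero] + [c / (k + 1) for k, c in enumerate(poly)]

initial_values = [5.0, -1.0, 2.0, 4.0]  # hypothetical f(0), f'(0), f''(0), f'''(0)
poly = [7.0]                            # assume f''''(x) = 7, constant

for value in reversed(initial_values):  # recover f''', f'', f', then f
    poly = integrate(poly, value)

# Taylor's formula gives the same coefficients: f^(i)(0) / i!
taylor = [d / math.factorial(i) for i, d in enumerate(initial_values + [7.0])]
print(poly)
print(taylor)
```

The two coefficient lists agree term by term, which is the uniqueness claim in the answer.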
We are approximating a function by polynomials at a point. As a first approximation, we choose a polynomial whose value at that point is the same as the function's. In the second step, we make the first derivative equal too. In the third step, the second derivative is made equal, and so on...