Theorem: Suppose that $f : A \to \mathbb{R}$ where $A \subseteq \mathbb{R}$. If $f$ is differentiable at $x \in A$, then $f$ is continuous at $x$.

This theorem is equivalent (by the contrapositive) to the result that if $f$ is not continuous at $x \in A$ then $f$ is not differentiable at $x$.
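For reference, the forward implication has a standard one-line proof: if $f'(x)$ exists, then

$$\lim_{t \to 0}\bigl(f(x+t)-f(x)\bigr) = \lim_{t \to 0}\frac{f(x+t)-f(x)}{t}\cdot t = f'(x)\cdot 0 = 0,$$

so $f(x+t) \to f(x)$ as $t \to 0$, i.e. $f$ is continuous at $x$.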

Why, then, do the authors of almost every analysis book not take continuity of $f$ as a requirement in the definition of the derivative of $f$, when we (seemingly) end up with equivalent results?

For example, I don't see why the following wouldn't be a good definition of the derivative of a function:

Definition: Let $A \subseteq \mathbb{R}$, let $a \in \operatorname{Int}(A)$, and let $f : A \to \mathbb{R}$ be a function continuous at $a$. We define the derivative of $f$ at $a$ to be $$f'(a) = \lim_{t \to 0}\frac{f(a+t)-f(a)}{t}$$ provided the limit exists.

I know this is probably a pedagogical issue, but why not take this instead as the definition of the derivative of a function?
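As a quick numerical sketch of the point at issue (illustrative Python, not part of the original question; the functions and step sizes are chosen arbitrarily): when $f$ has a jump at $a$, the difference quotient in the proposed definition already diverges on its own, so a separate continuity hypothesis would never be consulted.

```python
# Numerical sketch: the difference quotient detects a discontinuity by itself.

def quotient(f, a, t):
    return (f(a + t) - f(a)) / t

f = lambda x: x * x                    # differentiable at a = 1
g = lambda x: 1.0 if x > 0 else 0.0    # jump discontinuity at a = 0

# Smooth case: the quotients settle down to f'(1) = 2.
for t in (1e-2, 1e-4, 1e-6):
    print(quotient(f, 1.0, t))         # -> approaches 2

# Jump case: right-hand quotients are (1 - 0)/t = 1/t, which diverges,
# so the limit in the definition fails with no continuity check needed.
for t in (1e-2, 1e-4, 1e-6):
    print(quotient(g, 0.0, t))         # -> diverges like 1/t
```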

    Good question. And why isn't "the sum of the interior angles is $180$ degrees" included in the definition of a triangle, and why isn't "the square of the hypotenuse is equal to the sum of the squares of the legs" part of the definition of a right triangle? – bof Jun 19 '18 at 23:34
    @bof Those definitions are clearly more unreasonable than the one suggested here. Definitions are not always optimized. I think it's more convenient to use the standard definition, but I think it is a legitimate question. – Jair Taylor Jun 20 '18 at 05:25
    @JairTaylor The only reason the OP has given for replacing the usual definition of differentiability with his definition is that there's a theorem to the effect that differentiability implies continuity. As my counterexamples show, that is not a good reason. If he thinks his proposal is more reasonable than my examples, then he should give some reason for thinking so. My comment was intended to get the OP to give us a better explanation of why he thinks the definition of differentiability should include continuity. – bof Jun 20 '18 at 05:38
    @Perturbative Your definition does not acknowledge the possibility that the function may be continuous but not differentiable. It's not enough to suppose the function is continuous, and then define the derivative to be the limit you wrote. What happens if _that_ limit doesn't exist? – Ari Brodsky Jun 20 '18 at 07:31
    You can't define the derivative that way, because it doesn't always exist. All differentiable functions are continuous, but all continuous functions are not differentiable. See https://en.wikipedia.org/wiki/Weierstrass_function for a function that is continuous *everywhere* on the real line, but differentiable *nowhere*. The theorem "if $f$ is not continuous at $a$ it is not differentiable at $a$" is true, but too weak to be of much use. – alephzero Jun 20 '18 at 10:18
  • Might help to define the Int (interior) function in the question. – Daniel R. Collins Jun 21 '18 at 13:42
  • That hypothesis is not needed at all since if the derivative exists, it follows as a consequence. – Allawonder Jun 22 '18 at 09:58
  • There exist lots of useful functions which are not differentiable but continuous. – mathreadler Jun 22 '18 at 19:01
  • @alephzero But not all continuous functions are differentiable. – Horus Jul 01 '18 at 01:29
  • Occasionally, the easiest way to prove that a function, $f(x)$ is continuous at $a$ is to show that it is differentiable at $a$. – Steven Alexis Gregory Feb 04 '21 at 02:50

11 Answers


Because that suggests that there might be functions which are discontinuous at $a$ for which it is still true that the limit$$\lim_{t\to0}\frac{f(a+t)-f(a)}t$$exists. Besides, why add a condition when it always holds anyway?

José Carlos Santos
    This is my favorite answer because it is deeper than just saying "the piece of info is redundant or superfluous" – WetlabStudent Jun 20 '18 at 13:25
    @MHH, José Carlos Santos: You know, the thing is, there very well might exist something similar, so this is actually a bad reason. For a scalar function you could define the symmetric limit $\lim_{t\to0}\frac{f(a+t)-f(a-t)}{2t}$ as the derivative and suddenly $f(a)$ doesn't need to exist anymore. – user541686 Jun 23 '18 at 21:10
    @Mehrdad but that definition would be a bad definition, for example, it would say, |x| is differentiable at 0. Remember that the def of a limit requires both left and right-hand limits to agree, so Jose's def is the better choice and covers what I think you are getting at. – WetlabStudent Jun 24 '18 at 23:17
    @MHH: Personally I wouldn't have a problem with saying the derivative of absolute value at zero is zero :P but if you don't like that, I suppose you could say the derivative is $\lim_{\substack{t\to 0 \\ u\to 0}}\frac{f(a+t)-f(a-u)}{t + u}$ whenever it uniquely exists. My point here isn't to debate the merits of the particular definition though -- my point is that it doesn't seem like a completely wild idea to have a definition that doesn't involve the value of the function at that point (even if there's a better definition than the one I'm proposing). – user541686 Jun 24 '18 at 23:27
  • @AmanKushwaha, you are right, I made a mistake: I deleted my previous comment, here is the correct one: I think you say this because, drawing the graph of the function, it has the same slope at $f(1)$ and $\lim_{x\to1^-}f(x)$. However, by definition $\lim_{t\to 0^-}\frac{f(1+t)-f(1)}{t}=\lim_{t\to0^-}\frac{2-2(e^{-t}-e^{-1})}{t}\approx \lim_{t\to0^-}\frac{2}{t}+2e^{-1}=-\infty$, while $\lim_{t\to0^+}\frac{f(1+t)-f(1)}{t}=2e^{-1}$. Indeed, $C^1(\mathbb{R})\subsetneq C^0(\mathbb{R})$. – ecrin Dec 04 '21 at 15:46
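The symmetric quotient floated in the comments above is easy to check numerically (a throwaway Python sketch; the function and step sizes are only for illustration): for $f(x)=|x|$ the symmetric quotient at $0$ is identically zero even though the one-sided ordinary quotients disagree, which is precisely the usual objection to adopting it as the definition.

```python
# Compare ordinary and symmetric difference quotients of f(x) = |x| at 0.

def ordinary(f, a, t):
    return (f(a + t) - f(a)) / t

def symmetric(f, a, t):
    return (f(a + t) - f(a - t)) / (2 * t)

f = abs

for t in (1e-1, 1e-3, 1e-5):
    # Symmetric quotient: (|t| - |-t|) / (2t) = 0 for every t != 0,
    # so the symmetric "derivative" exists and equals 0 ...
    print(symmetric(f, 0.0, t))                        # -> 0.0
    # ... even though the ordinary quotient has no limit:
    # +1 from the right, -1 from the left.
    print(ordinary(f, 0.0, t), ordinary(f, 0.0, -t))   # -> 1.0 -1.0
```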

Definitions tend to be minimalistic, in the sense that they don't include unnecessary/redundant information that can be derived as a consequence.

Same reason why, for example, an equilateral triangle is defined as having all sides equal, rather than having all sides and all angles equal.

    To elaborate a bit more on that - every single definition, primitive notion, axiom and theorem, especially combined together carry plenty of resulting properties. If you start listing all those properties you'll never really get to the actual definition and it will obfuscate rather than clarify the definition. If someone decides to include the continuity into the definition they would still require to prove that there are no noncontinuous differentiable functions, so the amount of work to be done remains the same. That's why the definitions are as simple as possible. – Ister Jun 20 '18 at 08:46
    To be fair, there *are* examples where many people do this. For example, most people define groups with inverses instead of right-inverses. So we do not seek a logical minimum, but a conceptual minimum. – Phira Jun 20 '18 at 08:50
  • @Phira Good point, though even that is not universal. See for example [this answer](https://math.stackexchange.com/a/65261) which references a number of authors who define a group using the one-sided inverse and identity properties. – dxiv Jun 20 '18 at 15:47
    @dxiv If it were universal, I would not know about the example. Also, I saw a horrible, horrible definition of a group in a category theory book that made me not look at category theory for several years. – Phira Jun 20 '18 at 16:22
  • @Phira: You mean a locally small category with one object in which every morphism is an isomorphism? – tomasz Jun 21 '18 at 11:31
  • This answer might be improved by explaining *why* definitions tend to be minimalistic. (I believe the answer is because they didn't use to be, but mathematicians refine concepts, and one way they do so is make it more minimalistic). – Yakk Jun 21 '18 at 17:43
    @Yakk Minimalism of definition is well-advised in general, but it's ultimately a matter of style, preferences and even convenience. It sounds more "natural" for example to define a rectangle as a quadrilateral with all angles being right angles, rather than a quadrilateral with *three* right angles, though the definitions are of course equivalent in the euclidean plane. See [Superfluous definitions](https://mathoverflow.net/questions/54933/superfluous-definitions) for more of the same. – dxiv Jun 22 '18 at 00:58
    @Yakk Also, it's not anything recent. Quoting from Kant's 18th century [Critique of Pure Reason](https://www.gutenberg.org/files/4280/4280-h/4280-h.htm): "*the common definition of a circle - that it is a curved line, every point in which is equally distant from another point called the centre - is faulty, from the fact that the determination indicated by the word curved is superfluous. For there ought to be a particular theorem, which may be easily proved from the definition, to the effect that every line, which has all its points at equal distances from another point, must be a curved line*" – dxiv Jun 22 '18 at 01:02

Because then you would need to check continuity, for no good reason, every time you want to check for differentiability. Besides, including it gives the wrong impression that continuity is a necessary extra hypothesis.

    In the alternative pedagogy, you would subsequently prove the theorem "if the limit exists, then the function is differentiable". –  Jun 20 '18 at 03:56
    @Hurkyl In fact, I think that in itself might be an answer. Imagine ourselves in our alternate universe where we have included continuity in the definition of differentiability, plus proved a theorem that where the limit exists the function is continuous. Then we ask the analogous question: why did we include continuity in the definition, when we can prove it's superfluous? Seems tougher to motivate than the current status quo to me -- but then maybe that's a failure of my imagination. – Daniel Wagner Jun 23 '18 at 01:22

One possible reason is that the relationship between differentiability and continuity is more subtle in multivariable calculus.

Consider these definitions:

Let $A \subseteq \mathbb{R}^2$ and $f : A \to \mathbb{R}$ be a function. Let $a \in \operatorname{Int} A$. We define the directional derivative of $f$ at $a$ along the unit vector $v \in \mathbb{R}^2$ like this:

$$\partial_vf(a) = \lim_{h \to 0} \frac{f(a+hv) - f(a)}{h}$$

Furthermore, we say that $f$ is differentiable at $a$ if there exists a linear map $L :\mathbb{R}^2 \to \mathbb{R}$ such that

$$\lim_{h\to 0} \frac{\left|f(a+h) - f(a) - Lh\right|}{\|h\|} = 0$$ Note that $h \in \mathbb{R}^2$ here.

It can be shown that if $f$ is differentiable at $a$, then $L$ is unique and the directional derivatives exist along every unit vector $v \in \mathbb{R}^2$, with $\partial_vf(a) = Lv$. Differentiability at $a$ also implies that $f$ is continuous at $a$.

However, the converse is not true: even if $f$ possesses directional derivatives along all unit vectors at $a$, it need not be continuous at $a$ (let alone differentiable):

Consider $f : \mathbb{R}^2 \to \mathbb{R}$ given by

$$f(x,y) = \begin{cases} 1, & \text{if $0 < y < x^2$} \\ 0, & \text{otherwise} \end{cases}$$

All directional derivatives at $(0,0)$ exist and are equal to $0$, but the function fails to be continuous at $(0,0)$.
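A quick numerical check of this example (an illustrative Python sketch only; the step sizes and sample directions are arbitrary): along every line through the origin the difference quotient vanishes for small $h$, yet the function takes the value $1$ arbitrarily close to $(0,0)$ along the parabola $y = x^2/2$.

```python
import math

# f is 1 strictly between the x-axis and the parabola y = x^2, else 0
# (the example from the answer above).
def f(x, y):
    return 1.0 if 0 < y < x * x else 0.0

# Along any line through the origin, f vanishes near (0, 0), so every
# directional difference quotient (f(h*v) - f(0, 0)) / h is exactly zero.
for k in range(8):
    theta = k * math.pi / 8
    vx, vy = math.cos(theta), math.sin(theta)
    for h in (1e-3, -1e-3):
        q = (f(h * vx, h * vy) - f(0.0, 0.0)) / h
        assert q == 0.0   # the quotient is identically zero near 0

# Yet f is not continuous at the origin: the parabola y = x^2 / 2 runs
# inside the region where f = 1, arbitrarily close to (0, 0).
print(f(1e-6, 0.5e-12), "vs", f(0.0, 0.0))   # -> 1.0 vs 0.0
```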

  • Good example, but wouldn't this actually be somewhat of a motivation _for_ always requiring continuity, so that directional differentiability would be more consistent with differentiability? – leftaroundabout Jun 21 '18 at 10:38
    @leftaroundabout I just meant that it makes sense to consider directional derivatives at $a$ even if the function is not continuous at $a$. We need to have a stronger notion of differentiability for it to imply continuity. – mechanodroid Jun 21 '18 at 10:52

Another reason is that, later on when you want to generalize, you might have to explicitly remove that requirement... why risk that when you never needed to have it in the first place?

(Example: the derivative of the unit step is the Dirac delta, neither of which is continuous at zero.)


With the standard definitions, it is an important theorem that differentiable implies continuous. As a theorem, it has real content: if a function can be locally approximated (in a way that can be made precise) at a point by a linear function then it is continuous at that point. With your proposed definition, differentiable implies continuous is a boring tautology. Of course, you could still formulate the non-boring version of the theorem so that it uses the revised terminology, but the result would be somewhat cumbersome. The standard definition allows you to elegantly phrase an important theorem. Your proposed revision obscures that elegance for no real gain.

John Coleman
  • I like this answer and have upvoted it. However, I feel like this answer could've been made verbatim had our standard definition of differentiability included the continuity requirement, and the OP's question instead asked why we don't just define differentiability without requiring continuity. That is, I think the appeal to "elegance" works because we are used to the way things are defined. – Theoretical Economist Jun 21 '18 at 00:14
  • @TheoreticalEconomist I see what you are saying but don't think that it is as symmetric as you suggest. I can't think of a natural theorem whose *statement* would be more concise under the proposed definition. On the other hand the *proof* of some theorems would be shorter. For example, the proof of the product rule typically involves an explicit appeal to differentiable implies continuous, so that appeal could simply be dropped. I don't think that the (marginally) shorter proofs offset the loss of expressive concision, but I concede that I might feel differently if I was trained differently. – John Coleman Jun 21 '18 at 11:12

I imagine that a large part of this is just tradition: that's how someone in the past wrote the definition, and so people continue to write it that way.

I imagine the rest boils down to the issue of teaching introductory calculus to a class of mathematically unsophisticated students. First teaching them that a continuity condition is part of the definition of "derivative" and later teaching them that the limit existing is sufficient on its own is going to lead to students getting confused and frustrated. Two specific negative things I imagine this will introduce are:

  • It's already difficult to get students to pay attention to all of the hypotheses of a theorem or definition. Giving them an example of "here's a hypothesis... whoops, it doesn't matter" in the very basics is likely to just reinforce that difficulty.
  • Some students will latch tightly around "the continuity condition is part of the definition", and will continue to do a lot of wasted work checking the condition, even after you teach them the limit existing is sufficient. They will also be suspicious of work by others (which includes the teacher and the textbook!) that doesn't check this condition every time. Or they will become disillusioned and turn into another example of the previous bullet point.
  • Not really. Definitions are built for the purposes of mathematical theories, not for teaching itself. There are plenty of definitions that will never be taught anywhere (unless someone studies the specific field themselves) but follow the same principle of keeping to the minimum set of conditions. – Ister Jun 20 '18 at 09:46
    Picking which theorems of the theory should be called "definitions" when writing an expository textbook is *entirely* about pedagogy. And, in fact, some definition styles are pointedly *not* minimal ; for example, one popular style of definition is to first prove a theorem: "The following conditions are logically equivalent" and then to define some object to be something "satisfying any (and thus all) of the conditions of the previous theorem". –  Jun 20 '18 at 13:04
  • @Hurkyl I would say the style you describe *is* minimal, in the sense of giving the weakest possible constraint. Adding alternative conditions is going in the opposite direction to adding additional requirements. – Especially Lime Jun 21 '18 at 09:32

To add another perspective: this rewording of the definition has a kind of circularity to it. We know that differentiable functions must be continuous, so we define the derivative to only be in terms of continuous functions. But then, the fact that differentiable functions are continuous is by definition, while it is being used to justify that very definition. The only reason we were able to start this cycle of reasoning is because we know that, using the standard definition without the continuity requirement, we can prove that differentiable functions are continuous. Thus, we should stick with that definition to avoid such a situation.

Alex Jones
    Using the alternative definition if you had a universe that consisted of nothing but a single isolated infinitesimal point and wanted to do differential calculus "by the book" there you would first have to ponder the philosophical question of whether that point did or did not contain the "nature of continuousness." With the original definition I believe you could at least do a (pretty uninteresting) variant of differential calculus via a cookbook set of steps without having to ponder philosophy. – Bitrex Jun 22 '18 at 17:00

Beyond pedagogy, there is a reason to avoid this restriction. It turns out that continuity is not strictly required, and there are definitions of the derivative that apply to non-continuous functions. Instead of taking the definition of the derivative to be the difference quotient you provided, you can use the fundamental theorem of calculus to get an alternate definition. Specifically, if $F(x) = \int_{-\infty}^{x}f(t)\,dt$, then $f$ is a derivative of $F$, and this does not require strict continuity. This definition amounts to "the stuff that, when integrated, yields big $F$".

There are alternative versions of the derivative that handle more degenerate cases: e.g. http://mathworld.wolfram.com/Radon-NikodymDerivative.html

James S.
    If you were to define the derivative as you suggest, as anti-integration, then it would not be uniquely defined. If, for example, $F$ happens to have a continuous derivative $f$, then any function $g$ obtained by altering the value of $f$ at one point would also be "the derivative" of $F$, because the alteration wouldn't affect the integral. – Andreas Blass Jun 23 '18 at 23:39
  • Yes, this is correct. Using a more general definition will leave open more degenerate cases. The key observation here is that continuity is not strictly required. Making it a requirement is unnecessarily restrictive and eliminates some cases that are perfectly fine (e.g. the derivative of a smooth function with one point removed). – James S. Jun 25 '18 at 21:32
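Andreas Blass's non-uniqueness point is easy to demonstrate concretely (an illustrative Python sketch; the altered point and the grids are arbitrary choices): change a candidate derivative at a single point, and every Riemann sum, hence the integral, is unchanged, so "anti-integration" cannot pin the derivative down pointwise.

```python
# Altering an integrand at a single point does not change its Riemann
# integral, so anti-integration cannot single out one derivative.
ALTERED_AT = 2 ** 0.5 / 2   # an irrational point, so no grid node hits it

f = lambda x: x                                  # continuous integrand
g = lambda x: 100.0 if x == ALTERED_AT else x    # f changed at one point

def left_sum(fn, a, b, n):
    # Left-endpoint Riemann sum of fn over [a, b] with n subintervals.
    h = (b - a) / n
    return sum(fn(a + i * h) for i in range(n)) * h

# The sums agree exactly, since no grid node equals ALTERED_AT.
for n in (10, 1000, 100000):
    assert left_sum(f, 0, 1, n) == left_sum(g, 0, 1, n)

print(left_sum(f, 0, 1, 1000))   # -> about 0.4995, identical for g
```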

If we started to include consequences of definitions, in the definitions themselves, there's no telling where to stop.

Jonathan Hole

The "obvious" (in Cauchy's or Riemann's view) properties of differentiation and continuity lead to the current definitions. The epsilon-delta definition of continuity seems reasonable; it is hard to find a workable alternative. Informally, it captures the idea of "drawing the graph without lifting the pencil."

The idea of differentiability has to do with slopes: one checks whether the difference quotient $\frac{f(x)-f(x-h)}{h}$ has a limit as $h \to 0$, and continuity then follows as a consequence.
