I'm trying to show that

$$\delta\big(f(x)\big) = \sum_{i}\frac{\delta(x-a_{i})}{\left|{\frac{df}{dx}(a_{i})}\right|}$$

where $a_{i}$ are the roots of the function $f(x)$. I've tried to proceed by using a dummy function $g(x)$ and carrying out:

$$\int_{-\infty}^{\infty}\delta\big(f(x)\big)\,g(x)\,\mathrm{d}x$$

Then making the coordinate substitution $u=f(x)$ and integrating over $u$. This seems to be on the right track, but I'm unsure where the absolute value in the denominator comes from, and also why it becomes a sum.

$$\int_{-\infty}^{\infty}\frac{du}{\frac{df}{dx}}\delta(u)g\big(f^{-1}(u)\big) = \frac{g\big(f^{-1}(0)\big)}{\frac{df}{dx}\big(f^{-1}(0)\big)}$$

Can anyone shed some light? Wikipedia just states the formula and doesn't actually show where it comes from.

Brian M. Scott
The Wind-Up Bird

7 Answers


Substitute $u=f(x)$, then since $\delta(x)$ is non-vanishing only at $x=0$, we can break up the domain of the integral into small intervals around each root $\alpha_k$ of $f$, where $f$ is monotonic, hence invertible: $$ \begin{align} \int\delta(f(x))\,g(x)\,\mathrm{d}x &=\sum_k\int_{\alpha_k-\epsilon_k}^{\alpha_k+\epsilon_k}\delta(f(x))\,g(x)\,\mathrm{d}x\\ &=\sum_k\int_{f(\alpha_k-\epsilon_k)}^{f(\alpha_k+\epsilon_k)}\delta(u)\,g\!\left(f^{-1}(u)\right)\mathrm{d}f^{-1}(u)\\ &=\sum_k\int_{f(\alpha_k-\epsilon_k)}^{f(\alpha_k+\epsilon_k)}\delta(u)\,\frac{g\!\left(f^{-1}(u)\right)}{f'(f^{-1}(u))}\,\mathrm{d}u\\ &=\sum_k\frac{g(\alpha_k)}{\left|f'(\alpha_k)\right|}\tag{1} \end{align} $$ If $f'(\alpha_k)\lt0$, then $f(\alpha_k+\epsilon_k)\lt f(\alpha_k-\epsilon_k)$ so the limits need to be switched, negating the integral and giving the absolute value.

Equation $(1)$ says that $$ \delta(f(x))=\sum_k\frac{\delta(x-\alpha_k)}{\left|f'(\alpha_k)\right|}\tag{2} $$
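As a quick numerical sanity check of $(2)$ (my addition, not part of the answer above): replace $\delta$ by a narrow Gaussian nascent delta and compare both sides for a concrete choice, here $f(x)=x^2-1$ (roots $\pm1$, $|f'(\pm1)|=2$) and $g(x)=e^{-x^2}$; the particular $f$, $g$, and width $\epsilon$ are my own choices for illustration.

```python
import numpy as np

# Numerical check of (2): integrate a narrow Gaussian nascent delta
#   delta_eps(u) = exp(-u^2 / (2 eps^2)) / (eps sqrt(2 pi))
# composed with f(x) = x^2 - 1 against g(x) = exp(-x^2).
eps = 1e-3
x = np.linspace(-3.0, 3.0, 2_000_001)
dx = x[1] - x[0]

f = x**2 - 1.0
g = np.exp(-x**2)
delta_eps = np.exp(-f**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

lhs = np.sum(delta_eps * g) * dx   # ~ integral of delta(f(x)) g(x) dx
rhs = np.exp(-1.0)                 # sum_k g(a_k)/|f'(a_k)| = (e^-1 + e^-1)/2
```

Both sides agree up to corrections of order $\epsilon$, consistent with $(2)$.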

robjohn

Split the integral into regions around $a_i$, the zeros of $f$ (integration of a delta function gives nonzero results only in regions where its argument is zero): $$ \int_{-\infty}^{\infty}\delta\big(f(x)\big)g(x)\,\mathrm{d}x = \sum_{i}\int_{a_i-\epsilon}^{a_i+\epsilon}\delta(f(x))g(x)\,\mathrm{d}x $$ write out the Taylor expansion of $f$ for $x$ near some $a_i$ (i.e., different for each term in the summation) $$ f(a_i+x) =f(a_i) + f'(a_i)x + \mathcal{O}(x^2) = f'(a_i)x + \mathcal{O}(x^2) $$ Now, for each term, you can show that the following hold: $$ \int_{-\infty}^\infty\delta(kx)g(x)\,\mathrm{d}x = \frac{1}{|k|}g(0) = \int_{-\infty}^\infty\frac{1}{|k|}\delta(x)g(x)\,\mathrm{d}x $$ (making the transformation $y=kx$ and looking at $k<0$ and $k>0$ separately; note the trick is in the limits of integration) and $$ \int_{-\infty}^\infty\delta(x+\mathcal{O}(x^2))g(x)\,\mathrm{d}x = g(0) = \int_{-\infty}^\infty\delta(x)g(x)\,\mathrm{d}x $$ (making use of the fact that we can take an interval around $0$ as small as we like)

Combine these with shifting to each of the desired roots, and you can obtain the equality you're looking for.
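A hedged numerical illustration of the first identity above, $\int\delta(kx)\,g(x)\,\mathrm{d}x=g(0)/|k|$, with a negative $k$ (the concrete $k$, $g$, and nascent-delta width are my choices, not part of the answer):

```python
import numpy as np

# Check integral of delta(k x) g(x) dx = g(0)/|k| for a negative k,
# using a narrow Gaussian nascent delta in place of delta.
eps = 1e-3
k = -3.0
x = np.linspace(-1.0, 1.0, 2_000_001)
dx = x[1] - x[0]

g = np.cos(x)
delta_eps = np.exp(-(k * x)**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

lhs = np.sum(delta_eps * g) * dx   # should approach g(0)/|k| = 1/3
```

The sign of $k$ is absorbed by the absolute value, exactly as the limits-of-integration trick predicts.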

KevinG
    Small note: the $\epsilon$ in the sum could/should depend on $i$ (think eg of $f(x) = x^2 \sin(1/x)$ ) – leonbloy Aug 10 '16 at 14:02
  • 1
    The first order term of your Taylor expansion has an "x" instead of an "(x-a)" type term... Why? You are expanding about f(a), not f(0). Did I overlook something here? – user3728501 May 12 '17 at 20:30
  • It's written as the Taylor expansion of f(a+x) for small parameter x (you get used to seeing things this way after doing years of physics and numerics). If you make the transformation a+x=z, then you can view it as the Taylor expansion of f(z) with small parameter (z-a), giving the terms you're looking for. – KevinG Jul 02 '17 at 14:00
  • @leonbloy you're right, we could (and in fact should!) have $\epsilon_i$ – KevinG Jul 02 '17 at 14:04

Criticism of Previous Answers

My personal feeling is that I am not 100% happy with any of the answers posted: each of them either does something slightly mathematically dubious, or relies on a niche area of mathematics with which I am not familiar and which I therefore could not understand or interpret.

I do not wish to make this statement without justifying it, so I offer the following comments on the previous answers.

KevinG's answer is the closest to the one I shall present here; however, the original question was not answered to completion with rigour. The proof I show here explains the final steps summarized in KevinG's statement "Combine these with shifting to each of the desired roots, and you can obtain the equality you're looking for".

Kurkyl's answer is just a description of where the attempt in the original question "goes wrong".

I do not understand the answer by jbc. This appears to be a branch of mathematics with which I am not familiar.

robjohn's solution requires $f(x)$ to be invertible and bijective, which is not true in general.

Finally, I personally feel J. Heller's answer is unnecessarily complicated: the use of non-standard or obscure definitions of the delta function may well be correct, but it is not likely to be clear to the average reader, as it was not clear to me.

I hope this feedback is taken positively; it is not meant to be overly critical, but I thought I should justify my reasons for posting a new answer given the number of answers already available.

As an aside, the context for the solution I will provide comes from a problem in Quantum Field Theory, and I personally feel this method is the clearest.

# Prerequisite Proof

We shall require proof that

$$\delta(ax)=\frac{\delta(x)}{\left|a\right|}$$
We proceed as follows: first define $y=ax$, hence $\mathrm{d}y=a\;\mathrm{d}x$, and we obtain

$$\int^{x=\pm\infty}\delta(ax)\;\mathrm{d}x=\int^{y=a\cdot(\pm\infty)}\delta(y)\;\frac{\mathrm{d}y}{a}= \frac{1}{a}\int^{\pm\infty}\delta(y)\;\mathrm{d}y$$

under the assumption that $a>0$ ($a$ real and positive). Now consider $a$ negative. We define $b=-a$, hence $b>0$, and the integral above is instead

$$\int^{x=\pm\infty}\delta(-bx)\;\mathrm{d}x=\int^{y=-b\cdot(\pm\infty)}\delta(y)\;\frac{\mathrm{d}y}{-b}=\int^{y=b\cdot(\mp\infty)}\delta(y)\;\frac{\mathrm{d}y}{-b}= \frac{1}{b}\int^{\pm\infty}\delta(y)\;\mathrm{d}y$$

where we have exchanged the limits of integration in the final step. This integral can be written in terms of $a$, with $b=\left|a\right|$:

$$\int^{x=\pm\infty}\delta(ax)\;\mathrm{d}x=\frac{1}{\left|a\right|}\int^{\pm\infty}\delta(y)\;\mathrm{d}y$$
hence we have shown that for real $a\neq0$

$$\delta(ax)=\frac{\delta(x)}{\left|a\right|}$$
# Original Question Proof

We wish to prove

$$\delta\big(f(x)\big)=\sum_{i}\frac{\delta(x-a_{i})}{\left|f^\prime(a_{i})\right|}$$
where $f^\prime(a_i)=\left[\frac{\mathrm{d}f(x)}{\mathrm{d}x}\right]_{a_i}$, and $a_i$ are the zeros of $f(x)$.

The Taylor expansion of $f(x)$ about the point $a_i$ is

$$f(x)=f(a_i)+f^\prime(a_i)(x-a_i)+\mathcal{O}\big((x-a_i)^2\big)=f^\prime(a_i)(x-a_i)+\mathcal{O}\big((x-a_i)^2\big)$$
as $f(a_i)=0$.

Inserting the expansion into the integral,

$$\int_{-\infty}^{\infty}\delta\big(f(x)\big)g(x)\;\mathrm{d}x=\sum_{i}\int_{a_i-\epsilon_i}^{a_i+\epsilon_i}\delta\big(f^\prime(a_i)(x-a_i)\big)g(x)\;\mathrm{d}x$$
To be explicit, introduce the new variable $y=x-a_i$, $\mathrm{d}y=\mathrm{d}x$,

$$\sum_{i}\int_{-\epsilon_i}^{\epsilon_i}\delta\big(f^\prime(a_i)\,y\big)g(y+a_i)\;\mathrm{d}y$$
using the expression from the above proof,

$$\sum_{i}\frac{1}{\left|f^\prime(a_i)\right|}\int_{-\epsilon_i}^{\epsilon_i}\delta(y)\,g(y+a_i)\;\mathrm{d}y=\sum_{i}\frac{g(a_i)}{\left|f^\prime(a_i)\right|}$$
hence we have obtained the desired result. I personally prefer not to introduce the change of variables from $x$ to $y$; in my original notes I have the expression

$$\delta\big(f^\prime(a_i)(x-a_i)\big)=\frac{\delta(x-a_i)}{\left|f^\prime(a_i)\right|}$$
this can be proved by extending the "Prerequisite Proof" above.
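As an aside (my addition, not part of the original proof): SymPy implements this root expansion of a composed delta directly, which gives a quick independent check for a concrete $f$ such as $f(x)=x^2-1$:

```python
import sympy as sp

# Expanding DiracDelta(f(x)) over the roots of f should reproduce
#   sum_i DiracDelta(x - a_i) / |f'(a_i)|.
x = sp.symbols('x', real=True)
f = x**2 - 1                      # roots at +-1, |f'(+-1)| = 2

expanded = sp.DiracDelta(f).expand(diracdelta=True, wrt=x)
expected = sp.DiracDelta(x - 1) / 2 + sp.DiracDelta(x + 1) / 2
```

Here `expanded` and `expected` agree, matching the formula being proved.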

Answer to question in comments...

It has been 5 years since I posted this, and some years since I have done any particularly serious math.

The question was why the higher-order terms in the Taylor expansion are ignored.

For the second order term in $x^2$, we have:



The last line can be proven by substituting variables twice: $z=y^2$, $y=x-a_i$. Alternatively, consider that the Dirac delta is zero everywhere except at $0$; since the integral's limit approaches zero from above (positive) and then moves away heading in the positive direction (because of the square), the integral must cancel itself.

This raises questions about the cubic order term. But the change of variables will always bring in a factor of


where $n$ is the order of the term, so this will always be zero.

(I think)

user3728501
    Note that in my answer, the function only needs to be invertible in a small neighborhood of each root, $\alpha_k$, of $f$. Since it is necessary that $f(x)=0\implies f'(x)\ne0$ for the sum in question to exist, this is a reasonable requirement. – robjohn May 04 '18 at 02:08
  • @Lost: Please specify what function $f$ you are talking about. I am not sure what a "right hand parabola" is. A function does not have "two outputs for one input". – robjohn Aug 15 '21 at 17:04
  • Oh right, sorry. I mixed up two things. Anyway, for a function to have an inverse, are the above two conditions sufficient and necessary? Where can I find a proof of this? – Lost Aug 15 '21 at 18:34
  • Why can we ignore the smaller terms in the taylor expansion? – XXX Mar 23 '22 at 20:38
  • 1
    @XXX If $(x-a_i)=0$ then $(x-a_i)^2$ is also $0$. – user3728501 Mar 24 '22 at 11:42
  • I am aware this is not a complete answer – user3728501 Mar 24 '22 at 11:44
  • 1
    @XXX If you check my appended section, although I don't have a huge amount of time to spend on this that should be enough to convince you all higher order terms are 0. – user3728501 Mar 24 '22 at 12:06
  • @XXX use $ symbol to format your tex – user3728501 Mar 24 '22 at 22:05
  • @user3728501 Why is $ \delta\left(f^{\prime}\left(a_{i}\right)\left(x-a_{i}\right)\right) = \delta \left(f^{\prime}\left(a_{i}\right)\left(x-a_{i}\right) +\frac{1}{2}a_{i}\left(x-a_{i}\right)^2+ \ldots\right)$? – XXX Mar 24 '22 at 22:20
  • If you see the final part I added to the question yesterday, this is because all higher order terms are $0$. By the way, your expansion is wrong - there should be a $f^{\prime\prime}$ term in the second order power. – user3728501 Mar 25 '22 at 08:28

In order to give a rigorous proof of this fact, you require a precise definition of the composition of a distribution with a smooth function. This can be found in Schwartz' presentation but requires delicate functional analysis and is not too helpful in computing explicit examples. A simpler and more direct definition was given by Sebastiao e Silva and is as follows. Most distributions $f$ of practical import have finite order, i.e., have the form $D^nF$ where $F$ is continuous and the derivative is in the sense of distributions (for the record, this is also true locally for any distribution). One then defines $f \circ \phi$ to be $$\left(\dfrac 1{\phi'} D\right )^n F\circ \phi.$$ The motivation for this definition and the proof that it is independent of the representation of $f$ (this is the hard part) can be found in the articles of Sebastiao e Silva. A more accessible version is contained in the book "Introduction to the theory of distributions" by Campos Ferreira which is based on lectures of the former.

Using this definition and the fact that the $\delta$-distribution is half of the second derivative of the absolute value function, one can give a rigorous proof of the formula in the query. I can supply the details if so desired.

jbc

The big thing you're overlooking is that you got the coordinate substitution $u = f(x)$ wrong. You assumed:

  • $f(x)$ is an invertible function
  • $\lim_{x \to -\infty} f(x) = -\infty$
  • $\lim_{x \to +\infty} f(x) = +\infty$

If these were true, then your calculation would be correct -- and the sum has only one term, and the thing inside the absolute value is positive. But when these aren't true, your calculation is wrong.

Now that you know your mistake, it should be worth trying the calculation again, being careful to get the substitution right.

If you need more hints,

  • The absolute value will appear when you properly handle the possibility that $f(x)$ is decreasing
  • The sum will appear when you properly handle non-invertible $f(x)$, e.g. by splitting the domain of integration into regions on each of which $f(x)$ is invertible.

Normally, the action of the $\delta$ distribution on a test function is $(\delta, \varphi) = \int\delta(x)\varphi(x)\,\mathrm{d}x$.

Now define the action of $\delta(g(x))$: $(\delta(g(x)), \varphi) = \lim_{\epsilon\to0}(\delta_\epsilon(g(x)), \varphi) = \lim_{\epsilon\to0}\int\delta_\epsilon(g(x))\varphi(x)\,\mathrm{d}x$. This is just a definition to help us define what its action will be.

This naturally converges to the action of the $\delta$ distribution. As you probably remember from the action of the ordinary $\delta(x)$, it gives us $\varphi(0)$ when integrated. So for $\delta_\epsilon(g(x))$ to have the appropriate property, its argument needs to be zero, which is where the roots of $g(x) = 0$ come in.

In this case, the support of the test function $\varphi$ needs to be in the neighborhood of these roots: $\operatorname{supp}[\varphi(x)] = B_\epsilon(a)$, where $g(a) = 0$.

So now we're almost ready to completely define the action of $\delta_\epsilon(g(x))$:

$(\delta(g(x)), \varphi) = \int_{B_\epsilon(a)}\delta_\epsilon(g(x))\varphi(x)\,\mathrm{d}x$. Now I want to perform the change of variables $y = g(x)$ near $x=a$; I only care about the behavior near $x=a$ since the integration is only within the neighborhood of $x=a$.

But for this change of variables to make sense, $g'(x)$ cannot change sign from positive to negative, or vice versa: if it did, $x = g^{-1}(y)$ would not be a one-to-one relationship, so we need $g'(x) \neq 0$. This is the requirement in 1D; in the multivariate case, the corresponding requirement is that the Jacobian determinant at $x=a$ is nonzero. Remember the absolute value is in place since it represents the volume of integration after the change of variables, and volume cannot be negative.

Now we're ready for the change of variables:

$(\delta(g(x)), \varphi) = \int_{B_\epsilon(a)}\delta_\epsilon(g(x))\varphi(x)\,\mathrm{d}x = \frac{1}{\left|g'(a)\right|}\int\delta(y)\varphi(a)\,\mathrm{d}y$. Here you can see that this is the action you would expect if

$(\delta(g(x)), \varphi) = \lim_{\epsilon\to0}(\delta_\epsilon(g(x)), \varphi) = \frac{1}{\left|g'(a)\right|}(\delta(x-a), \varphi)$

Surafel W.

Use the nascent delta function based on the hat function: $$ \delta_h(x) = \begin{cases} 0 & |x|\geq h \\ x/h^2 + 1/h & -h \leq x \leq 0 \\ -x/h^2 + 1/h & 0 \leq x \leq h \end{cases}. $$

Suppose a smooth function $f$ has a simple root at $x=r$. Then $f$ is a bijection in a neighborhood $[a,b]$ of $r$ with an inverse $g$ (I'll be using the term $g$ exclusively for the inverse of $f$ and not for a test function).

Define $$ I_1 = \int_{g(-h)}^{g(0)=r} \delta_h (f(x)) \, dx \quad \text{and} \quad I_2 = \int_{g(0)=r}^{g(h)} \delta_h (f(x)) \, dx. $$

Then $$ I_1 = \frac{F(r) - F(g(-h))}{h^2} + \frac{r - g(-h)}{h}, $$ $$ I_2 = \frac{F(r) - F(g(h))}{h^2} + \frac{g(h) - r}{h}, $$ and $$ I = I_1+I_2 = \frac{g(h) - g(-h)}{h} - \frac{(F\circ g)(h) - 2(F\circ g)(0) + (F\circ g)(-h)}{h^2} \tag{1}\label{1} $$ where $F$ is the antiderivative of $f.$

The limit of the first term in \eqref{1} as $h\rightarrow 0$ is $$ 2g'(0) = \frac{2}{f'(r)} \tag{2}\label{2} $$ (using the identity for the derivative of an inverse function and the definition of a derivative as a limit) and the limit of the second term in $\eqref{1}$ as $h\rightarrow 0$ is $$ (F\circ g)''(0) = F''(r) (g'(0))^2 + F'(r) g''(0) \tag{3}\label{3} $$ (using the definition of derivative again and the chain rule applied twice). Since $F' = f$ and $f(r)=0,$ \eqref{3} can be simplified to $$ (F\circ g)''(0) = f'(r) \Big(g'(0)\Big)^2 = \frac{f'(r)}{\Big(f'(r)\Big)^2} = \frac{1}{f'(r)}. $$ So $$ \lim_{h\rightarrow 0} \int_a^b \delta_h(f(x)) \,dx = \lim_{h\rightarrow 0} \Big(\operatorname{sign}(f'(r)) I\Big) = \frac{1}{ |f'(r)| }. $$ The reason for multiplying $I$ by $\operatorname{sign}(f'(r))$ is that the limits of integration in $I$ need to be switched if $f$ is a decreasing function on $[a,b]$ (that is, if $f'(r)$ is negative).

Integration by substitution is much quicker than using the nascent delta function definition. But you must use the "change of variables" formula that is more general than basic integration by substitution: $$ \int_a^b (\delta\circ f)(x) \,dx = \ \int_{\min(f(a),f(b))}^{\max(f(a),f(b))} \delta(u) |g'(u)| \,du = |g'(0)| = 1/|f'(r)|. $$
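A small numerical companion (my addition, not part of the answer): implementing the hat function $\delta_h$ above and integrating $\delta_h(f(x))$ across a simple root reproduces $1/|f'(r)|$; the concrete choice $f(x)=2-x^3$ (decreasing, to exercise the absolute value) is mine.

```python
import numpy as np

def delta_h(u, h):
    # Hat (triangular) nascent delta from the definition above:
    # zero for |u| >= h, rising linearly to 1/h at u = 0.
    return np.clip((h - np.abs(u)) / h**2, 0.0, None)

# f(x) = 2 - x^3 has a simple root at r = 2**(1/3) with
# f'(r) = -3 * 2**(2/3) < 0, so the integral should approach 1/|f'(r)|.
h = 1e-3
r = 2.0 ** (1.0 / 3.0)
x = np.linspace(r - 0.01, r + 0.01, 2_000_001)
dx = x[1] - x[0]

val = np.sum(delta_h(2.0 - x**3, h)) * dx
target = 1.0 / (3.0 * 2.0 ** (2.0 / 3.0))   # 1/|f'(r)|
```

Because $\delta_h$ is even, no explicit sign bookkeeping is needed here; the $\operatorname{sign}(f'(r))$ factor in the answer plays that role when the limits of integration are kept in their original order.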

J. Heller