247

In my AI textbook there is this paragraph, without any explanation.

The sigmoid function is defined as follows

$$\sigma (x) = \frac{1}{1+e^{-x}}.$$

This function is easy to differentiate because

$$\frac{d\sigma (x)}{dx} = \sigma (x)\cdot (1-\sigma(x)).$$

It has been a long time since I've taken differential equations, so could anyone tell me how they got from the first equation to the second?

tddevlin
Bryan Glazer
  • 2
    What AI textbook is that? – frog1944 Sep 29 '17 at 11:28
  • 5
    @frog1944: It seems to be *Artificial Intelligence Illuminated* by Ben Coppin, page 302 ([Google Books link](https://books.google.com/books?id=LcOLqodW28EC&pg=PA302&lpg=PA302)). – Hans Lundmark Nov 06 '17 at 11:18
  • 1
    @HansLundmark thank you very much! – frog1944 Nov 06 '17 at 19:06
  • 1
    Any book on neural networks will deal with the sigmoid function. It is useful because of the simple way backpropagation works; a lot of computing work is saved when training a network from a set of results. In nature, other functions are possible, like arctan, rational functions, and more. – richard1941 Jan 18 '18 at 02:02
  • 1
    One of the reasons they use the sigmoid is that it is easy to differentiate and facilitates backpropagation. Not so for other candidates like sign(x), arctangent(x), sinh(x), etc. – richard1941 Dec 21 '18 at 18:22

10 Answers

403

Let's denote the sigmoid function as $\sigma(x) = \dfrac{1}{1 + e^{-x}}$.

The derivative of the sigmoid is $\dfrac{d}{dx}\sigma(x) = \sigma(x)(1 - \sigma(x))$.

Here's a detailed derivation:

$$ \begin{align} \dfrac{d}{dx} \sigma(x) &= \dfrac{d}{dx} \left[ \dfrac{1}{1 + e^{-x}} \right] \\ &= \dfrac{d}{dx} \left( 1 + \mathrm{e}^{-x} \right)^{-1} \\ &= -(1 + e^{-x})^{-2}(-e^{-x}) \\ &= \dfrac{e^{-x}}{\left(1 + e^{-x}\right)^2} \\ &= \dfrac{1}{1 + e^{-x}\ } \cdot \dfrac{e^{-x}}{1 + e^{-x}} \\ &= \dfrac{1}{1 + e^{-x}\ } \cdot \dfrac{(1 + e^{-x}) - 1}{1 + e^{-x}} \\ &= \dfrac{1}{1 + e^{-x}\ } \cdot \left( \dfrac{1 + e^{-x}}{1 + e^{-x}} - \dfrac{1}{1 + e^{-x}} \right) \\ &= \dfrac{1}{1 + e^{-x}\ } \cdot \left( 1 - \dfrac{1}{1 + e^{-x}} \right) \\ &= \sigma(x) \cdot (1 - \sigma(x)) \end{align} $$
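As a sanity check on the derivation above, here is a minimal Python sketch that compares the closed form $\sigma(x)(1-\sigma(x))$ against a central finite difference. The function names, the step size `h`, and the sample points are my own choices for illustration, not something from the textbook.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_prime(x: float) -> float:
    """Closed-form derivative sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def max_abs_error(points) -> float:
    """Largest gap between the numerical and closed-form derivative."""
    return max(
        abs(central_difference(sigmoid, x) - sigmoid_prime(x)) for x in points
    )

if __name__ == "__main__":
    # Assumed sample points; any moderate values of x would do.
    print(max_abs_error([-4.0, -1.0, 0.0, 0.5, 3.0]))
```

The printed gap should be tiny, limited only by the finite-difference approximation and floating-point rounding.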

Michael Percy
117

Consider $$ f(x)=\dfrac{1}{\sigma(x)} = 1+e^{-x} . $$ Then, on the one hand, the chain rule gives $$ f'(x) = \frac{d}{dx} \biggl( \frac{1}{\sigma(x)} \biggr) = -\frac{\sigma'(x)}{\sigma(x)^2} , $$ and on the other hand, $$ f'(x) = \frac{d}{dx} \bigl( 1+e^{-x} \bigr) = -e^{-x} = 1-f(x) = 1 - \frac{1}{\sigma(x)} = \frac{\sigma(x)-1}{\sigma(x)} . $$ Equate the two expressions, and voilà!
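For readers who want the last step spelled out, equating the two expressions for $f'(x)$ and multiplying both sides by $-\sigma(x)^2$ gives

$$ -\frac{\sigma'(x)}{\sigma(x)^2} = \frac{\sigma(x)-1}{\sigma(x)} \quad\Longrightarrow\quad \sigma'(x) = -\sigma(x)^2 \cdot \frac{\sigma(x)-1}{\sigma(x)} = \sigma(x)\bigl(1-\sigma(x)\bigr). $$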

(Cf. also this answer.)

Hans Lundmark
  • How do you differentiate 1 + e^-x to get -e^-x? (Update: I think it's because the derivative of e^x is e^x.) https://en.wikipedia.org/wiki/Derivative#Rules_for_basic_functions – Adam Grant Aug 04 '17 at 23:36
  • 2
    @AdamGrant: Yes, since then the chain rule gives $e^{kx}=k e^{kx}$ for any constant $k$. (In this case, we have $k=-1$.) – Hans Lundmark Aug 05 '17 at 06:51
  • Correction to silly typo in the previous comment: it should be $\frac{d}{dx} e^{kx} = k e^{kx}$, of course. – Hans Lundmark Jan 24 '19 at 07:18
22

Note that from your given equation,

$(1+e^{-x})\sigma=1$

$\Rightarrow -e^{-x}\sigma+(1+e^{-x})\frac{d\sigma}{dx}=0$ (differentiating using product rule)

$\Rightarrow \frac{d\sigma}{dx}=\sigma\cdot\frac{e^{-x}}{1+e^{-x}}=\sigma\cdot\frac{(1+e^{-x})-1}{1+e^{-x}}=\sigma\cdot\left[1-\frac{1}{1+e^{-x}}\right]=\sigma\cdot(1-\sigma)$

vonjd
Tapu
8

Since $\sigma(x)$ is a composite function, we first use the chain rule to differentiate down to the $x$ term, and then rewrite the result in terms of $\sigma(x)$: $$ \begin{align} \frac{d}{dx}\sigma(x) &= \left(\frac{1}{1+e^{-x}}\right)' \\ &= -\frac{1}{(1+e^{-x})^{2}} \cdot \left(-e^{-x}\right) \\ &= \frac{e^{-x}}{(1+e^{-x})^{2}}, \\ \because \sigma(x) &= \frac{1}{1+e^{-x}}, \\ e^{-x} &= \frac{1 - \sigma(x)}{\sigma(x)}, \\ 1+e^{-x} &= \frac{1}{\sigma(x)}; \\ \therefore \frac{d}{dx}\sigma(x) &= \frac{\frac{1 - \sigma(x)}{\sigma(x)}}{\left(\frac{1}{\sigma(x)}\right)^{2}} \\ &= (1 - \sigma(x)) \cdot \sigma(x) \end{align}$$

Leo Mingo
6

Let's say we want to find the derivative of $y=\sigma(x)=(1+\exp(-x))^{-1}$. So we have:

$$ \begin{align} \frac{dy}{dx} & = (-1)(1 + \exp(-x))^{-2} \frac{d}{dx}(1 + \exp(-x)) \\ \\ & = (-1)(1 + \exp(-x))^{-2}(0 + \frac{d}{dx}\exp(-x)) \\ \\ & = (-1)(1 + \exp(-x))^{-2}(\exp(-x)) \frac{d}{dx}(-x) \\ \\ & = (-1)(1 + \exp(-x))^{-2}(\exp(-x))(-1) \\ \\ & = \frac{\exp(-x)} {(1 + \exp(-x))^2} \\ \\ & = \frac{1 + \exp(-x) -1} {(1 + \exp(-x))^2} \\ \\ & = \frac{1 + \exp(-x)} {(1 + \exp(-x))^2} - \frac{1} {(1 + \exp(-x))^2} \\ \\ & = \sigma(x) - (\sigma(x))^2 \\ \\ & = \sigma(x) \cdot (1 - \sigma(x)) \end{align} $$

Hugh Perkins
5

By differentiating directly:

$$ \sigma'(x)= \frac{e^{-x}}{(1+e^{-x})^2} $$

Separately, compute the product:

$$\sigma(x) \cdot (1-\sigma(x)) =\frac{1}{1+e^{-x}} \cdot \frac{e^{-x}}{1+e^{-x}}$$

The right-hand sides agree.

EDIT1:

In general, a solution of the differential equation

$$ \frac{dy}{dx}=y(1-y) $$

can be seen to be

$$\frac{1}{1+c e^{-x}} \rightarrow \frac{1}{1+ e^{-x}} $$

where the integration constant $c=1$ is fixed by requiring the curve to pass through its center point $x=0,\ y=\frac12$.
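To see the differential-equation view numerically, here is a rough Python sketch that integrates $dy/dx = y(1-y)$ from $y(0)=\frac12$ with a classical fourth-order Runge-Kutta scheme and compares the result with the sigmoid. The integrator, the step count, and all function names are assumptions of mine for illustration only.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def logistic_rhs(y: float) -> float:
    """Right-hand side of dy/dx = y * (1 - y); it does not depend on x."""
    return y * (1.0 - y)

def integrate_rk4(y0: float, x0: float, x1: float, steps: int = 1000) -> float:
    """Integrate dy/dx = y(1 - y) from x0 to x1 with classical RK4."""
    h = (x1 - x0) / steps  # negative h integrates backwards
    y = y0
    for _ in range(steps):
        k1 = logistic_rhs(y)
        k2 = logistic_rhs(y + 0.5 * h * k1)
        k3 = logistic_rhs(y + 0.5 * h * k2)
        k4 = logistic_rhs(y + h * k3)
        y += (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return y

if __name__ == "__main__":
    # Start at the center point (0, 1/2), which corresponds to c = 1.
    for x in (-3.0, 2.0):
        print(x, integrate_rk4(0.5, 0.0, x), sigmoid(x))
```

The two printed columns should agree closely, consistent with the sigmoid being the solution through $(0, \frac12)$.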

Narasimham
0

No answer yet involves the logarithm. If $\log$ denotes the natural logarithm then by the chain rule $$\frac{d}{dx} \log(\sigma(x)) = \frac{1}{\sigma(x)}\frac{d \sigma(x)}{dx}. $$ Furthermore $\log(\sigma(x))=\log(e^x) -\log(1+e^x)=x - \log(1+e^x)$ so that $$\frac{d}{dx} \log(\sigma(x)) = 1 - \frac{e^x}{1+e^x} = 1-\sigma(x).$$ Equating the two displays gives $(d/dx)\sigma(x) = \sigma(x)(1-\sigma(x))$ as desired.
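The logarithmic identity can also be checked numerically. The sketch below (my own naming and sample points, using only the standard `math` module) evaluates $\log\sigma(x)$ as $x - \log(1+e^x)$ and compares its finite-difference slope with $1-\sigma(x)$.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^{-x})."""
    return 1.0 / (1.0 + math.exp(-x))

def log_sigmoid(x: float) -> float:
    """log(sigma(x)) written as x - log(1 + e^x)."""
    return x - math.log1p(math.exp(x))

def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Approximate f'(x) by (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def slope_gap(x: float) -> float:
    """Gap between the numerical slope of log(sigma) and 1 - sigma(x)."""
    return abs(central_difference(log_sigmoid, x) - (1.0 - sigmoid(x)))

if __name__ == "__main__":
    # Assumed sample points for illustration.
    for x in (-2.0, 0.0, 1.5):
        print(x, slope_gap(x))
```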

jlewk
0

Another approach using the quotient rule is as follows:

Let $\sigma=\frac{1}{1+e^{-x}}$. Multiplying the numerator and denominator by $e^x$ gives

$\sigma=\frac{e^x}{1+e^x}$

using the quotient rule

$\frac{d}{{dx}}\left( {\frac{{f\left( x \right)}}{{g\left( x \right)}}} \right) = \frac{{\frac{d}{{dx}}f\left( x \right)g\left( x \right) - f\left( x \right)\frac{d}{{dx}}g\left( x \right)}}{{g^2 \left( x \right)}}$,

let $f(x)=e^x$ and $g(x)=1+e^x$,

we get $\frac{d\sigma}{dx}=\frac{(e^x)(1+e^x)-e^xe^x}{(1+e^x)^2}$

Upon rearranging, we get:

$\frac{d\sigma}{dx}=\frac{e^x}{1+e^x}\frac{(1+e^x)-e^x}{1+e^x}$

upon further rearrangement:

$\frac{d\sigma}{dx}=\frac{e^x}{1+e^x}(1-\frac{e^x}{1+e^x})$

$\frac{d\sigma}{dx}=\sigma(1-\sigma)$
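As a side note, the two equivalent forms $\frac{1}{1+e^{-x}}$ and $\frac{e^x}{1+e^x}$ are also handy in code. The sketch below is my own illustration, not part of the derivation: it picks whichever form keeps the exponent non-positive, so `math.exp` never overflows, and then reuses $\sigma(1-\sigma)$ for the derivative.

```python
import math

def sigmoid_stable(x: float) -> float:
    """Sigmoid that only ever calls exp on a non-positive argument."""
    if x >= 0.0:
        # Form 1 / (1 + e^{-x}); here -x <= 0.
        return 1.0 / (1.0 + math.exp(-x))
    # Form e^x / (1 + e^x); here x < 0.
    z = math.exp(x)
    return z / (1.0 + z)

def sigmoid_prime_stable(x: float) -> float:
    """Derivative sigma(x) * (1 - sigma(x)) built on the stable form."""
    s = sigmoid_stable(x)
    return s * (1.0 - s)

if __name__ == "__main__":
    # Assumed sample inputs, including values where the naive
    # 1 / (1 + math.exp(-x)) would raise OverflowError for x = -1000.
    for x in (-1000.0, -2.0, 0.0, 2.0, 1000.0):
        print(x, sigmoid_stable(x), sigmoid_prime_stable(x))
```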

-1

$\exp(-x) = \frac{1}{\sigma} -1 $ (By definition). Take the derivative of both sides:

$-\exp(-x)= -\frac{\sigma'}{\sigma^2}$

Add the two to get: $0 = \frac{1}{\sigma} -1 -\frac{\sigma'}{\sigma^2}$

and solve for $\sigma'$ to get $\sigma'=\sigma(1-\sigma)$. QED.

Gary
-3

Using my HP Prime, I differentiated 1/(1+exp(-x)) to get exp(-x)/(1+exp(-x))^2. Factor out 1/(1+exp(-x)), which is sigma(x), and the rest, exp(-x)/(1+exp(-x)), is 1-sigma(x). That is proof by calculator. Beware! That machine can become addictive because of the way it amplifies your capability.
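If you have no such calculator at hand, a few lines of Python can also differentiate mechanically. Note that the sketch below uses a different technique from a calculator's symbolic algebra: forward-mode automatic differentiation with dual numbers. The `Dual` class and helper names are hypothetical, written only for this illustration, and they support just the operations the sigmoid needs.

```python
import math

class Dual:
    """Number value + deriv*eps with eps**2 == 0; deriv carries f'(x)."""

    def __init__(self, value: float, deriv: float = 0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__  # addition is commutative, so reuse __add__

    def __neg__(self):
        return Dual(-self.value, -self.deriv)

    def __rtruediv__(self, c):
        # c / u for a plain number c: d(c/u) = -c * u' / u**2
        return Dual(c / self.value, -c * self.deriv / self.value ** 2)

def _lift(x) -> Dual:
    """Treat a plain number as a constant (derivative zero)."""
    return x if isinstance(x, Dual) else Dual(float(x))

def dual_exp(u: Dual) -> Dual:
    """exp with the chain rule: d(e^u) = e^u * u'."""
    e = math.exp(u.value)
    return Dual(e, e * u.deriv)

def sigmoid_dual(x: Dual) -> Dual:
    return 1.0 / (1.0 + dual_exp(-x))

def derivative_at(x0: float) -> float:
    """Seed dx/dx = 1 and read off the derivative part."""
    return sigmoid_dual(Dual(x0, 1.0)).deriv

if __name__ == "__main__":
    x0 = 0.7  # assumed sample point
    s = 1.0 / (1.0 + math.exp(-x0))
    print(derivative_at(x0), s * (1.0 - s))
```

Both printed numbers should match to rounding error, which is the same check the calculator answer describes.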

richard1941