146

Suppose we have a set $S$ of real numbers. Show that $$\sum_{s\in S}|s-x| $$ is minimal if $x$ is equal to the median.

This is a sample exam question of one of the exams that I need to take and I don't know how to proceed.

Rodrigo de Azevedo
  • 18,977
  • 5
  • 36
  • 95
hattenn
  • 1,847
  • 3
  • 13
  • 11
  • 33
    Please replace *THE median* by *ANY median*. – Did Feb 25 '12 at 18:47
  • 1
    For completeness, [here](http://gregorygundersen.com/blog/2019/10/04/expectation-median-opt/) is Gregory Gundersen's nice explanation for the continuous case (when $S$ contains an *infinite* number of real numbers). – Richard Hardy Aug 13 '20 at 09:11
  • 4
    To clarify the comment by @Did above, where they say "ANY median" they mean "any point between the two middle elements", when there are an even number of elements in the set S. When S contains an odd number of elements, there is, of course, only one median. – John Red May 19 '21 at 23:55

10 Answers

117

Introduction: The solution below is essentially the same as the solution given by Brian M. Scott, but it will take a lot longer. You are expected to assume that $S$ is a finite set, with say $k$ elements. Line them up in order, as $s_1<s_2<\cdots <s_k$.

The situation is a little different when $k$ is odd than when $k$ is even. In particular, if $k$ is even there are (depending on the exact definition of median) many medians. We tell the story first for $k$ odd.
Recall that $|x-s_i|$ is the distance between $x$ and $s_i$, so we are trying to minimize the sum of the distances. For example, we have $k$ people who live at various points on the $x$-axis. We want to find the point(s) $x$ such that the sum of the travel distances of the $k$ people to $x$ is a minimum.

The story: Imagine that the $s_i$ are points on the $x$-axis. For clarity, take $k=7$. Start from well to the left of all the $s_i$, and take a tiny step, say of length $\epsilon$, to the right. Then you have gotten $\epsilon$ closer to every one of the $s_i$, so the sum of the distances has decreased by $7\epsilon$.

Keep taking tiny steps to the right, each time getting a decrease of $7\epsilon$. This continues until you hit $s_1$. If you now take a tiny step to the right, then your distance from $s_1$ increases by $\epsilon$, and your distance from each of the remaining $s_i$ decreases by $\epsilon$. What has happened to the sum of the distances? There is a decrease of $6\epsilon$, and an increase of $\epsilon$, for a net decrease of $5\epsilon$ in the sum.

This continues until you hit $s_2$. Now, when you take a tiny step to the right, your distance from each of $s_1$ and $s_2$ increases by $\epsilon$, and your distance from each of the five others decreases by $\epsilon$, for a net decrease of $3\epsilon$.

This continues until you hit $s_3$. The next tiny step gives an increase of $3\epsilon$, and a decrease of $4\epsilon$, for a net decrease of $\epsilon$.

This continues until you hit $s_4$. The next little step brings a total increase of $4\epsilon$, and a total decrease of $3\epsilon$, for an increase of $\epsilon$. Things get even worse when you travel further to the right. So the minimum sum of distances is reached at $s_4$, the median.

The situation is quite similar if $k$ is even, say $k=6$. As you travel to the right, there is a net decrease at every step, until you hit $s_3$. When you are between $s_3$ and $s_4$, a tiny step of $\epsilon$ increases your distance from each of $s_1$, $s_2$, and $s_3$ by $\epsilon$. But it decreases your distance from each of the three others, for no net gain. Thus any $x$ in the interval from $s_3$ to $s_4$, including the endpoints, minimizes the sum of the distances. In the even case, I prefer to say that any point between the two "middle" points is a median. So the conclusion is that the points that minimize the sum are the medians. But some people prefer to define the median in the even case to be the average of the two "middle" points. Then the median does minimize the sum of the distances, but some other points also do.
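The walk described above can be checked numerically. What follows is a minimal Python sketch, not part of the original answer: the function names `total_distance` and `net_change_per_step` and the seven sample points are assumptions chosen for illustration. Counting the points on each side of the current position gives the net change in the sum per unit step to the right.

```python
def total_distance(points, x):
    """Sum of the distances from x to every point."""
    return sum(abs(s - x) for s in points)


def net_change_per_step(points, x):
    """Net change in the sum per unit step to the right, starting at x.

    Points at or to the left of x get farther away (+1 each);
    points strictly to the right of x get closer (-1 each).
    """
    left = sum(1 for s in points if s <= x)
    right = sum(1 for s in points if s > x)
    return left - right


# Hypothetical sample with k = 7 points, matching the size used in the story.
points = [1, 2, 4, 7, 8, 10, 15]

# Start left of s_1, then stand on s_1, s_2, s_3 and s_4 in turn.
for x in [0, 1, 2, 4, 7]:
    print(x, net_change_per_step(points, x))
```

The printed net changes run through $-7, -5, -3, -1, +1$, so the sum stops decreasing exactly at $s_4 = 7$, the median of this sample.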

Michael Hardy
  • 1
  • 30
  • 276
  • 565
André Nicolas
  • 491,093
  • 43
  • 517
  • 948
97

We're basically after: $$ \arg \min_{x} \sum_{i = 1}^{N} \left| {s}_{i} - x \right| $$

Note that $ \frac{\mathrm{d} \left | x \right | }{\mathrm{d} x} = \operatorname{sign} \left( x \right) $ for $ x \neq 0 $ (more rigorously, $ \operatorname{sign} \left( x \right) $ is a Sub Gradient of the non smooth $ {L}_{1} $ Norm function).
Hence, differentiating the sum above with respect to $ x $ yields $ -\sum_{i = 1}^{N} \operatorname{sign} \left( {s}_{i} - x \right) $.
This equals zero only when the number of positive terms equals the number of negative terms, which happens when $ x = \operatorname{median} \left\{ {s}_{1}, {s}_{2}, \cdots, {s}_{N} \right\} $.

Remarks

  1. The median of a discrete set is not uniquely defined.
  2. The median is not necessarily an element of the set.
  3. For some sets the Sub Gradient above does not vanish at any point. Yet the Sub Gradient Method is still guaranteed to converge to a median.
  4. This is not the optimal way to calculate the median. It is given to provide intuition about what the median is.
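Remark 3 can be illustrated with a small experiment. This is a minimal sketch, not the author's code: the function name `subgradient_median`, the starting point, the iteration count, and the diminishing step size $1/t$ are all assumptions of this example. It runs on a small data set with a repeated value, for which the sum of signs never equals zero.

```python
def sign(v):
    """Return -1, 0, or +1 according to the sign of v."""
    return (v > 0) - (v < 0)


def subgradient_median(points, x0=0.0, iterations=5000):
    """Approximate a minimizer of sum(|s - x|) by the subgradient method.

    The step size 1/t is an assumption of this sketch; other
    diminishing step sizes would serve as well.
    """
    x = x0
    for t in range(1, iterations + 1):
        # A subgradient of sum(|s - x|) with respect to x.
        g = sum(sign(x - s) for s in points)
        x -= g / t
    return x


estimate = subgradient_median([1, 2, 3, 3, 4])
print(estimate)  # close to 3, the median of this sample
```

The iterates oscillate around $3$ with shrinking amplitude, which is why a diminishing step is used rather than a fixed one.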
Royi
  • 7,459
  • 3
  • 40
  • 82
  • 6
    Using derivatives here is overkill; the problem can be done by more elementary methods. $\qquad$ – Michael Hardy Oct 28 '16 at 17:06
  • 31
    @MichaelHardy actually among all answers I find this one to be the simplest, rather than the "walk to the left and then to the right of the real line". – gented Feb 21 '19 at 23:21
  • 1
    To the point. :) – dksahuji Apr 12 '19 at 08:26
  • The derivative of |x| is x/|x| as proven here : https://math.stackexchange.com/questions/83861/finding-the-derivative-of-x-using-the-limit-definition – Clyt Apr 15 '19 at 03:07
  • 4
    @clytondantis, The function $ \frac{x}{ \left| x \right| } $ is one of the definitions of $ \operatorname{sign} \left( \cdot \right) $. – Royi Apr 15 '19 at 04:04
  • How does this proof work when there are an even number of elements and $x$ has to be one of the elements? In this case, the derivative of the objective function set equal to zero cannot be solved, but there would either be more positive elements than negative, or vice versa. – David Jun 22 '20 at 04:02
  • e.g., consider [1,2,3,4]. If we choose the median to be 2, then the number of positive elements is 2 and the number of negative residuals is 1. So the sum of the signs is non-zero. – David Jun 22 '20 at 04:09
  • @David, In the case you raised the median isn't well defined in the set. Then you'll see both the above proof and mine define the solution to be $ \left( 2, 3 \right) $. Yet pay attention to the problem formulation, $ x $ is free to be median, only the $ s $ must be in the given set. – Royi Jun 22 '20 at 05:22
  • (2,3) means not inclusive of 2 and 3 right? – David Jun 22 '20 at 05:29
  • @David, Yes, indeed. But again, the problem, as defined above, doesn't enforce the solution to be one of the items in the set. – Royi Jun 22 '20 at 10:08
  • Right. I understand. I came across a similar problem, except the solution had to be constrained to one of the elements in the set. I tried to adapt your method to it, but I reach a contradiction when I set the derivative to zero. It either ends with 1 = 0 or -1 = 0. – David Jun 22 '20 at 17:23
  • @David, Open a question with your case and I will post a solution (Link it here). – Royi Jun 22 '20 at 17:55
  • Just opened. Here it is: https://math.stackexchange.com/questions/3730510/extension-of-median-minimizing-the-sum-of-absolute-deviations-the-l-1-norm Thanks – David Jun 22 '20 at 18:15
  • How about $s_i \in \{1, 2, 3, 3, 4\}$ ? – Kumar Jul 16 '20 at 15:22
  • @Kumar, What do you mean? – Royi Jul 16 '20 at 17:23
  • Its an example where the sum of the sub-gradients will not be zero. – Kumar Jul 16 '20 at 19:00
  • You're asking the other side of the coin as @David asked. This is the objective function for your set https://i.imgur.com/vjwulkS.png. If you apply Sub Gradient descent using the defined above Sub Gradient it will converge to the Median of the set which is $ 3 $. So what's the problem? – Royi Jul 17 '20 at 04:31
  • Could you please explain how the median of a discrete set of numbers is not unique? For instance, in the set {1, 2, 3, 4} isn't the median just 2.5? Or can we set any number between 2 and 3 as the median? – Fatemeh Asgarinejad Aug 09 '20 at 19:40
  • 1
    @Fatemeh_Asgarinejad, Indeed any number between 2 and 3 minimizes the cost function to the same value. The issue is the cost function being convex yet not strictly convex (Many solution with the same value). – Royi Aug 10 '20 at 04:17
  • thanks for your response @Royi! So, are we saying that any number that for which the cost function yields a similar value is median? So, I mean any float number between 2 and 3 is median. Is that right? – Fatemeh Asgarinejad Aug 10 '20 at 23:58
  • Float number is a term of a representation of number for computers. Any number (Real Number), Real Number in the range $ (2, 3) $ will yield the same objective value for the above. You could have tried calculating it by your self. – Royi Aug 11 '20 at 04:07
43

Suppose that the set $S$ has $n$ elements, $s_1<s_2<\dots<s_n$. If $x<s_1$, then $$f(x)=\sum_{s\in S}|s-x|=\sum_{s\in S}(s-x)=\sum_{k=1}^n(s_k-x)\;.\tag{1}$$ As $x$ increases, each term of $(1)$ decreases until $x$ reaches $s_1$, therefore $f(s_1)<f(x)$ for all $x<s_1$.

Now suppose that $s_k\le x\le x+d\le s_{k+1}$. Then

$$\begin{align*}f(x+d)&=\sum_{i=1}^k\Big(x+d-s_i\Big)+\sum_{i=k+1}^n\Big(s_i-(x+d)\Big)\\ &=dk+\sum_{i=1}^k(x-s_i)-d(n-k)+\sum_{i=k+1}^n(s_i-x)\\ &=d(2k-n)+\sum_{i=1}^k(x-s_i)+\sum_{i=k+1}^n(s_i-x)\\ &=d(2k-n)+f(x)\;, \end{align*}$$

so $f(x+d)-f(x)=d(2k-n)$. This is negative if $2k<n$, zero if $2k=n$, and positive if $2k>n$. Thus, on the interval $[s_k,s_{k+1}]$

$$f(x)\text{ is }\begin{cases} \text{decreasing},&\text{if }2k<n\\ \text{constant},&\text{if }2k=n\\ \text{increasing},&\text{if }2k>n\;. \end{cases}$$

From here it shouldn’t be too hard to show that $f(x)$ is minimal when $x$ is the median of $S$.
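The identity $f(x+d)-f(x)=d(2k-n)$ is easy to spot-check. The snippet below is a minimal sketch under assumed data: the five sample points and the helper name `f` mirror the notation above but are otherwise illustrative. Exact rational arithmetic from the standard `fractions` module avoids floating-point noise.

```python
from fractions import Fraction


def f(points, x):
    """Sum of |s - x| over all points."""
    return sum(abs(s - x) for s in points)


# Hypothetical sample with n = 5 distinct, sorted elements.
points = [Fraction(v) for v in (1, 3, 4, 8, 9)]
n = len(points)

for k in range(1, n):  # the interval [s_k, s_{k+1}], with 1-based k
    x = points[k - 1]
    d = (points[k] - points[k - 1]) / 2  # keeps x + d inside the interval
    assert f(points, x + d) - f(points, x) == d * (2 * k - n)
    print(k, 2 * k - n)
```

For this sample the factor $2k-n$ takes the values $-3, -1, 1, 3$, so $f$ decreases up to $s_3 = 4$ and increases afterwards.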

pgmank
  • 103
  • 4
Brian M. Scott
  • 588,383
  • 52
  • 703
  • 1,170
16

You want the median of $n$ numbers. Say $x$ is bigger than $12$ of them and smaller than $8$ of them (so $n=20$). If $x$ increases, it's getting closer to $8$ of the numbers and farther from $12$ of them, so the sum of the distances gets greater. And if $x$ decreases, it's getting closer to $12$ of them and farther from $8$ of them, so the sum of the distances gets smaller.

A similar thing happens if $x$ is smaller than more of the $n$ numbers than $x$ is bigger than.

But if $x$ is smaller than $10$ of them and bigger than $10$ of them, then when $x$ moves, it's getting farther from $10$ of them and closer to just as many of them, so the sum of the distances is not changing.

So the sum of the distances is smallest when the number of data points less than $x$ is the same as the number of data points bigger than $x$.

Lord_Farin
  • 17,225
  • 9
  • 47
  • 121
Michael Hardy
  • 1
  • 30
  • 276
  • 565
10

Starting with $$f(x)=\sum_{i=1}^n |s_i-x|$$

Assume we rearranged our terms such that $s_1<s_2<\cdots<s_n$

We first proceed by making the following observation $$\sum_{i=1}^n |s_i-x| = \sum_{i=2}^{n-1} |s_i-x| +(s_n -s_1) \quad \text{when} \quad x \in [s_1,s_n]$$

Now suppose that $n$ is odd. Applying the above identity repeatedly, for $x \in [s_{\frac{n-1}2}, s_{\frac{n+3}2}]$ we get $$f(x)=\sum_{i=1}^n |s_i-x|=|s_{\frac{n+1}2}-x|+(s_n -s_1)+(s_{n-1}-s_2)+\cdots+(s_{\frac{n+3}2}-s_{\frac{n-1}2})$$ or in other words $$f(x)=|s_{\frac{n+1}{2}}-x|+\text{constant}$$

This is just the absolute value function with its vertex at $(s_{\frac{n+1}{2}},\text{constant})$. The minimum of the absolute value function occurs at its vertex, therefore $s_{\frac{n+1}{2}}$ (the median) minimizes $f(x)$.

Now suppose $n$ is even, again by using our identity, we have $$f(x)=\sum_{i=1}^n |s_i-x|=|s_{\frac{n}2}-x|+|s_{\frac{n+2}2}-x| + \text{constant}$$

Where the minimum occurs at $f'(x)=0$(or when not defined), therefore by differentiating and setting $f'(x)$ to zero we get $$\dfrac{|s_{\frac{n}{2}}-x|}{s_{\frac{n}{2}}-x}+\dfrac{|s_{\frac{n+2}{2}}-x|}{s_{\frac{n+2}{2}}-x}=0$$

Observe that $s:=\dfrac{s_{\frac{n+2}{2}}+s_{\frac{n}{2}}}{2}$(median) satisfies the above equation, since $s$ is halfway between $s_{\frac{n}{2}}$ and $s_{\frac{n+2}{2}}$ $$s_{\frac{n}{2}}-s=-(s_{\frac{n+2}{2}}-s)$$ that is by setting $x=s$ we get $$\dfrac{|s_{\frac{n}{2}}-s|}{s_{\frac{n}{2}}-s}+\dfrac{|s_{\frac{n}{2}}-s|}{-(s_{\frac{n}{2}}-s)}=0$$

Therefore $s$ is a minimum.
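The peeling identity for odd $n$ can be verified on a small case. This is a minimal sketch with assumed names (`f`, `peel_constant`) and an assumed five-element sample. It checks that $f(x)$ equals $|s_{\frac{n+1}{2}}-x|$ plus a constant for $x$ between the two neighbours of the middle element.

```python
def f(points, x):
    """Sum of |s - x| over all points."""
    return sum(abs(s - x) for s in points)


def peel_constant(sorted_points):
    """(s_n - s_1) + (s_{n-1} - s_2) + ... over all peeled pairs."""
    n = len(sorted_points)
    return sum(sorted_points[n - 1 - i] - sorted_points[i] for i in range(n // 2))


points = [1, 3, 4, 8, 9]            # hypothetical sorted sample, n odd
median = points[len(points) // 2]   # the middle element, here 4
constant = peel_constant(points)    # (9 - 1) + (8 - 3) = 13

# Between s_2 = 3 and s_4 = 8 the sum collapses to |median - x| + constant.
for x in [3, 3.5, 4, 6, 8]:
    assert f(points, x) == abs(median - x) + constant
```

At $x = 4$ the absolute value term vanishes, so the minimum value of $f$ for this sample is the constant $13$.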

Michael Hardy
  • 1
  • 30
  • 276
  • 565
Omar Nagib
  • 1,198
  • 10
  • 22
8

Consider two real numbers $a<b$. Then the objective becomes

$$dist(a,b) = |x-a|+|x-b|$$

This expression is minimum when $a\leq x \leq b$. It can be proved by calculating the objective on 3 cases ($x<a, a\leq x\leq b, x>b$).

Now consider the general case where $S$ has $n$ elements. Sort them in increasing order as $S_1, S_2, \ldots, S_n$.

Pair the smallest and the largest numbers. As explained above, $dist(S_1,S_n)$ is minimum when $S_1\leq x\leq S_n$. Remove these two elements from the list and continue this procedure until there is at most one element left in the set.

If there is an element $S_i$ left, then $x=S_i$ minimizes $|x-S_i|$. It also lies between all the pairs.

In the case of an even number of elements, the list will eventually be empty. As in the case above, the median lies between all the pairs.
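The pairing procedure translates directly into a two-pointer loop. The following is a minimal sketch, assuming a non-empty list of numbers; the function name `minimizing_interval` is hypothetical. One small variation from the text: rather than peeling until the list is empty, it stops at the innermost pair (or the single middle element), since that is the interval in which $x$ must lie.

```python
def minimizing_interval(values):
    """Return (lo, hi) such that every x in [lo, hi] minimizes sum(|x - v|).

    Assumes `values` is non-empty.
    """
    ordered = sorted(values)
    left, right = 0, len(ordered) - 1
    # Peel off the smallest and largest remaining elements as a pair
    # while more than two elements remain.
    while right - left >= 2:
        left += 1
        right -= 1
    return ordered[left], ordered[right]


print(minimizing_interval([5, 1, 9]))     # (5, 5): odd size, a single median
print(minimizing_interval([4, 1, 3, 2]))  # (2, 3): even size, a whole interval
```

For an odd number of elements both ends coincide with the middle element; for an even number they are the two middle elements.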

foo
  • 81
  • 1
  • 3
5

Consider two $x_i$'s $x_1$ and $x_2$,

For $x_1\leq a\leq x_2$, $\sum_{i=1}^{2}|x_i-a|=|x_1-a|+|x_2-a|=a-x_1+x_2-a=x_2-x_1$

For $a\lt x_1$, $\sum_{i=1}^{2}|x_i-a|=x_1-a+x_2-a=x_1+x_2-2a\gt x_1+x_2-2x_1=x_2-x_1$

For $a\gt x_2$,$\sum_{i=1}^{2}|x_i-a|=-x_1+a-x_2+a=-x_1-x_2+2a\gt -x_1-x_2+2x_2=x_2-x_1$

$\implies$for any two $x_i$'s the sum of the absolute values of the deviations is minimum when $x_1\leq a\leq x_2$ or $a\in[x_1,x_2]$.

When $n$ is odd, $$ \sum_{i=1}^n|x_i-a|=|x_1-a|+|x_2-a|+\cdots+\left|x_{\tfrac{n-1}{2}}-a\right| + \left|x_{\tfrac{n+1}{2}}-a\right|+\left|x_{\tfrac{n+3}{2}}-a\right|+\cdots+|x_{n-1}-a|+|x_n-a| $$ Consider the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}], \ldots, \left[x_{\tfrac{n-1}{2}}, x_{\tfrac{n+3}{2}}\right]$. If $a$ is a member of all these intervals, i.e., $a\in\left[x_{\tfrac{n-1}{2}},x_{\tfrac{n+3}{2}}\right],$

then, using the above result for two points, we can say that all the terms in the sum except $\left|x_{\tfrac{n+1}{2}}-a\right|$ are minimized. So $$ \sum_{i=1}^n|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+\cdots + \left(x_{\tfrac{n+3}{2}}-x_{\tfrac{n-1}{2}}\right) + \left|x_{\tfrac{n+1}{2}}-a\right| = \left|x_{\tfrac{n+1}{2}}-a \right|+\text{constant} $$ Now since the derivative of the modulus function is the signum function, $f'(a)=\operatorname{sgn}\left(x_{\tfrac{n+1}{2}}-a\right)=0$ for $a=x_{\tfrac{n+1}{2}}=\text{Median}$

$\implies$ When $n$ is odd,the median minimizes the sum of absolute values of the deviations.

When $n$ is even, $$ \sum_{i=1}^n|x_i-a|=|x_1-a|+|x_2-a|+\cdots+|x_{\tfrac{n}{2}}-a|+|x_{\tfrac{n}{2}+1}-a|+\cdots+|x_{n-1}-a|+|x_n-a|\\ $$ If $a$ is a member of all the intervals $[x_1,x_n], [x_2,x_{n-1}], [x_3,x_{n-2}], \ldots, \left[x_{\tfrac{n}{2}},x_{\tfrac{n}{2}+1}\right]$, i.e, $a\in\left[x_{\tfrac{n}{2}},x_{\tfrac{n}{2}+1}\right]$,

$$ \sum_{i=1}^n|x_i-a|=(x_n-x_1)+(x_{n-1}-x_2)+(x_{n-2}-x_3)+\cdots + \left(x_{\tfrac{n}{2}+1}-x_{\tfrac{n}{2}}\right) $$

$\implies$ When $n$ is even, any number in the interval $[x_{\tfrac{n}{2}},x_{\tfrac{n}{2}+1}]$, i.e., including the median, minimizes the sum of absolute values of the deviations. For example, consider the series $2, 4, 5, 10$, with median $M=4.5$.

$$ \sum_{i=1}^4|x_i-M|=2.5+0.5+0.5+5.5=9 $$ If you take any other value in the interval $\left[x_{\tfrac{n}{2}},x_{\tfrac{n}{2} + 1} \right] =[4,5]$, say $4.1$ $$ \sum_{i=1}^4|x_i-4.1|=2.1+0.1+0.9+5.9=9 $$ For any value outside the interval $\left[x_{\tfrac{n}{2}},x_{\tfrac{n}{2}+1}\right]=[4,5]$, say $5.2$ $$ \sum_{i=1}^4|x_i-5.2|=3.2+1.2+0.2+4.8=9.4 $$
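The three sums in this example can be reproduced in a few lines. This is a minimal sketch; the helper name `abs_dev_sum` is an assumption, and exact fractions from the standard `fractions` module are used so the decimal inputs do not pick up floating-point rounding.

```python
from fractions import Fraction


def abs_dev_sum(data, a):
    """Sum of absolute deviations of the data from a."""
    return sum(abs(Fraction(x) - a) for x in data)


data = [2, 4, 5, 10]

# The median 4.5, another point inside [4, 5], and a point outside it.
for a in ("4.5", "4.1", "5.2"):
    print(a, float(abs_dev_sum(data, Fraction(a))))
```

The two points inside $[4,5]$ give the same total, while the point outside gives a larger one.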

Sooraj S
  • 6,813
  • 3
  • 34
  • 68
1

Suppose $S$ is finite (with cardinality $s$), without repetitions, and ordered. Then the sum of absolute values is continuous (a sum of continuous functions) and piecewise linear (hence differentiable except at the points of $S$), with left-most slope $-s$. By induction, the slope increases by 2 for each interval from left to right, with right-most slope $+s$. Hence the piecewise slope first reaches either $-1$ or $0$ at index $\left\lfloor \frac{s+1}{2}\right\rfloor$, and $0$ or $+1$ at index $\left\lceil \frac{s+1}{2}\right\rceil$.

Hence the function attains its minima on the interval between the elements of $S$ with indices $\left\lfloor \frac{s+1}{2}\right\rfloor$ and $\left\lceil \frac{s+1}{2}\right\rceil$, which reduces to a singleton when $s$ is odd.
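The slope pattern can be listed explicitly. This is a minimal sketch with hypothetical function names; it writes `n` for the cardinality (called $s$ above) to avoid clashing with the element names used elsewhere on this page.

```python
def piecewise_slopes(n):
    """Slopes of x -> sum(|s_i - x|) on the n + 1 intervals cut out by n distinct points.

    Interval number k (k = 0..n) has k points to its left and n - k
    to its right, so its slope is k - (n - k) = 2k - n.
    """
    return [2 * k - n for k in range(n + 1)]


def minimizer_indices(n):
    """1-based indices floor((n+1)/2) and ceil((n+1)/2) bounding the minimizers."""
    return (n + 1) // 2, (n + 2) // 2


print(piecewise_slopes(5), minimizer_indices(5))  # slopes jump from -1 to +1 at index 3
print(piecewise_slopes(4), minimizer_indices(4))  # slope is 0 between indices 2 and 3
```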

The notion of median for continuous functions is detailed in Sunny Garlang Noah, The Median of a Continuous Function, Real Anal. Exchange, 2007

Laurent Duval
  • 6,164
  • 1
  • 18
  • 47
1

Let $S=\{X_1,X_2,\ldots X_n\}$, w.l.o.g. assume $X_1\leq X_2 \leq \ldots \leq X_n$

$\implies D=\sum\limits_{k=1}^{n}|X_k-x|=|X_1-x|+|X_2-x|+\ldots +|X_{n-1}-x|+|X_n-x|$

Let $n \in \mathbb{Z}_{odd}^{+}$, i.e., $n=2m+1$ for some $m \in \mathbb{Z}^{+}$, for $m=0$ it's trivial.

Also, let's notice that $D=|X_{m+1}-x|+\sum\limits_{k=1}^{m}\left(|X_k-x| + |X_{n-k+1}-x|\right)$

Now, let's consider the sum $D_1=|X_1-x|+|X_n-x|$ and consider the following cases:

$(1) \quad x \leq X_1 \implies D_1=X_1+X_n-2x \geq X_n-X_1$

$(2) \quad x \geq X_n \implies D_1=2x-X_1-X_n \geq X_n-X_1$

$(3) \quad X_1 \leq x \leq X_n \implies D_1=X_n-X_1$

$\implies D_1$ is minimum when $X_1\leq x \leq X_n$

Similarly, $D_2=|X_2-x|+|X_{n-1}-x|$ is minimum when $X_2 \leq x \leq X_{n-1}$

$\ldots$

Likewise, $D_m=|X_m-x|+|X_{m+2}-x|$ is minimum when $X_m \leq x \leq X_{m+2}$

Finally, $D_{m+1}=|X_{m+1}-x|$ is minimum when $x=X_{m+1}$

Hence, $D$ is minimum when $\sum\limits_{i=1}^{m+1}D_i$ is minimum, when each $D_i$ has minimum possible value for $i=1,2,\ldots m+1$

$\implies x=X_{m+1}$, i.e., at the median value $D$ is minimum.

Similarly, we can show the same result for $n \in \mathbb{Z}_{even}^{+}$ too.
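A sketch of how the even case might go, using the same pairing (this is a reconstruction under the notation above, not part of the original answer). Let $n=2m$ with $m\geq 1$. There is no middle term, so

$$D=\sum\limits_{k=1}^{m}\left(|X_k-x| + |X_{n-k+1}-x|\right)=\sum\limits_{k=1}^{m}D_k$$

By the same three cases as before, each $D_k \geq X_{n-k+1}-X_k$, with equality exactly when $X_k \leq x \leq X_{n-k+1}$. These intervals are nested and the innermost one is $[X_m, X_{m+1}]$, so all $m$ equalities hold simultaneously exactly when $X_m \leq x \leq X_{m+1}$, and then

$$D_{\min}=\sum\limits_{k=1}^{m}\left(X_{n-k+1}-X_k\right)$$

$\implies$ every $x \in [X_m, X_{m+1}]$, in particular the midpoint $\frac{X_m+X_{m+1}}{2}$, minimizes $D$.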

Sandipan Dey
  • 1,943
  • 8
  • 11
0

To start, I will define the median of a set with even cardinality to be one of the two middle elements; for example, for $\{1,2,3,4\}$ the median is either 2 or 3. For a set with odd cardinality the median is the middle element.

Suppose that the set $S$ has $n$ elements, and $s_1<s_2<\dots<s_n$. We will start by showing that the median attains the minimum sum for sets of cardinality (size) 1 and 2, and then that any problem can be reduced to a set of cardinality 1 or 2.

For the set $\{s_1\}$, the median is $s_1$, and the sum is zero.
For the set $\{s_1,s_2\}$, the median is either $s_1$ or $s_2$, and the sum is $|s_1 - s_2|$ in both cases.
It is easy to see and prove, for sets of size 1, 2, or any other size, that if $x$ is not one of the set elements then the sum is at least as large as it is for a suitable element of the set.

So we have proved that the median works for sets of sizes 1 and 2. Now consider a sorted set of size 3, $\{s_1,s_2,s_3\}$. To get the minimum, it is easy to see and prove that we have to pick $x$ between $s_1$ and $s_3$. But for such an $x$, $|s_1 - x| + |s_3 - x|$ is always the same and equals $|s_1 - s_3|$, so the $x$ that achieves the minimum for $\{s_1,s_2,s_3\}$ is the same as for the set $\{s_2\}$, which is $s_2$, the median.

It is also not hard to show that the same logic applies to a set of four elements: the element that achieves the minimum for a set of 4 elements is the same element that achieves the minimum for the set of its 2 middle elements, which is the median. Using the same logic, we can reduce the problem from a set of 7 elements to a set of 5 elements, from a set of 6 elements to a set of 4 elements, and so on.

ehab
  • 101
  • 2