I'm supposed to calculate the MLEs for $a$ and $b$ from a random sample $(X_1,\ldots,X_n)$ drawn from a uniform distribution on $[a,b]$. But the likelihood function $\mathcal{L}(a,b)=\frac{1}{(b-a)^n}$ is constant, so how do I find a maximum? I'd appreciate tips on how to proceed!

Spine Feast

4 Answers


First, $ a\leq \min(X_1 , \ldots , X_n) $ and $ b\geq \max(X_1 , \ldots , X_n) $

Otherwise some observation $X_i$ would fall below $a$ or above $b$, which is impossible because the distribution is

$$ X_i \sim \operatorname{Unif}(a,b) $$

and the minimum value $ X_i $ can have is $ a $, and the maximum value $ X_i $ can have is $ b $.

The likelihood function is

$$ \mathcal{L}(a,b)= \prod_{i=1}^n f(x_i;a,b) = \prod_{i=1}^n \frac{1}{(b-a)} = \frac{1}{(b-a)^n} $$

Consider the log-likelihood function

$$ \log\mathcal{L}(a,b) = \log{\displaystyle \prod_{i=1}^{n} f(x_i;a,b)} = \displaystyle \log\prod_{i=1}^{n} \frac{1}{(b-a)} = \log{\big((b-a)^{-n}\big)} = -n \cdot \log{(b-a)} $$

Note that we are looking for the arguments $a$ and $b$ that maximize the likelihood (or, equivalently, the log-likelihood).

Now, to find $ \hat{a}_{MLE} $ and $ \hat{b}_{MLE} $, take the partial derivatives of the log-likelihood with respect to $ a $ and $ b $:

$$ \frac{\partial}{\partial a} \log\mathcal{L}(a,b) = \frac{n}{(b-a)} \\ \frac{\partial}{\partial b} \log \mathcal{L}(a,b) = -\frac{n}{(b-a)} $$

The derivative with respect to $ a $ is positive, so the log-likelihood is increasing in $ a $; we therefore take the largest $ a $ the constraint allows, which is $$ \hat{a}_{MLE}=\min(X_1 , \ldots , X_n) $$

The derivative with respect to $ b $ is negative, so the log-likelihood is decreasing in $ b $; we therefore take the smallest $ b $ the constraint allows, which is $$ \hat{b}_{MLE}=\max(X_1 , \ldots , X_n) $$
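As a quick numerical sanity check of these estimators, here is a small Python sketch (the true parameters, seed, and sample size are invented for illustration):

```python
import random

random.seed(0)  # reproducible illustration
a_true, b_true = 2.0, 5.0
sample = [random.uniform(a_true, b_true) for _ in range(1000)]

# The MLEs derived above are the sample extremes
a_hat = min(sample)
b_hat = max(sample)

# Both estimates necessarily lie inside the true interval,
# and they approach (a_true, b_true) as n grows
print(a_hat, b_hat)
```

Note that $\hat{a} \ge a$ and $\hat{b} \le b$ always hold, so both estimators are biased toward the interior of the interval, though the bias vanishes as $n \to \infty$.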

Michael Hardy
samee conductor
    Formatting tip: use \max,\min,\log they give proper spacing and such like this $\log,\min,\max$. – kingW3 Jul 02 '17 at 14:06
    Why take logarithms here? And why use derivatives? The function $(a,b)\mapsto \dfrac 1 {(b-a)^n}$ clearly increases as $a$ and $b$ get closer together; therefore the solution is simply to put them as close together as the constraints allow. The constraints are only that $a$ must not exceed the smallest observation and $b$ must not be less than the largest. I think this answer is more complicated than it needs to be. – Michael Hardy Oct 05 '18 at 18:41
    Not only complicated but using derivatives here is potentially misleading as the likelihood is not differentiable at $(a,b)=(\min x_i,\max x_i)$. – StubbornAtom May 23 '19 at 19:21

Think about it a bit. If $b$ is less than the maximum of the observations, then the likelihood is $0$. Similarly, if $a$ is greater than the minimum of the observations, then the likelihood is also $0$ (since you would have observations lying outside $[a,b]$, which has probability $0$). And if you make $b$ bigger than the max or $a$ smaller than the min, the denominator of the likelihood gets bigger (since the difference $b-a$ clearly gets bigger), so the likelihood is necessarily lower than it is at $b=\max_i X_i$ and $a = \min_i X_i$.
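This reasoning is easy to verify numerically; here is an illustrative Python sketch (the observations are made up):

```python
def likelihood(a, b, xs):
    """Uniform(a, b) likelihood: zero whenever any observation falls outside [a, b]."""
    if a >= b or a > min(xs) or b < max(xs):
        return 0.0
    return (b - a) ** (-len(xs))

xs = [2.3, 2.9, 3.1, 4.8]  # made-up observations

best = likelihood(min(xs), max(xs), xs)
# Pulling b below the max (or a above the min) zeroes the likelihood...
assert likelihood(min(xs), 4.0, xs) == 0.0
# ...while widening the interval beyond [min, max] strictly lowers it.
assert likelihood(2.0, 5.0, xs) < best
```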

  • well I think there is a typo here. should not $b$ be greater than the maximum observations for the likelihood to be 0? Similarly, $a$ should be less than minimum? If $b$ is less than maximum and $a$ is greater than minimum, they might still be in the range. – ARAT Oct 09 '19 at 09:34

The likelihood is simply the probability density of observing the data under the given parametric assumptions. Here the density is $f(x)=\frac{1}{b-a}$ on $[a,b]$, so $$\mathcal{L}(a,b;n)=\frac{\prod\limits_{i=1}^n \mathbf{1}_{[a,b]}(x_i)}{(b-a)^n}.$$ The key is the numerator; most people forget it and then wonder why we don't just set $a=b$. To maximize the likelihood, you need to minimize $(b-a)$ subject to all the data being contained in $[a,b]$. Thus, you want $a=\min x_i$ and $b=\max x_i$.
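To make the role of the numerator concrete, here is a short Python sketch with the indicator written out explicitly (the data values are invented):

```python
import math

def indicator(x, a, b):
    """1_[a,b](x): 1 if x lies in [a, b], else 0."""
    return 1.0 if a <= x <= b else 0.0

def likelihood(a, b, xs):
    # prod_i 1_[a,b](x_i) / (b - a)^n, as in the formula above
    numerator = math.prod(indicator(x, a, b) for x in xs)
    return numerator / (b - a) ** len(xs)

xs = [0.2, 0.5, 0.7]  # made-up data
# An interval that misses even one point has likelihood 0, which is
# why shrinking (b - a) below the data range never helps.
assert likelihood(0.3, 0.7, xs) == 0.0
assert likelihood(min(xs), max(xs), xs) > likelihood(0.0, 1.0, xs)
```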


Hint: Look at the endpoints of your interval for a maximum. For a uniform distribution on $[a,b]$, the density is nonzero only for $a\le x\le b$. Can you take it from here?

Also look here: maximum estimator method more known as MLE of a uniform distribution

The only difference from the linked question is that you are asked to find two MLEs, one for the left endpoint and one for the right endpoint of the interval.

  • Hmm - an estimate is supposed to be a function of my data, (X_1, ..., X_n) - so I could estimate $\hat{a} = \min_{1\le i \le n} X_i$ and $\hat{b}=\max_{1\le i \le n} X_i$ - is this correct? – Spine Feast Jun 04 '13 at 17:07
  • Ah, just looked at the link and it seems to be the same thing, although I'm not familiar with the notion of 'order statistics'. – Spine Feast Jun 04 '13 at 17:07
  • The $k$th order statistic (http://en.wikipedia.org/wiki/Order_statistic) is just the $k$th-smallest value of the $X_i$'s (from your sample). So, for instance, the first order statistic is the smallest value of the $X_i$'s. Follow the same approach as in the link that I provided: first form the likelihood function and then differentiate it with respect to $a$. See if the function is decreasing or increasing in $a$. Then decide whether you hence need the maximum or the minimum of the $X_i$'s (the largest or smallest order statistic). Then do the same for $b$. – dreamer Jun 04 '13 at 17:13
  • Your answer is not yet correct, but you are getting close. If you follow the steps correctly it should be easy to figure out the correct answer. If you need an explicit solution, let me know. – dreamer Jun 04 '13 at 17:15
  • I'm confused now - is likelihood function wrong? I don't understand the thing with order statistics, where the bounds on the likelihood function go. I looked here : http://ocw.mit.edu/courses/mathematics/18-443-statistics-for-applications-fall-2006/lecture-notes/lecture2.pdf and it seems they do take the maximum as their estimator (p. 14) but their likelihood function includes conditions ($L=0 \mbox{ if } \theta \le \max(...)$ and $L=\theta ^{-n} \mbox{ if } \theta \ge \max(...)$) - which I don't know where they came from... – Spine Feast Jun 05 '13 at 08:28
  • And about the bounds with order statistics: if you know that $a\leq x_i \leq b$ holds for all the $X_i$'s, you can reformulate this as saying that $x_{n:n}\leq b$ and $x_{1:n}\geq a$. (You can see that this is true in the following way: if the maximum is smaller than a certain value, what does this imply for the other $X_i$'s? The same reasoning applies to the minimum: if the minimum is greater than a certain value, what does that imply for the other $X_i$'s?) – dreamer Jun 05 '13 at 15:06
  • I don't see how this makes sense intuitively. I've picked $n$ points from some interval and I estimate the left endpoint, $a$, by the LARGEST of the points? – Spine Feast Jun 05 '13 at 15:39
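The order-statistic notation used in these comments can be illustrated with a short Python snippet (the sample values here are made up):

```python
sample = [3.7, 2.1, 4.9, 2.8, 4.2]  # made-up draws from some Unif(a, b)
ordered = sorted(sample)

# The k-th order statistic x_{k:n} is the k-th smallest value
x_1n = ordered[0]    # first order statistic = min of the sample
x_nn = ordered[-1]   # n-th order statistic  = max of the sample

# These extremes are exactly the MLEs discussed in the answers above
assert x_1n == min(sample) and x_nn == max(sample)
```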