Consider the following experiment. I roll a die repeatedly until it shows a 6, and I let the random variable $X$ be the number of times a 3 appeared along the way. What is $E[X]$?

Thoughts: I expect to roll the die 6 times before 6 appears (this part is geometric), and on the preceding 5 rolls each roll has a $1/5$ chance of returning a 3. Treating this as binomial, I therefore expect to count 3 once, so $E[X]=1$.

Problem: Don't know how to model this problem mathematically. Hints would be appreciated.

  • 13
    You got a 6 at throw number _n_; this isn't a 3. And on the $n-1$ previous throws, you didn't get a 6 so the probability of a 3 on each of these throws is $1/5$. – Bernard Masse Sep 06 '15 at 18:55
  • 25
    Anytime you roll something other than a 3 or a 6, you can pretend it never happened. Thus, you can treat it like a coin-flip, and then you're asking "If I flip a coin until I get heads, what's the expected number of tails?" which is one less than [the expected number of flips to get a head](http://math.stackexchange.com/questions/1196452). – BlueRaja - Danny Pflughoeft Sep 06 '15 at 21:41
  • 6
    Until now I didn't know that dice is plural form. Thanks. – Kyslik Sep 06 '15 at 22:21
  • 3
    Why do you expect to roll the die six times before a six comes up? – Bob Jarvis - Слава Україні Sep 08 '15 at 02:41
  • Maybe the way to examine this is to consider how we would programmatically roll a die, then basically replay that function n times, where n is the roll that gives our 6, then count the number of 3's in the resulting array of results ... forgive me, I don't speak math (hence why this is not an answer), but programmatically you might say something like ... x = f(n, 3) { for(i = 0; i r == 3).count() – War Sep 09 '15 at 09:06
  • It seems paradoxical, to expect to have thrown one 3 before your first 6, and by the same logic also to expect to have thrown one 6 before your first 3. – Neil W Sep 09 '15 at 14:43
  • 1
    Here is how to resolve the paradox, @NeilW. When 3 appears before 6, we expect 3 to appear *twice* (once required, and once randomly). When 3 appears after 6, the expectation is obviously 0. Average them together to get 1. – pre-kidney Sep 09 '15 at 20:12

14 Answers


We can restrict ourselves to dice throws with outcomes $3$ and $6$. Among these throws, both outcomes are equally likely. This means that the index $Y$ of the first $6$ is geometrically distributed with parameter $\frac12$, hence $\mathbb{E}(Y)=2$. The number of $3$s occurring before the first $6$ equals $Y-1$ and has expected value $1$.
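
As a sanity check on the claim that the other faces can be ignored, here is a minimal Monte Carlo sketch. The function names, the `sides` parameter, and the trial count are illustrative assumptions, not part of the answer. If the restriction argument is right, the estimate should be close to $1$ for a 6-sided die and for a 13-sided die alike.

```python
import random

def count_threes_before_six(rng, sides=6, target=3, stop=6):
    """Roll a fair `sides`-sided die until `stop` shows; return how often `target` showed."""
    count = 0
    while True:
        roll = rng.randint(1, sides)
        if roll == stop:
            return count
        if roll == target:
            count += 1

def estimate_mean(trials=200_000, sides=6, seed=0):
    """Average the count over many independent runs of the experiment."""
    rng = random.Random(seed)
    total = sum(count_threes_before_six(rng, sides) for _ in range(trials))
    return total / trials

if __name__ == "__main__":
    for sides in (6, 13):
        # Both estimates should land near 1, up to sampling noise.
        print(sides, round(estimate_mean(sides=sides), 3))
```

The number of sides only changes how many irrelevant rolls are interleaved, not the relative odds of seeing a 3 versus a 6.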

  • 11
    I think this is a very clean approach: this is similar to the question about the order of the different aces in a deck of cards, where all the non-aces can be removed without changing anything about the distribution. – Ian Sep 06 '15 at 19:04
  • 1
    Is there no multiplier involved here? I thought the expected number of 3s would be a fifth of the expected time until first six, minus 1. A thought experiment of "time to first 1 in a million event" - the expected value is not 2! – Alec Teal Sep 06 '15 at 22:50
  • 1
    So for a $\tfrac{1}{n}$ geometric distribution, expectation is $n$, less one is $n-1$ and $\frac{n-1}{n-1}$ is $1$ - ignore my previous comment, leaving it there in case anyone else wonders. – Alec Teal Sep 06 '15 at 22:53
  • 4
    If this was a university test question, I wouldn't accept "we can restrict ourselves to dice throws with outcomes 3 and 6" without proof [unless we'd covered the proof during classes]. It's not obvious why the result should be the same for a 13-sided die as for a 6-sided die. – Zero Sep 09 '15 at 05:24
  • 3
    Another version of this problem: every family wants to have a daughter. They continue having children until they have their first daughter and stop. How many sons does the average family have? – Joel Sep 09 '15 at 09:12

There are infinitely many ways to solve this problem; here is another solution I like.

Let $A = \{\text{first roll is }6\}$, $B = \{\text{first roll is }3\}$, $C = \{\text{first roll is neither }3\text{ nor }6\}$. Then $$ E[X] = E[X|A]P(A) + E[X|B] P(B) + E[X|C] P(C) = 0 + (E[X] + 1) \frac16 + E[X]\frac46, $$ whence $E[X] = 1$.
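
The first-step equation above is linear in $E[X]$, so it can be solved exactly. A small sketch with exact fractions follows; the function name and its two parameters are hypothetical, chosen only for this illustration.

```python
from fractions import Fraction

def expected_threes(p_stop, p_hit):
    """Solve x = p_stop*0 + p_hit*(x + 1) + p_other*x for x, using exact arithmetic."""
    p_other = 1 - p_stop - p_hit
    # Rearranged: x * (1 - p_hit - p_other) = p_hit
    return p_hit / (1 - p_hit - p_other)

x = expected_threes(Fraction(1, 6), Fraction(1, 6))
print(x)  # 1
```

Since $1 - P(B) - P(C) = P(A)$, the solution is simply $P(B)/P(A)$, which makes it visible why only the ratio of the two probabilities matters.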

  • 4
    +1. A closely related "recursion" trick is very useful when dealing with Markov chains in discrete space and discrete time. So it is worth paying attention to it. – Ian Sep 06 '15 at 19:07
  • 1
    I'm trying to understand this solution, but I don't follow why $E[X|B]=(E[X]+1)$. Is it because rolling one 3 supplies an "extra" 1, and has no impact on how many 3s are rolled thereafter? – nettle Sep 06 '15 at 19:38
  • 2
    @user5294782, if the first roll is 3, then $$\text{the number of threes} = 1 + \text{number of threes in the following rolls until we get }6.$$ The latter quantity has the same distribution as $X$. For a similar reason, $E[X|C] = E[X]$. – zhoraster Sep 06 '15 at 19:40
  • 2
    @zhoraster I understand now what Ian means by "recursion trick". A very slick solution. – nettle Sep 06 '15 at 19:43
  • I'm also trying to understand this solution. Why is $E[X|A] = 0$? (I assume $P(A) = \frac{1}{6}$). Following the logic of the other parts, I don't see why the following rolls all cannot be 3 just because the first roll isn't 3. – Lawrence Sep 08 '15 at 21:18
  • 3
    @Lawrence $A$ is the situation where 6 is the result of the first roll. Dice rolling stops after the first 6, so zero threes were rolled. Thus, the expectation of the number of threes given that a 6 was rolled first is zero. – Mark H Sep 08 '15 at 21:55
  • Recursive random variables -- neat idea, haven't encountered this before (outside of average-case analysis of recursive algorithms). – Raphael Sep 09 '15 at 09:47
  • @MarkH Thanks, I overlooked that in the question. – Lawrence Sep 09 '15 at 12:25
  • 1
    Since this question seems to be rather popular, I'll elaborate on my previous comment. In a Markov chain, one can often condition on the first step of the chain to derive a recurrence relation for a certain quantity. For example, consider the random walk on $\mathbb{Z}$. Here we move right with probability $p$ and left with probability $1-p$, and we start at $0$. Consider the expected time to hit $a$ or $b$, where $a<0$ and $b>0$. To compute this we can start the process instead at some $x$ and condition on the first step. – Ian Sep 09 '15 at 13:44
  • (Cont.) Calling the expected time starting at $x$ $u(x)$, we have $u(x)=p u(x+1) + (1-p) u(x-1)$. Then $u(a)=u(b)=0$, so we can solve the recurrence, and then solve the original problem by plugging in zero. This is similar to the argument in the answer above, except that the first step of the process *does* change the situation, and so instead of getting a representation of $u(x)$ in terms of $u(x)$, we get a representation in terms of $u(x+1)$ and $u(x-1)$. – Ian Sep 09 '15 at 13:46

Set $X_i$ to be the indicator random variable that you count a 3 on roll $i$, which in particular implies that you didn't stop before roll $i$. To do this, you need a non-six on $i-1$ rolls, and a three on the last roll. Hence $$E(X_i)=\left(\frac{5}{6}\right)^{i-1}\left(\frac{1}{6}\right)$$ $\sum X_i$ is the total number of threes you roll. By the linearity of expectation, $$E\left(\sum X_i\right)=\sum E(X_i)=\frac{1}{6}\sum_{i\ge 1} \left(\frac{5}{6}\right)^{i-1}=\frac{1}{6}\sum_{j\ge 0} \left(\frac{5}{6}\right)^{j}=\frac{1}{6}\frac{1}{1-\frac{5}{6}}=1$$
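
To see the geometric series converge, one can add up the first few values of $E(X_i)$ exactly. This is only a sketch; `partial_sum` is a name made up for the illustration.

```python
from fractions import Fraction

def partial_sum(n_terms):
    """Sum of E(X_i) = (5/6)^(i-1) * (1/6) for i = 1..n_terms, as an exact fraction."""
    return sum(Fraction(5, 6) ** (i - 1) * Fraction(1, 6) for i in range(1, n_terms + 1))

for n in (1, 5, 20, 100):
    # Each partial sum equals 1 - (5/6)^n, so the values increase towards 1.
    print(n, float(partial_sum(n)))
```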


It turns out to be simple. Roll the die until you get a first $3$ or $6$. Half the time this is a $6$, and you stop with a value of $0$. Otherwise it is a $3$ and you repeat, so the expected number of threes given that a $3$ came first is $1+E(X)$. So the expected value of $X$ is:

$$E(X)=\frac{1}{2}\cdot 0 + \frac{1}{2}(1+E(X))= \frac{1}{2} + \frac{1}{2}E(X)$$

So $E(X)=1$.

Thomas Andrews
  • 4
    Alternatively, you could just say that $E(X)=\frac{1}{6}(1+E(X))+\frac{2}{3}E(X)$ to reach the same result. However, it should be noted that for calculation like this to be valid, you technically need to establish that $E(X)$ exists. Anyway, this is my favourite method of solving problems like this. – tomasz Sep 07 '15 at 06:57
  • The partition on having rolled a $3$ before a $6$ or vice versa can also be used to calculate $\mathbb{P}[X=n]$ recursively. The base case is $\mathbb{P}[X=0]=\mathbb{P}[\text{A 6 was rolled before a 3}]=\frac{1}{2}$. Then for $n\in\mathbb{Z}_{\geq1}$ we have \begin{align} \mathbb{P}[X=n]&=\mathbb{P}[X=n\text{ and a 3 was rolled before a 6}]\\ &=\mathbb{P}[\text{A 3 was rolled before a 6}]\cdot\mathbb{P}[X=n\,|\,\text{A 3 was rolled before a 6}]\\ &=\frac{1}{2}\mathbb{P}[X=n-1]\\ &=\cdots\\ &=\left(\frac{1}{2}\right)^n\mathbb{P}[X=0]\\ &=\left(\frac{1}{2}\right)^{n+1} \end{align} – Guest Nov 18 '19 at 00:47
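
The distribution $\mathbb{P}[X=n]=\left(\frac12\right)^{n+1}$ from the last comment can be compared against simulated frequencies. The sketch below is a rough check only; the helper names, seed, and trial count are assumptions.

```python
import random
from collections import Counter

def sample_x(rng):
    """One run of the experiment: number of 3s seen before the first 6."""
    count = 0
    while True:
        roll = rng.randint(1, 6)
        if roll == 6:
            return count
        if roll == 3:
            count += 1

def empirical_pmf(trials=200_000, seed=0):
    """Relative frequency of each observed value of X."""
    rng = random.Random(seed)
    counts = Counter(sample_x(rng) for _ in range(trials))
    return {n: counts[n] / trials for n in sorted(counts)}

if __name__ == "__main__":
    pmf = empirical_pmf()
    for n in range(5):
        # Compare the observed frequency with the claimed value (1/2)^(n+1).
        print(n, round(pmf.get(n, 0.0), 4), 0.5 ** (n + 1))
```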

Let $Y$ be the number of times the die is rolled. Then $Y$ follows a geometric distribution with parameter $\frac16$.

We have $\mathbb{E}(Y) = 1 + 5 \mathbb{E}(X)$ since for the first $Y-1$ rolls each of the outcomes $1$, $2$, $3$, $4$, $5$ has equal probability.

It follows (using that a geometric distribution with parameter $p$ has mean $\frac{1}{p}$) that $$ 1 + 5 \mathbb{E}(X) = \mathbb{E}(Y) = 6 $$ so $\mathbb{E}(X) = 1$.
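
The symmetry used here, that each of the faces $1$ to $5$ is expected to appear equally often before the first $6$, can be checked with a short simulation. This is a sketch under assumed names (`run_once`, `averages`) and an arbitrary seed.

```python
import random

def run_once(rng):
    """Roll until the first 6; return (number of rolls, tally of faces 1..5)."""
    tally = [0] * 5
    rolls = 0
    while True:
        rolls += 1
        face = rng.randint(1, 6)
        if face == 6:
            return rolls, tally
        tally[face - 1] += 1

def averages(trials=200_000, seed=0):
    """Average number of rolls and average count of each face 1..5 per run."""
    rng = random.Random(seed)
    total_rolls = 0
    totals = [0] * 5
    for _ in range(trials):
        rolls, tally = run_once(rng)
        total_rolls += rolls
        for k in range(5):
            totals[k] += tally[k]
    return total_rolls / trials, [t / trials for t in totals]

if __name__ == "__main__":
    mean_rolls, face_means = averages()
    # mean_rolls should be near 6 and every entry of face_means near 1.
    print(round(mean_rolls, 3), [round(m, 3) for m in face_means])
```

In every single run the number of rolls is exactly one more than the five tallies added together, which is the sample version of $\mathbb{E}(Y) = 1 + 5\,\mathbb{E}(X)$.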

  • 1
    Is there a Wald's identity or something like this hidden in this argument? – Ian Sep 06 '15 at 19:00
  • This solution is my favourite among the bunch. My (non-rigorous) idea was that you get each possible number at a 'rate' (a continuous one, of sorts) of 1/6 per roll, so that by the time you rolled enough to accumulate a '6 expectation' of 1, your expectation for 3 and for every other number has also risen to 1 (this also implying that, e.g., six rolls are required in expectation to reach this point) -- but I couldn't figure out the right words to formalise it. Now I'm happy this intuition need not remain on the cutting room floor. – Vandermonde Sep 07 '15 at 02:46
  • @Ian, see my solution. – pre-kidney Sep 09 '15 at 20:08

You can apply the total expectation formula. This allows you to simplify the process of computing an expectation by summing over conditional expectations. There are lots of ways to choose how to do the conditioning; here is just one of them. Let $T$ be the number of $3$s counted in the whole process and let $N$ be the number of rolls in the process. Then

$$E[T]=\sum_{n=1}^\infty E[T\mid N=n] P[N=n].$$

The distribution of $N$ is geometric with parameter $1/6$. Given $N=n$, the last roll is the $6$ and each of the first $n-1$ rolls is uniformly distributed on $\{1,\dots,5\}$, so $E[T \mid N=n] = (n-1)/5$. So you're left to calculate

$$\sum_{n=1}^\infty \frac{n-1}{5} \left ( \frac{5}{6} \right )^{n-1} \frac{1}{6}.$$

Moving constants around, this is the same as

$$\frac{1}{30} \sum_{n=1}^\infty (n-1) \left ( \frac{5}{6} \right )^{n-1}.$$

This sum is pretty well-known; there is a way to approach it based on interchanging order of summation, and another way to approach it based on differentiating the geometric series. Either way, $\sum_{k\ge 0} k a^k = \frac{a}{(1-a)^2}$, which equals $30$ for $a=5/6$, so the final answer is $1$.


Imagine repeating the experiment a huge number of times, to obtain an estimate for the expected value. That estimate is simply the total number of $3$'s rolled divided by the total number of $6$'s rolled. In the limit the estimate equals the expected value. But in the limit, the ratio of $3$'s to $6$'s is obviously $1$.
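
The ratio argument lends itself to a direct check: roll one long sequence and compare the totals. A minimal sketch follows, with the roll count and seed as arbitrary assumptions. The few 3s after the last 6 belong to an unfinished experiment, which is negligible in a long run.

```python
import random

def ratio_threes_to_sixes(n_rolls=1_000_000, seed=0):
    """Roll a fair die n_rolls times and return (number of 3s) / (number of 6s)."""
    rng = random.Random(seed)
    threes = sixes = 0
    for _ in range(n_rolls):
        face = rng.randint(1, 6)
        if face == 3:
            threes += 1
        elif face == 6:
            sixes += 1
    return threes / sixes

if __name__ == "__main__":
    # Every 6 closes one repetition of the experiment, so this ratio
    # estimates the average number of 3s per repetition.
    print(round(ratio_threes_to_sixes(), 4))
```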

Barry Cipra

Let X be the number of throws until you get the first 6. It is easy to calculate that E(X)=6. Thus, if Y denotes the number of throws before the first 6, we have E(Y)=5. Therefore, for W denoting the number of 3's before the first 6, we have E(W)=1, since on average every fifth of these five throws is a 3.

Joel Adler

Let's denote the expected number of 3's with $x$.

For the first throw, there are three possibilities:

  1. With probability $1/6$, the first throw will be a 6. In that case, you're finished, and the number of 3's is exactly zero.
  2. With probability $1/6$, the first throw will be a 3. In that case, you'll get that 3 and any 3 that you'll throw on subsequent throws. But after that first throw, you're in the same situation as in the beginning, so the expected number of further 3's is $x$. Together with the one 3 you've just got, that's $x+1$ 3's.
  3. With probability $4/6$, the first throw will be neither 3 nor 6. Again, afterwards you're in the same situation as before, so the expected number of $3$s in that case is $x$.

So together we get: $$x = \frac16\cdot 0 + \frac16\cdot(x+1) + \frac46\cdot x$$ It is not hard to solve that equation for $x$, to get $$x = 1$$ Therefore the expected number of 3's is $1$.

  • Is this different from my answer above? More words, I know :) – zhoraster Sep 09 '15 at 14:43
  • @zhoraster: The difference is that you don't need to intensely study it to recognize what it says. ;-) When I wrote my answer, I hadn't recognized yours to say the same. Indeed, on first glance it just looks like a big mess of symbols. Already a few explicit multiplication signs (or even just `\,` for some extra spacing) would have greatly increased the readability of your answer. – celtschk Sep 12 '15 at 08:43
  • There is always the "Edit" button you may press :) But, speaking seriously, you are right, I don't like excessively wordy arguments. A reader should be challenged, otherwise (s)he will get bored. – zhoraster Sep 12 '15 at 08:49
  • Well, that's probably the difference. I don't like the excessively sparse arguments. I think an answer is there to inform the reader, not to challenge the reader. – celtschk Sep 12 '15 at 08:59
  • Are you a teacher? – zhoraster Sep 12 '15 at 09:08
  • @zhoraster: No. Why? – celtschk Sep 12 '15 at 09:28
  • Because this may explain the difference. I *am* a teacher, and my function is to educate, not to *inform* :) I am not saying that this is good or bad, it is just as it is. – zhoraster Sep 12 '15 at 09:41

Counting polynomials are fun.

A roll generates:

$$r = \frac{1}{6} x + \frac{1}{6} y + \frac{4}{6} r$$ or $$r = \frac{1}{2} x + \frac{1}{2} y$$

where x is the number of 3s and y is the number of 6s.

$$1 + r + r^2 + r^3 + ...$$ $$=\frac{1}{1-r}$$

We then take the derivative with respect to $x$ to get its count-weighted value, then evaluate at $x=1$:

$$\frac{d}{dx}\frac{1}{1-(\frac{x}{2}+\frac{y}{2})} = \frac{2 (y'(x)+1)}{(x+y-2)^2}$$

Now we want to evaluate at $y=0$ (as any case with a non-zero number of 6s should be discarded):

For $y'(x)$:

$$y = 2r-x$$ $$y'(x) = r'(x) - 1$$ $$r'(x) = 1/2$$ $$y'(x) = 1/2-1 = -1/2$$

So we get:

$$\frac{2 (-\frac{1}{2}+1)}{(1+0-2)^2}$$


Or, the number of 3s is 1 on average.

This strategy can be generalized for more complex situations.


Define the random variables $X_i={\bf 1}(\text{roll $i$ is a $3$})$ and the stopping time $T=\{\text{index of first $6$}\}$. Observe that both $X_i$ and $T$ have finite expectation and that the $\{X_i\}$ are independent and identically distributed. Thus by Wald's identity, $$ {\bf E}(X_1+\cdots+X_T)={\bf E}X_1\cdot {\bf E}T. $$ Clearly ${\bf E}X_1=1/6$ and ${\bf E}T=6$. Thus the expected number of threes is $1$.


Let's break the problem down into two parts:

  1. If it takes $n$ rolls to get a $6$, how many times is $3$ likely to come up?
  2. How likely is it that it will take $n$ rolls to get a $6$?

For the first question, if a $6$ comes up for the first time on the $n$th roll, then one of the other $5$ numbers came up on each of the $n-1$ previous rolls. On each of those rolls the probability of getting a $3$ is $1/5$, so we would expect that $3$ will occur $\frac{n-1}{5}$ times.

For the second question, the probability that it takes $n$ rolls to get to the first occurrence of a $6$ is given by $(5/6)^{n-1} \cdot (1/6)$.

Now to answer the question: You can think of the expected number of $3$s as the sum of the expressions that answer the first question above, weighted by the probabilities that those situations occur. That is, we need to calculate

$$\frac{0}{5} \cdot (5/6)^0 \cdot (1/6) + \frac{1}{5} \cdot (5/6)^1 \cdot (1/6) + \frac{2}{5} \cdot (5/6)^2 \cdot (1/6) + \frac{3}{5} \cdot (5/6)^3 \cdot (1/6) +\dots $$

We can factor this a bit:

$$\frac{1}{30} \left( \frac{5}{6} + 2\left( \frac{5}{6} \right)^2 +3\left( \frac{5}{6} \right)^3 + 4\left( \frac{5}{6} \right)^4 + \cdots\right) $$

The sum in the parentheses is of the form $\sum_{n=0}^{\infty}na^n $, which converges to $\frac{a}{(1-a)^2}$; in this case we have $a=5/6$ so the sum in parentheses is $\frac{5/6}{(1/6)^2} = 30$, and therefore the whole expression is just $1$.
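
The weighted sum can also be checked numerically by adding up its first terms. In this sketch, `weighted_sum` is a hypothetical helper and floating-point arithmetic is used for brevity.

```python
def weighted_sum(n_terms):
    """Partial sum of ((n-1)/5) * (5/6)**(n-1) * (1/6) for n = 1..n_terms."""
    return sum((n - 1) / 5 * (5 / 6) ** (n - 1) * (1 / 6) for n in range(1, n_terms + 1))

for n_terms in (10, 50, 200):
    # The partial sums increase towards the limit as more terms are included.
    print(n_terms, weighted_sum(n_terms))
```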

In fact the naive intuition (it should take about $6$ rolls to get to a $6$, and on each of the previous $5$ rolls there is a $1$ in $5$ chance of getting a $3$, so it should happen once) leads you to the correct answer, but I am not sure if that is just good fortune or if it reflects some deeper truth. I suspect the latter.


There's a lot of answers here, but I thought I'd throw in an answer for both one and two dice. First let's consider only 3s and 6s. All other throws don't matter.

For one die: p(3|3 or 6) = 1/2 and p(6|3 or 6) = 1/2.

For two dice, there are exactly two ways to throw a 3 and exactly five ways to throw a 6: p(3|3 or 6) = 2/7 and p(6|3 or 6) = 5/7.

The probability of rolling at least n 3s before a 6 is p(3|3 or 6)^n.

Each additional 3 adds 1 to the expected count, leading to two infinite sums:

For one die: E[3 before 6] = 1*(1/2)^1 + 1*(1/2)^2 + 1*(1/2)^3 + ... = 1

For two dice: E[3 before 6] = 1*(2/7)^1 + 1*(2/7)^2 + 1*(2/7)^3 + ... = 2/5 = 0.4
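
Both values can be reproduced exactly by counting the ways to throw each total and summing the geometric series in closed form, $p/(1-p)$. This is a sketch only; the function name and its parameters are assumptions introduced here.

```python
from fractions import Fraction
from itertools import product

def expected_hits_before_stop(num_dice, hit_sum, stop_sum):
    """Expected number of throws totalling hit_sum before the first throw totalling stop_sum."""
    ways_hit = ways_stop = 0
    for faces in product(range(1, 7), repeat=num_dice):
        total = sum(faces)
        if total == hit_sum:
            ways_hit += 1
        elif total == stop_sum:
            ways_stop += 1
    # Probability of the hit total, given that the throw is one of the two totals.
    p = Fraction(ways_hit, ways_hit + ways_stop)
    # p + p^2 + p^3 + ... = p / (1 - p)
    return p / (1 - p)

print(expected_hits_before_stop(1, 3, 6))  # 1
print(expected_hits_before_stop(2, 3, 6))  # 2/5
```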


Assume a "fair" 6 sided die.

For N rolls, with a 6 on the Nth, by inspection

         **0 <= 3's thrown <= (N-1)** 

i.e. all throws before the 6 could give one of [1 2 4 5], so no 3s,
or all throws before the 6 could produce 3's,
so (N-1) 3s are possible.

On any throw that does not produce a 6, the odds of throwing a 3 are 1 in 5 so you get "1/5th" of a 3 per throw, so the average number of 3's for N throws is [(N-1)/5] even though the actual number will be an integer on every occasion.

Russell McMahon