I was running a simulation modeled on one of those games where people try to guess a number between 1 and 100, with 100 people guessing. I then averaged how many different guesses there were.

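from random import randint

def averager(times):
    """Average number of distinct guesses over `times` simulated games."""
    tests = list()
    for _ in range(times):
        # randint is inclusive at both ends, so this draws from 0..100 (101 values)
        l = [randint(0,100) for _ in range(100)]
        tests.append(len(set(l)))  # record how many different guesses were made
    return sum(tests)/len(tests)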


For some reason, the number of different guesses averages out to 63.6.

Why is this? Is it due to a flaw in the Python random library?

In a scenario where people were guessing a number between 1 and 10:

The first person has a 100% chance to guess a previously unguessed number

The second person has a 90% chance to guess a previously unguessed number

The third person has an 80% chance to guess a previously unguessed number

and so on...

The average chance of guessing a new number (by my reasoning) is 55%. But the data doesn't reflect this.

David Greydanus
  • Shades of (1-1/e)? Note that $1 - 1/e \approx 0.63212$. Can you see why it has to be this? I thought I had the answer but not sure anymore. Got to look at it some more – user44197 Jan 14 '14 at 03:33
  • If you want more info about this problem, there's a classical one involving the same maths. It's the odds of having two people (or more) with the same birthday in a classroom. – Feu Jan 14 '14 at 03:42
  • @Feu an important one in cryptography too :) – Cruncher Jan 14 '14 at 14:16
  • A [related question I asked before](http://math.stackexchange.com/questions/554989/probabilities-for-1-in-n-events-over-n-trials). Turns out $e^{-1}$ is a notable probability. Here's an [older related question](http://math.stackexchange.com/questions/6140/help-with-a-specific-limit-left-dfracn-1n-rightn-as-n-rightarrow). – badroit Jan 14 '14 at 14:57
  • From the programming point of view, don't forget that Python's basic random functions are not true random, or even cryptographic random. It is a seeded PRNG. If you want better random numbers, try a cryptographic number generator source such as /dev/random (full hardware generated randomness, runs out easily). – Linuxios Jan 14 '14 at 16:40
  • While it is true that the Python random library is not truly random, that isn't the issue here, as it is convincing enough pseudo-randomness for the sake of this problem. – David Greydanus Jan 14 '14 at 17:16
  • Check out the derangement problem at https://en.wikipedia.org/wiki/Derangement . – Ethan Bolker Dec 22 '15 at 17:46

3 Answers


Suppose that $n$ guesses are made. For $i=1$ to $100$, let $X_i=1$ if $i$ is not guessed, and let $X_i=0$ otherwise. If $$Y=X_1+X_2+\cdots +X_{100},$$ then $Y$ is the number of numbers not guessed.

By the linearity of expectation, we have $$E(Y)=E(X_1)+E(X_2)+\cdots+E(X_{100}).$$ The probability that $i$ is not chosen in a particular trial is $\frac{99}{100}$, and therefore the probability it is not chosen $n$ times in a row is $\left(\frac{99}{100}\right)^n$. Thus $$E(Y)=100 \left(\frac{99}{100}\right)^n.$$

In particular, let $n=100$. Note that $\left(1-\frac{1}{100}\right)^{100}\approx \frac{1}{e}$, so the expected number not guessed is approximately $\frac{100}{e}$. Thus the expected number guessed is approximately $63.2$, a result very much in line with your simulation.

In general, if $N$ people choose independently and uniformly from a set of $N$ numbers, then the expected number of distinct numbers not chosen is $$N\left(1-\frac{1}{N}\right)^N.$$ Unless $N$ is very tiny, this is approximately $\frac{N}{e}$, and therefore the expected number of distinct numbers chosen is approximately $N-\frac{N}{e}$. Note that the expected proportion of the numbers chosen is almost independent of $N$.
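The formula can be checked numerically. What follows is a minimal sketch rather than part of the original answer: the names `expected_distinct` and `simulate_distinct` are made up for illustration, and the fixed seed and trial count are arbitrary choices that only serve to make the run repeatable.

```python
import math
import random

def expected_distinct(N, n):
    """Expected number of distinct values when n guesses are drawn
    independently and uniformly from N numbers: N - N*(1 - 1/N)**n."""
    return N - N * (1 - 1 / N) ** n

def simulate_distinct(N, n, trials, seed=0):
    """Monte Carlo estimate of the same quantity (seed is an arbitrary choice)."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # One game: n guesses from 1..N, count how many different values appear
        total += len({rng.randint(1, N) for _ in range(n)})
    return total / trials

exact = expected_distinct(100, 100)        # exact expectation for N = n = 100
approx = 100 - 100 / math.e                # the N - N/e approximation
estimate = simulate_distinct(100, 100, trials=2000)
print(exact, approx, estimate)
```

For $N=n=100$ the exact value is about $63.4$ and the $N-\frac{N}{e}$ approximation is about $63.2$; the simulated average should land close to both.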

André Nicolas

In your 1 to 10 example, it's not true in general that the third guesser has an 80% chance of choosing a new number. That is only the case if the second guesser picked a different number from the first; if they picked the same number, the third guesser's chance is 90%.
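To make this concrete, the unconditional chance for the third guesser can be worked out by splitting on whether the first two guesses coincide. This is an illustrative sketch, not part of the original answer: `prob_kth_guess_is_new` is a hypothetical helper name, and the guesses are assumed to be independent and uniform.

```python
from fractions import Fraction

def prob_kth_guess_is_new(N, k):
    """Probability that the k-th guess differs from all k-1 earlier guesses,
    assuming independent uniform guesses from N numbers."""
    return Fraction(N - 1, N) ** (k - 1)

N = 10
# Case split for the third guesser:
p_first_two_differ = Fraction(N - 1, N)   # 9/10
p_new_if_differ = Fraction(N - 2, N)      # 8/10: two numbers already taken
p_new_if_same = Fraction(N - 1, N)        # 9/10: only one number taken
p_third_new = (p_first_two_differ * p_new_if_differ
               + (1 - p_first_two_differ) * p_new_if_same)

print(p_third_new)                   # 81/100
print(prob_kth_guess_is_new(N, 3))   # 81/100, the same value via (9/10)**2

# Expected number of distinct guesses among N guessers
expected = sum(prob_kth_guess_is_new(N, k) for k in range(1, N + 1))
print(float(expected))
```

Summing these probabilities over all ten guessers gives about 6.51 expected distinct numbers, so the average chance of guessing a new number is roughly 65%, not 55%.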


This is an example of the Birthday Paradox / Birthday Problem.

Birthday problem - Wikipedia, the free encyclopedia

I was just watching a video lecture on this very problem today in my online cryptography class.

Coursera.org: crypto-009

The apparent paradox is that independent random numbers produce more duplicates than intuition suggests.

But the Birthday Paradox is just one example of when our intuitive statistical sense is dead wrong.

  • Here is a cool video from Numberphile about the Birthday problem. http://www.youtube.com/watch?v=a2ey9a70yY0 –  Jan 14 '14 at 12:04
  • The birthday problem is about the probability of having at least one collision. It says very little about the number of different values found in a large number of independent samples (as in "how many different birthdays on average in a random group of $365$ people"). – Marc van Leeuwen Jan 14 '14 at 13:27
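
The two quantities in the last comment can be computed side by side. This is a rough sketch under the usual assumption of 365 equally likely birthdays; the function names are illustrative and do not come from the thread.

```python
def prob_at_least_one_collision(N, n):
    """Birthday problem: probability that n independent uniform samples
    from N values contain at least one repeat."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (N - i) / N
    return 1.0 - p_all_distinct

def expected_distinct(N, n):
    """Expected number of different values among n independent uniform samples."""
    return N * (1 - (1 - 1 / N) ** n)

print(prob_at_least_one_collision(365, 23))   # just over 0.5
print(expected_distinct(365, 365))            # roughly 231
```

The first number answers the birthday problem (a shared birthday is more likely than not with 23 people), while the second answers the question asked here: about 231 different birthdays on average in a group of 365.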