A teenage acquaintance of mine lamented:

Every one of my friends is better friends with somebody else.

Thanks to my knowledge of mathematics I could inform her that she's not alone and $e^{-1}\approx 37\%$ of all people could be expected to be in the same situation, which I'm sure cheered her up immensely.

This number assumes that friendships are distributed randomly, such that each person in a population of $n$ chooses a best friend at random. Then the probability that any given person is not anyone's best friend is $(1-\frac{1}{n-1})^{n-1}$, which tends to $e^{-1}$ for large $n$.

Afterwards I'm not sure this is actually the best way to analyze the claim. Perhaps instead we should imagine assigning a random "friendship strength" to each edge in the complete graph on $n$ vertices, in which case my friend's lament would be "every vertex I'm connected to has an edge with higher weight than my edge to it". This is not the same as "everyone choses a best friend at random", because it guarantees that there's at least one pair of people who're mutually best friends, namely the two ends of the edge with the highest weight.

(Of course, some people are not friends at all; we can handle that by assigning low weights to their mutual edges. As long as everyone has at least one actual friend, this won't change who are whose best friends).

(It doesn't matter which distribution the friendship weights are chosen by, as long as it's continuous -- because all that matters is the relative order between the weights. Equivalently, one may simply choose a random total order on the $n(n-1)/2$ edges in the complete graph).

In this model, what is the probability that a given person is not anyone's best friend?

By linearity of expectations, the probability of being mutually best friends with anyone is $\frac{n-1}{2n-3}\approx\frac 12$ (much better than in the earlier model), but that doesn't take into account the possibility that some poor soul has me as their best friend whereas I myself has other better friends. Linearity of expectation doesn't seem to help here -- it tells me that the expected number of people whose best friend I am is $1$, but not the probability of this number being $0$.

(Edit: Several paragraphs of numerical results now moved to a significantly expanded answer)

hmakholm left over Monica
  • 276,945
  • 22
  • 401
  • 655
  • I wonder what, say, Facebook's data on this would look like, where "best friend" would be defined as "person most interacted with on Facebook". Of course this has nothing to do with your model. – Michael Lugo Mar 14 '14 at 21:43
  • 1
    OT: this reminds me of the Peanuts' strip where everybody came with a bunch of Valentine cards, and eventually the only one who did not receive a card was Charlie Brown :-) – mau Mar 15 '14 at 18:10
  • You don't say what distribution you use for the friendship strengths: do you assign them uniformly and independently from $[0, 1]$? (If so, wouldn't a sharply declining distribution, like exponential or power-law, be more realistic?) – ShreevatsaR Mar 16 '14 at 02:39
  • 3
    @ShreevatsaR: Since only the relative order between the friendship strengths matter, any continuous distribution will do. Or, equivalently, one may pick a total ordering of the $n(n-1)/2$ edges uniformly between all orderings. – hmakholm left over Monica Mar 16 '14 at 02:44
  • Oh I see. That's a useful insight, thanks! – ShreevatsaR Mar 16 '14 at 02:45
  • It reminds me of [Ramsey's theorem](http://en.wikipedia.org/wiki/Ramsey's_theorem). – Astro Nauft Mar 16 '14 at 17:27
  • @HenningMakholm Out of curiosity, would you say that the agreement with $f(n)$ is unusually good, or is it just pointing at an asymptotic expansion of the form $p(n) = e^{-1} ( 1 - 1/4n + O(1/n^2) )$ ? – Erick Wong Mar 16 '14 at 22:34
  • 2
    @ErickWong: My numerical data (to more digits than I've quoted here) look like they agree to $o(n^2)$, so both the $n^{-1}$ and $n^{-2}$ terms of the asymptotic expansion match. It could still be coincidence, of course, but it looks less likely because the polynomial $2n-7/3$ has fewer coefficients than the number of terms in the asymptotic expansion that seem to match. – hmakholm left over Monica Mar 16 '14 at 22:49

3 Answers3


The probability for large $n$ is $e^{-1}$ in the friendship-strength model too. I can't even begin to explain why this is, but I have strong numerical evidence for it. More precisely, if $p(n)$ is the probability that someone in a population of $n$ isn't anyone's best friend, then it looks strongly like

$$ p(n) = \Bigl(1-\frac{1}{2n-7/3}\Bigr)^{2n-7/3} + O(n^{-3})$$ as $n$ tends to infinity.

The factor of $2$ may hint at some asymptotic connection between the two models, but it has to be subtle, because the best-friend relation certainly doesn't look the same in the two models -- as noted in the question, in the friendship-strength model we expect half of all people to be mutually best friends with someone, whereas in the model where everyone chooses a best friend independently, the total expected number of mutual friendships is only $\frac12$.

The offset $7/3$ was found experimentally, but there's good evidence that it is exact. If it means anything, it's a mystery to me what.

How to compute the probability. Consider the complete graph on $n$ vertices, and assign random friendship weights to each edge. Imagine processing the edges in order from the strongest friendship towards the weakest. For each vertex/person, the first time we see an edge ending there will tell that person who their best friend is.

The graphs we build can become very complex, but for the purposes of counting we only need to distinguish three kinds of nodes:

  • W - Waiting people who don't yet know any of their friends. (That is, vertices that are not an endpoint of any edge processed yet).
  • F - Friends, people who are friends with someone, but are not anyone's best friend yet. Perhaps one of the Waiting people will turn out to have them as their best friend.
  • B - Best friends, who know they are someone's best friend.

At each step in the processing of the graph, it can be described as a triple $(w,f,b)$ stating the number of each kind of node. We have $w+f+b=n$, and the starting state is $(n,0,0)$ with everyone still waiting.

  • If we see a WW edge, two waiting people become mutually best friends, and we move to state $(w-2,f,b+2)$. There are $w(w-1)/2$ such edges.
  • If we see a WF edge, the F node is now someone's best friend and becomes a B, and the W node becomes F. The net effect is to move us to state $(w-1,f,b+1)$. There are $wf$ such edges.
  • If we see a WB edge, tne W node becomes F, but the B stays a B -- we don't care how many people's best friends one is, as long there is someone. We move to $(w-1,f+1,b)$, and there are $wb$ edges of this kind.
  • If we see a FF or FB or BB edge, it represents a friendship where both people already have better friends, so the state doesn't change.

Thus, for each state, the next WW or WF or WB edge we see determine which state we move to, and since all edges are equally likely, the probabilities to move to the different successor states are $\frac{w-1}{2n-w-1}$ and$\frac{2f}{2n-w-1}$ and $\frac{2b}{2n-w-1}$, respectively.

Since $w$ decreases at every move between states, we can fill out a table of the probabilities that each state is ever visited simply by considering all possible states in order of decreasing $w$. When all edges have been seen we're in some state $(0,f,n-f)$, and summing over all these we can find the expected $f$ for a random weight assignment.

By linearity of expectation, the probability that any given node is F at the end must then be $\langle f\rangle/n$.

Since there are $O(n^2)$ states with $w+f+b=n$ and a constant amount of work for each state, this algorithm runs in time $O(n^2)$.

Numerical results. Here are exact results for $n$ up to 18:

 n approx                 exact
 1  100%                   1/1
 2  0.00%                  0/1
 3 33.33%                  1/3
 4 33.33%                  1/3
 5 34.29%                 12/35
 6 34.81%                 47/135
 7 35.16%                731/2079
 8 35.40%               1772/5005
 9 35.58%              20609/57915
10 35.72%            1119109/3132675
11 35.83%             511144/1426425
12 35.92%           75988111/211527855
13 36.00%         1478400533/4106936925
14 36.06%        63352450072/175685635125
15 36.11%      5929774129117/16419849744375
16 36.16%     18809879890171/52019187845625
17 36.20%    514568399840884/1421472473796375
18 36.24% 120770557736740451/333297887934886875

After this point, exact rational arithmetic with 64-bit denominators start overflowing. It does look like $p(n)$ tends towards $e^{-1}$. (As an aside, the sequence of numerators and denominators are both unknown to OEIS).

To get further, I switched to native machine floating point (Intel 80-bit) and got the $p(n)$ column in this table:

   n     p(n)    A      B      C      D      E      F      G      H
   10 .3572375 1.97+  4.65-  4.65-  4.65-  2.84+  3.74+  3.64+  4.82-
   20 .3629434 2.31+  5.68-  5.68-  5.68-  3.47+  4.67+  4.28+  5.87-
   40 .3654985 2.62+  6.65-  6.64-  6.64-  4.09+  5.59+  4.90+  6.84-
   80 .3667097 2.93+  7.59-  7.57-  7.57-  4.70+  6.49+  5.51+  7.77-
  100 .3669469 3.03+  7.89-  7.87-  7.86-  4.89+  6.79+  5.71+  8.07-
  200 .3674164 3.33+  8.83-  8.79-  8.77-  5.50+  7.69+  6.32+  8.99-
  400 .3676487 3.64+  9.79-  9.69-  9.65-  6.10+  8.60+  6.92+  9.90-
  800 .3677642 3.94+ 10.81- 10.60- 10.52-  6.70+  9.50+  7.52+ 10.80-
 1000 .3677873 4.04+ 11.17- 10.89- 10.80-  6.90+  9.79+  7.72+ 11.10-
 2000 .3678334 4.34+ 13.18- 11.80- 11.63-  7.50+ 10.69+  8.32+ 12.00-
 4000 .3678564 4.64+ 12.74+ 12.70- 12.41-  8.10+ 11.60+  8.92+ 12.90-
 8000 .3678679 4.94+ 13.15+ 13.60- 13.14-  8.70+ 12.50+  9.52+ 13.81-
10000 .3678702 5.04+ 13.31+ 13.89- 13.36-  8.90+ 12.79+  9.72+ 14.10-
20000 .3678748 5.34+ 13.86+ 14.80- 14.03-  9.50+ 13.69+ 10.32+ 15.00-
40000 .3678771 5.64+ 14.44+ 15.70- 14.67- 10.10+ 14.60+ 10.92+ 15.91-

The 8 other columns show how well $p(n)$ matches various attempts to model it. In each column I show $-\log_{10}|p(n)-f_i(n)|$ for some test function $f_i$ (that is, how many digits of agreement there are between $p(n)$ and $f_i(n)$), and the sign of the difference between $p$ and $f_i$.

  • $f_{\tt A}(n)=e^{-1}$

In the first column we compare to the constant $e^{-1}$. It is mainly there as evidence that $p(n)\to e^{-1}$. More precisely it looks like $p(n) = e^{-1} + O(n^{-1})$ -- whenever $n$ gets 10 times larger, another digit of $e^{-1}$ is produced.

  • $f_{\tt C}(n)=\Bigl(1-\frac{1}{2n-7/3}\Bigr)^{2n-7.3}$

I came across this function by comparing $p(n)$ to $(1-\frac{1}{n-1})^{n-1}$ (the probability in the chose-best-friends-independently model) and noticing that they almost matched beteen $n$ and $2n$. The offset $7/3$ was found by trial and error. With this value it looks like $f_{\tt C}(n)$ approximates $p(n)$ to about $O(n^{-3})$, since making $n$ 10 times larger gives three additional digits of agreement.

  • $ f_{\tt B}=\Bigl(1-\frac{1}{2n-2.332}\Bigr)^{2n-2.332}$ and $f_{\tt D}=\Bigl(1-\frac{1}{2n-2.334}\Bigr)^{2n-2.334} $

These columns provide evidence that $7/3$ in $f_{\tt C}$ is likely to be exact, since varying it just slightly in each direction gives clearly worse approximation to $p(n)$. These columns don't quite achieve three more digits of precision for each decade of $n$.

  • $ f_{\tt E}(n)=e^{-1}\bigl(1-\frac14 n^{-1}\bigr)$ and $f_{\tt F}(n)=e^{-1}\bigl(1-\frac14 n^{-1} - \frac{11}{32} n^{-2}\bigr)$

Two and three terms of the asymptotic expansion of $f_{\tt C}$ in $n$. $f_{\tt F}$ also improves cubically, but with a much larger error than $f_{\tt C}$. This seems to indicate that the specific structure of $f_{\tt C}$ is important for the fit, rather than just the first terms in its expansion.

  • $ f_{\tt G}(n)=e^{-1}\bigl(1-\frac12 (2n-7/3)^{-1}\bigr)$ and $f_{\tt H}(n)=e^{-1}\bigl(1-\frac12 (2n-7/3)^{-1}-\frac{5}{24} (2n-7/3)^{-2}\bigr) $

Here's a surprise! Expanding $f_{\tt C}$ in powers of $2n-7/3$ instead of powers of $n$ not only gives better approximations than $f_{\tt E}$ and $f_{\tt F}$, but also approximates $p(n)$ better than $f_{\tt C}$ itself does, by a factor of about $10^{0.2}\approx 1.6$. This seems to be even more mysterious than the fact that $f_{\tt C}$ matches.

At $n=40000$ the computation of $p(n)$ takes about a minute, and the comparisons begin to push the limit of computer floating point. The 15-16 digits of precision in some of the columns are barely even representable in double precision. Funnily enough, the calculation of $p(n)$ itself seems to be fairly robust compared to the approximations.

hmakholm left over Monica
  • 276,945
  • 22
  • 401
  • 655

Here's another take at this interesting problem. Consider a group of $n+1$ persons $x_0,\dots,x_n$ with $x_0$ being myself. Define the probability $$ P_{n+1}(i)=P(\textrm{each of $x_1,\dots,x_i$ has me as his best friend}). $$ Then we can compute the wanted probability using the inclusion-exclusion formula as follows: \begin{eqnarray} P_{n+1} &=& P(\textrm{I am nobody's best friend}) \\ &=& 1-P(\textrm{I am somebody's best friend}) \\ &=& \sum_{i=0}^n (-1)^i\binom{n}{i}P_{n+1}(i). \tag{$*$} \end{eqnarray}

To compute $P_{n+1}(i)$, note that for the condition to be true, it is necessary that of all friendships between one of $x_1,\dots,x_i$ and anybody, the one with the highest weight is a friendship with me. The probability of that being the case is $$ \frac{i}{in-i(i-1)/2}=\frac{2}{2n-i+1}. $$ Suppose, then, that that is the case and let this friendship be $(x_0, x_i)$. Then I am certainly the best friend of $x_i$. The probability that I am also the best friend of each of $x_1,\dots,x_{i-1}$ is unchanged. So we can repeat the argument and get $$ P_{n+1}(i) =\frac{2}{2n}\cdot\frac{2}{2n-1}\cdots\frac{2}{2n-i+1} =\frac{2^i(2n-i)!}{(2n)!}. $$ Plugging this into $(*)$ gives a formula for $P_{n+1}$ that agrees with Henning's results.

To prove $P_{n+1}\to e^{-1}$ for $n\to\infty$, use that the $i$-th term of $(*)$ converges to, but is numerically smaller than, the $i$-th term of $$ e^{-1}=\sum_{i=0}^\infty\frac{(-1)^i}{i!}. $$

By the way, in the alternative model where each person chooses a best friend at random, we would instead have $P_{n+1}(i)=1/n^i$ and $P_{n+1}=(1-1/n)^n$.

  • 417
  • 4
  • 8

This answer is quite similar to Lancel's answer, but I think it deals successfully with the subtle issue of dependence between friendships (if I am the best friend of $x_1$, then I am slightly more likely to also be the best friend of $x_2$, because my being the best friend of $x_1$ suggests that the friendship strength between $x_1$ and $x_2$ is "low", making it more likely that my strength to $x_2$ beats all of the other friendship strengths for $x_2$). This answer also avoids needing to worry about convergence of the power series.

We use Brun's Sieve, which is essentially a repackaging of the inclusion-exclusion method. Suppose that there are $n$ people aside from myself in the graph, and let $x_1, \ldots, x_r$ be any set of $r$ people. Let $B_i$ be the event that I am the best friend of person $i$. We need to show that, for each fixed $r$, as $n \to \infty$ we have ${n \choose r}P(B_1 \wedge \cdots \wedge B_r) \to \frac{1}{r!}$. To do this, we will show that $P(B_1 \wedge \cdots \wedge B_r) \to \frac{1}{n^r}$.

Now $P(B_1) = 1/n$: of the $n$ people that $x_1$ sees, I must have the highest strength. For higher $i$ we need to obtain conditional probabilities, and we only obtain upper and lower estimates, but these estimates will asymptotically approach each other as $n \to \infty$. For $P(B_2 | B_1)$, we have the lower bound $P(B_2 | B_1) \geq P(B_2) \geq 1/n$, since conditioning on $B_1$ only "pushes down" the value of the edge $x_1x_2$ (it is somewhat easier to see this by flipping the situation around: if we consider controlling the value of $x_1x_2$ and picking all other edge weights randomly, the probability of $B_1$ and the probability of $B_2$ each go down as the weight of $x_1x_2$ goes up). On the other hand, $P(B_2 | B_1) \leq 1/(n-1)$, since my weight to $x_2$ needs to beat each of $x_2$'s weights to vertices other than $x_1$, and conditioning on $B_1$ gives no information about those weights.

Continuing in this manner, we see that $1/n \leq P(B_i | B_1, \ldots, B_{i-1}) \leq 1/(n-i+1)$ for each $i$. When $r$ is fixed and $n \to \infty$, this gives $P(B_i | B_1, \ldots, b_{i-1}) \to 1/n$, so that $P(B_1 \wedge \cdots \wedge B_r) \to 1/n^r$.

Thus, by Brun's Sieve, we have $P\left(\bigwedge\overline{B_i}\right) \to e^{-1}$, and, what's more, the number of people whose best friend you are is (asymptotically) Poisson-distributed with mean $1$.

Gregory J. Puleo
  • 6,275
  • 4
  • 21
  • 28