The $F_1$ score is the harmonic mean of precision (Pr) and recall (Re). Given a confusion (or contingency) matrix with counts of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN), where $TP+TN+FP+FN = N$, we have:

$$Pr = \frac{TP}{TP+FP}$$

$$Re = \frac{TP}{TP+FN}$$

$$F_1=\frac{2(Pr\times Re)}{Pr+Re}=\frac{2TP}{2TP+FP+FN}=\frac{2TP}{N+TP-TN}$$

Let's assume that we guess each binary label at random, independently and with probability 0.5. This gives $TP=X\sim Bin(n_1,0.5)$, where $n_1$ is the number of 1s in the truth vector (equivalent to $TP+FN$).

Similarly, let's assume $TN=Y\sim Bin(n_0,0.5)$, where $n_0$ is the number of 0s in the truth vector (equivalent to $TN+FP$).
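As a sanity check on the setup, here is a minimal Python sketch that simulates one round of random guessing and confirms that the precision/recall form of $F_1$ agrees with $2TP/(N+TP-TN)$. The function names, the seed, and the 30/70 class split are illustrative assumptions, and returning 0 when $TP=0$ is a convention I chose to avoid a 0/0 in the precision/recall form.

```python
import random


def confusion_counts(truth, guess):
    """Count TP, TN, FP, FN for two equal-length 0/1 sequences."""
    tp = sum(1 for t, g in zip(truth, guess) if t == 1 and g == 1)
    tn = sum(1 for t, g in zip(truth, guess) if t == 0 and g == 0)
    fp = sum(1 for t, g in zip(truth, guess) if t == 0 and g == 1)
    fn = sum(1 for t, g in zip(truth, guess) if t == 1 and g == 0)
    return tp, tn, fp, fn


def f1_from_pr_re(tp, tn, fp, fn):
    """F1 as the harmonic mean of precision and recall."""
    if tp == 0:
        return 0.0  # assumed convention: avoids 0/0 when there are no true positives
    pr = tp / (tp + fp)
    re = tp / (tp + fn)
    return 2 * pr * re / (pr + re)


def f1_from_counts(tp, tn, fp, fn):
    """F1 written as 2TP / (N + TP - TN)."""
    n = tp + tn + fp + fn
    return 2 * tp / (n + tp - tn)


# Illustrative truth vector: n_1 = 30 ones and n_0 = 70 zeros (assumed values).
rng = random.Random(0)
truth = [1] * 30 + [0] * 70
guess = [rng.randint(0, 1) for _ in truth]  # fair coin flip per label

tp, tn, fp, fn = confusion_counts(truth, guess)
print(f1_from_pr_re(tp, tn, fp, fn), f1_from_counts(tp, tn, fp, fn))
```

The two printed values should agree up to floating-point rounding, and `tp + fn` and `tn + fp` recover $n_1$ and $n_0$.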

I am trying to find an expression for the distribution $p(Z)$ of $$Z=\frac{2X}{N+X-Y}$$ My approach is to set $A = 2X$ and $B = N + X - Y$, find their probability mass functions (both are discrete), and then determine the distribution of the ratio $A/B$.
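For small $n_1$ and $n_0$, the distribution of $Z$ can be tabulated by brute force, which gives a reference to check any closed form against. The sketch below assumes $X$ and $Y$ are independent (which holds if each label is guessed independently) and that $n_1 \ge 1$, so the denominator is never zero; `f1_pmf` is a name I made up for illustration.

```python
from collections import defaultdict
from fractions import Fraction
from math import comb


def f1_pmf(n1, n0):
    """Exact PMF of Z = 2X / (N + X - Y) by enumerating all (x, y).

    Assumes X ~ Bin(n1, 0.5) and Y ~ Bin(n0, 0.5) are independent.
    """
    if n1 < 1:
        raise ValueError("n1 must be at least 1 so that N + X - Y > 0")
    n = n1 + n0
    total = 2 ** n  # each (x, y) has probability C(n1, x) * C(n0, y) / 2^N
    pmf = defaultdict(Fraction)
    for x in range(n1 + 1):
        for y in range(n0 + 1):
            z = Fraction(2 * x, n + x - y)
            pmf[z] += Fraction(comb(n1, x) * comb(n0, y), total)
    return dict(sorted(pmf.items()))


# Tiny example: one positive and one negative in the truth vector.
for z, p in f1_pmf(1, 1).items():
    print(z, p)
```

Using `Fraction` keeps both the support points and the probabilities exact, so distinct $(x, y)$ pairs that give the same value of $Z$ are merged correctly.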

**In the numerator:** $A = 2X$ is a scaled Binomial with $P(A=a)= {{n_1}\choose{a/2}}0.5^{a/2}(1-0.5)^{n_1-a/2} = {{n_1}\choose{a/2}}0.5^{n_1}$, where $a=0, 2, 4, \ldots , 2n_1$ (jbowman).

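**In the denominator:** Because $p=0.5$, the symmetry of the Binomial gives $(n_0-Y)\sim Bin(n_0, 0.5)$. If $X$ and $Y$ are independent, the sum of the two Binomials with the same $p$ is again Binomial, so $(X-Y+n_0)\sim Bin(n_1+n_0, 0.5)$. Since $N=n_1+n_0$, we can write $B = n_1 + W$ with $W\sim Bin(N,0.5)$, from which we can derive $p(B)$ (martini) and (Robert Israel).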
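The claim about the denominator can be checked exactly for small cases. This is a minimal sketch, assuming $X$ and $Y$ are independent, that convolves the two Binomials and compares the result with $Bin(N, 0.5)$; the function names are illustrative.

```python
from fractions import Fraction
from math import comb


def pmf_b_minus_n1(n1, n0):
    """Exact PMF of B - n1 = X + (n0 - Y), for independent X and Y."""
    n = n1 + n0
    pmf = [Fraction(0)] * (n + 1)
    for x in range(n1 + 1):
        for y in range(n0 + 1):
            # P(X = x, Y = y) = C(n1, x) * C(n0, y) / 2^N under independence
            pmf[x + n0 - y] += Fraction(comb(n1, x) * comb(n0, y), 2 ** n)
    return pmf


def binomial_half_pmf(n):
    """PMF of Bin(n, 0.5) as exact fractions."""
    return [Fraction(comb(n, k), 2 ** n) for k in range(n + 1)]


# Compare the two PMFs for an illustrative case n_1 = 3, n_0 = 4.
print(pmf_b_minus_n1(3, 4) == binomial_half_pmf(7))
```

The equality is an instance of Vandermonde's identity, so it should hold for any non-negative $n_1$ and $n_0$, not just the values tried here.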

Now, to find the ratio distribution, I need the joint distribution of $A$ and $B$. They are dependent, since both are functions of $X$, so the joint does not factor into the marginals and I would need $p(A, B)=p(A|B)\,p(B)$, which is where I am stuck. Is there enough information to get the conditional distribution? Or is there another way I should approach the problem? Thanks!
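One possible route, sketched under the assumption that $X$ and $Y$ are independent: since $A=2X$ and $B=N+X-Y$, the pair $(A,B)$ is a one-to-one function of $(X,Y)$, with $X=A/2$ and $Y=N+A/2-B$. The joint PMF would then follow from the product of the two Binomial PMFs, without going through the conditional:

$$P(A=a, B=b) = P\left(X=\tfrac{a}{2}\right)P\left(Y=N+\tfrac{a}{2}-b\right) = {{n_1}\choose{a/2}}{{n_0}\choose{N+a/2-b}}0.5^{N}$$

for even $a \in \{0, 2, \ldots, 2n_1\}$ and $0 \le N + a/2 - b \le n_0$, and zero otherwise. If this is right, $P(Z=z)$ is the sum of these terms over all pairs $(a,b)$ with $a/b=z$.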