9

For two Gaussian-distributed variables, $ Pr(X=x) = \frac{1}{\sqrt{2\pi}\sigma_0}e^{-\frac{(x-x_0)^2}{2\sigma_0^2}}$ and $ Pr(Y=y) = \frac{1}{\sqrt{2\pi}\sigma_1}e^{-\frac{(y-x_1)^2}{2\sigma_1^2}}$. What is the probability that $X > Y$?

Strin
  • 1,469
  • 1
  • 19
  • 29
  • 1
    What do you know about $X-Y$ ? – Raskolnikov Aug 03 '12 at 09:55
  • If $X > Y$ what can you say about $X-Y$? – Ilya Aug 03 '12 at 09:58
  • 3
    Are $X$ and $Y$ independent? And I don't agree when you write $Pr(X=x)=\dots$: the probability that a Gaussian random variable takes a particular value is $0$ (but we can write $P(X\in A)=\int_A$ of the function you wrote). – Davide Giraudo Aug 03 '12 at 10:05
  • Yes, do we have a name for this? – Strin Aug 03 '12 at 10:50
  • A name for *what*? – Chris Eagle Aug 03 '12 at 11:54
  • @Strin: [Probability density function](http://en.wikipedia.org/wiki/Probability_density_function)? – Nate Eldredge Aug 03 '12 at 16:07
  • @DavideGiraudo: You are quite right, but this is a common abuse of notation. – Nate Eldredge Aug 03 '12 at 16:07
    @Nate: A common abuse of notation, to write $P(X=x)=f(x)$ to mean that $f$ is the density of $P_X$ with respect to Lebesgue measure? If you wish to indicate that we should not pay attention, I disagree. But this is a too common **error**, yes. – Did Aug 03 '12 at 16:32
  • @did: Many otherwise reputable textbooks use this notation deliberately. I personally don't like it either since it is, as you say, also a common error. But I just wanted to point out that someone who writes $P(X=x)$ for a density is not *necessarily* confused. – Nate Eldredge Aug 03 '12 at 16:40
  • @Nate: *Many otherwise reputable textbooks use this notation deliberately*... OK. (But not in the part of the world where I live.) Any examples? – Did Aug 03 '12 at 16:45

2 Answers

7

Suppose $X$ and $Y$ are jointly normal; independence is not needed. Define $Z = X - Y$. It is well known that $Z$ is then Gaussian, and thus determined by its mean $\mu$ and its variance $\sigma^2$: $$ \mu = \mathbb{E}(Z) = \mathbb{E}(X) - \mathbb{E}(Y) = \mu_1 - \mu_2 $$ $$ \sigma^2 = \mathbb{Var}(Z) = \mathbb{Var}(X) + \mathbb{Var}(Y) - 2 \mathbb{Cov}(X,Y) = \sigma_1^2 + \sigma_2^2 - 2 \rho \sigma_1 \sigma_2 $$ where $\rho$ is the correlation coefficient of $X$ and $Y$. Now: $$ \mathbb{P}(X>Y) = \mathbb{P}(Z>0) = 1- \Phi\left(-\frac{\mu}{ \sigma}\right) = \Phi\left(\frac{\mu}{ \sigma}\right) = \frac{1}{2} \operatorname{erfc}\left(-\frac{\mu}{\sqrt{2}\sigma}\right) $$
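Not part of the original answer: the closed form above is easy to sanity-check numerically. The sketch below (standard-library Python only; the function name and the parameter values are mine, chosen arbitrarily) compares $\tfrac{1}{2}\operatorname{erfc}\left(-\mu/(\sqrt{2}\sigma)\right)$ against a Monte Carlo estimate for correlated jointly normal $X$ and $Y$:

```python
import math
import random

def p_x_gt_y(mu1, mu2, s1, s2, rho):
    """Closed form from the answer: P(X > Y) = erfc(-mu / (sqrt(2) sigma)) / 2."""
    mu = mu1 - mu2
    sigma = math.sqrt(s1 ** 2 + s2 ** 2 - 2 * rho * s1 * s2)
    return 0.5 * math.erfc(-mu / (math.sqrt(2) * sigma))

# Monte Carlo check with correlated normals (illustrative parameters).
random.seed(0)
mu1, mu2, s1, s2, rho = 1.0, 0.5, 2.0, 1.5, 0.3
n = 200_000
hits = 0
for _ in range(n):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    x = mu1 + s1 * z1                                         # X ~ N(mu1, s1^2)
    y = mu2 + s2 * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)  # corr(X, Y) = rho
    if x > y:
        hits += 1

print(p_x_gt_y(mu1, mu2, s1, s2, rho))  # closed form
print(hits / n)                         # empirical frequency, close to the above
```

With $\rho = 0$ this reduces to the independent case treated in the other answer.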

Sasha
  • 68,169
  • 6
  • 133
  • 210
2

I assume that $X$ and $Y$ are independent. Let $Z=X-Y$ (writing $y_0$ for the mean of $Y$, i.e. $x_1$ in the question); then $Z\sim\mathcal{N}(x_0-y_0,\sigma_0^2+\sigma_1^2)$. Accordingly

$$P(Z>0)=\int_0^\infty\frac{1}{\sqrt{2\pi(\sigma_0^2+\sigma_1^2)}}\exp\left(\frac{-(z-x_0+y_0)^2}{2(\sigma_0^2+\sigma_1^2)}\right)\mathrm{d}z$$

If we use the complementary error function $$\operatorname{erfc}(x)=\frac{2}{\sqrt\pi}\int_x^\infty e^{-t^2}\,dt$$ with the substitution $t=\frac{z-x_0+y_0}{\sqrt{2(\sigma_0^2+\sigma_1^2)}}$, so that $\sqrt{2(\sigma_0^2+\sigma_1^2)}\,dt=dz$, then $$P(Z>0)=\frac{2}{2\sqrt{\pi}\sqrt{2(\sigma_0^2+\sigma_1^2)}}\int_{t=\frac{y_0-x_0}{\sqrt{2(\sigma_0^2+\sigma_1^2)}}}^\infty e^{-t^2}\sqrt{2(\sigma_0^2+\sigma_1^2)}\,dt$$ and finally $$P(Z>0)=\frac{1}{2}\operatorname{erfc}\left(\frac{y_0-x_0}{\sqrt{2(\sigma_0^2+\sigma_1^2)}}\right)$$
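As a check not in the original answer, the final erfc expression can be compared against a direct numerical evaluation of the integral for $P(Z>0)$ before the substitution (standard-library Python; the function names and the sample parameters are mine):

```python
import math

def erfc_formula(x0, y0, s0, s1):
    """Final formula of the answer (independent X and Y)."""
    return 0.5 * math.erfc((y0 - x0) / math.sqrt(2 * (s0 ** 2 + s1 ** 2)))

def numeric_integral(x0, y0, s0, s1, steps=100_000):
    """Trapezoidal evaluation of P(Z > 0) = integral of the N(x0 - y0, s0^2 + s1^2)
    density over (0, infinity), truncated ~10 standard deviations past the mean."""
    var = s0 ** 2 + s1 ** 2
    mean = x0 - y0
    upper = mean + 10 * math.sqrt(var)
    h = upper / steps
    f = lambda z: math.exp(-(z - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)
    total = 0.5 * (f(0.0) + f(upper))
    for i in range(1, steps):
        total += f(i * h)
    return total * h

# Sanity check raised in the comments: equal means must give exactly 1/2.
print(erfc_formula(3.0, 3.0, 1.0, 2.0))      # 0.5
print(erfc_formula(1.0, 0.5, 1.0, 2.0))      # closed form
print(numeric_integral(1.0, 0.5, 1.0, 2.0))  # agrees to many decimals
```

The truncation at ten standard deviations loses a Gaussian tail that is negligible at this precision.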

Seyhmus Güngören
  • 7,666
  • 3
  • 24
  • 43
  • 3
    Which has the odd feature of not being $1/2$ when $x_0=y_0$. I suggest to review this answer, especially the change of variable. – Did Aug 03 '12 at 10:54
  • Ok I found the mistake. Updating. – Seyhmus Güngören Aug 03 '12 at 11:30
    Well, there still seems to be something wrong. The odd feature noted by @did still persists: the probability is not $1/2$ when $x_0 = y_0$ and, worse yet, the right side is negative when $x_0 > y_0$ and so definitely cannot be a probability. Did you mean to write erfc instead of erf in your final answer? – Dilip Sarwate Aug 03 '12 at 15:46
    Yes, it has to be erfc, of course; it is clear from the text. The proof is correct. – Seyhmus Güngören Aug 03 '12 at 16:26