
Let $Z \in \mathbb{R}^2$ be a Gaussian vector with mean $M$ and independent, unit-variance components, i.e., $Z \sim N(M, I_2)$; write $P_{Z\mid M}$ for its conditional distribution.

Let $g: \mathbb{R}^2 \to \mathbb{R}$ and consider the following equation: $$ E[g(Z)\mid M=\mu]=0 \quad \forall \mu \in C, $$ where $C=\{\mu: \frac{\mu_1^2}{r_1^2}+\frac{\mu_2^2}{r_2^2}=1 \}$ for some given $r_1,r_2 > 0$. That is, $C$ is an axis-aligned ellipse centered at the origin.

It is not difficult to verify (see this question, and also Edit 3 below) that a solution to this equation is given by $$ g(x)= \frac{x_1^2}{r_1^2}+\frac{x_2^2}{r_2^2}- c, $$ where $c=\frac{1}{r_1^2}+\frac{1}{r_2^2}+1$. In fact, $g_a(x)= a\, g(x)$ is a solution for any $a \in \mathbb{R}$.

Question: Is $g$ a unique solution up to a multiplicative constant?

Edit: If we need an assumption on the class of allowed functions $g$, let us assume that each $g$ is bounded by a quadratic (i.e., for every $g$ there exist $a$ and $b$ such that $g(x) \le a \|x \|^2 +b$).

Edit 2: If you want to avoid expectation notation, everything can alternatively be written as $$ \iint g(z)\frac{1}{2 \pi} e^{-\frac{\|z-m\|^2}{2}} \, {\rm d} z=0, \quad m\in C. $$ From here, one can see that this question is about a convolution.

Edit 3: To see that $g(x)$ is a solution, we use that the second moment of a unit-variance Gaussian is given by $E[Z_i^2\mid M_i=\mu_i]=1+\mu_i^2$, which leads to \begin{align} E\left[\frac{Z_1^2}{r_1^2}+\frac{Z_2^2}{r_2^2}- c\,\Big|\, M=\mu \right]&= \frac{E[Z_1^2\mid M_1=\mu_1]}{r_1^2}+\frac{ E[Z_2^2\mid M_2=\mu_2]}{r_2^2}-c\\[6pt] &=\frac{1+\mu_1^2}{r_1^2}+\frac{ 1+\mu_2^2}{r_2^2}-c\\[6pt] &=\frac{1}{r_1^2}+\frac 1 {r_2^2}+1-c=0, \end{align} where in the last step we used that $\mu$ is on the ellipse.
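For readers who want to check this numerically, here is a minimal Monte Carlo sanity check (not part of the argument; the radii, the test points on the ellipse, and the sample size are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(0)
r1, r2 = 2.0, 3.0                      # arbitrary ellipse parameters
c = 1 / r1**2 + 1 / r2**2 + 1

def g(z):
    # g(z) = z1^2/r1^2 + z2^2/r2^2 - c
    return z[..., 0]**2 / r1**2 + z[..., 1]**2 / r2**2 - c

N = 2_000_000                          # Monte Carlo sample size
for theta in np.linspace(0.0, 2 * np.pi, 6, endpoint=False):
    mu = np.array([r1 * np.cos(theta), r2 * np.sin(theta)])  # mu on the ellipse
    Z = mu + rng.standard_normal((N, 2))                     # Z ~ N(mu, I_2)
    print(f"theta = {theta:.2f}:  E[g(Z)] ≈ {g(Z).mean():+.4f}")  # ≈ 0
```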

Edit 4: The comment below shows that the solution is not unique when the ellipse is a circle. However, I would still like to know the answer for a general ellipse.

Michael Hardy
Boby
  • What is $M$? A constant? The notation $P_{Z|M}$ appears in one place; do we need it? What is it, if we need it?! – dan_fulea Jun 06 '21 at 15:26
  • @dan_fulea I used this for compactness as I didn't want to write an integral. The pdf of $P_{Z|M}$ is given by $f_{Z|M}(z|m)=\frac{1}{2 \pi} e^{-\frac{\|z-m\|^2}{2}}$. So everything can alternatively be written as $\iint g(z) \frac{1}{2 \pi} e^{-\frac{\|z-m\|^2}{2}} {\rm d} z=0, \, m\in C$. – Boby Jun 06 '21 at 16:03
  • As it stands, the question is irritating for me. I do not want to go into the links to get details. We never see $P$ in the question again, so why do we need it? Please mention that $M$ is a constant, if it is so; this lets me read past the first line. The index $Z|M$ is irritating in the context; I vote to delete this detail... Of course I know the Gaussian story; in the above comment there is but a special variance... Please make all these details clear in the question. – dan_fulea Jun 06 '21 at 16:08
  • @dan_fulea Thank you for your suggestions. I made two edits to the question, but I decided to leave most of it as is, since I think this notation is rather standard. – Boby Jun 06 '21 at 16:22
  • I think by "the solution to this equation", you mean "a solution to this equation" – Paresseux Nguyen Jun 07 '21 at 21:14
  • @ParesseuxNguyen Yes, I will correct this. – Boby Jun 07 '21 at 21:46
  • (+1) Nice question. It's a pity that there is no answer for the nontrivial ellipse case. – Paresseux Nguyen Jun 13 '21 at 19:52
  • @ParesseuxNguyen Thanks. I will put a new bounty on it. I really want to know what the answer is. – Boby Jun 13 '21 at 20:13
  • @Boby Let me set the bounty for you. However, I'm afraid there will still be no solution. – Paresseux Nguyen Jun 13 '21 at 20:15
  • @ParesseuxNguyen Sure, go ahead. Thanks. But I think we have to wait another 18 hours. – Boby Jun 13 '21 at 20:34
  • @ParesseuxNguyen Ok. It looks like we can post it. – Boby Jun 14 '21 at 18:53
  • @Boby Sure ;). Let's see what will happen – Paresseux Nguyen Jun 14 '21 at 19:03
  • Hi, I edited my answer to address in one place all concerns and my findings, thanks for the problem! – Luca Ghidelli Jun 18 '21 at 18:39
  • @LucaGhidelli Thank you! – Boby Jun 19 '21 at 01:19
  • You say "Where $P_{Z\,\mid\,M}$ is$\,\ldots$", but then you never use that notation in the rest of what you write. Why introduce a notation that you never use? – Michael Hardy Jun 19 '21 at 18:29
  • You wrote "That is, $C$ is an ellipse." But the foregoing expression says more than that $C$ is an ellipse; it says that $C$ is an ellipse whose major and minor axes are parallel to the coordinate axes and whose center is the origin. – Michael Hardy Jun 19 '21 at 18:33
  • You seem to assume that "Gaussian" means "Gaussian with variance $1.$" That is not standard: For every $\mu\in\mathbb R$ and every $\sigma^2>0,$ there is a Gaussian distribution with expected value $\mu$ and variance $\sigma^2.$ If $X\sim\operatorname N(\mu,\sigma^2)$ then $\operatorname E(X^2) = \mu^2 + \sigma^2,$ so this is not necessarily $\mu^2+1. \qquad$ – Michael Hardy Jun 19 '21 at 18:37

2 Answers


That was a very natural question to me, and I really enjoyed the journey of exploring it! Let me give three different answers, which for me correspond to three different levels of understanding of the problem.

(Major edit on June 18th, 2021)

Three approaches

The three answers are based respectively on

  1. functions with circular symmetry: this approach gives solutions for the case of a circle;
  2. rescaling quasi-symmetry of gaussians: this approach produces new solutions from known ones, and it works for general ellipses;
  3. normal convolutional inverse: this approach, which I prefer to carry out using Hermite polynomials, in principle works for very general subsets, not only for ellipses.

Here I provide, briefly, the three solutions. More details on each of them are postponed to the end of this post.


Solution 1: The case of a circle and circular symmetry

Here I discuss the case of a circle ($r_1=r_2=R$).

Proposition 1. Let $g_1,g_2$ be two distinct functions on $\mathbb R^2$ with circular symmetry. Let $W \sim N((R,0), I_2)$, and let $E_i$ be the expectation of $g_i(W)$. Then $g(x) = E_1 g_2(x) - E_2 g_1(x)$ is a solution.

Here is the idea: if the ellipse is a circle, then, since the standard Gaussian distribution is rotationally invariant, the problem has rotational symmetry. It is therefore natural to search for solutions that are rotationally symmetric themselves.

Remark 1: Proposition 1 already shows, at least in the case of a circle, that the conditions of the problem (namely, vanishing expectation as the mean varies along the ellipse) give a system of equations that is not restrictive enough to cut out a one-dimensional set of solutions.
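A quick Monte Carlo illustration of Proposition 1 (a sketch; the radius $R$ and the two circularly symmetric functions $g_1, g_2$ below are arbitrary choices, and $E_1, E_2$ are themselves estimated from samples):

```python
import numpy as np

rng = np.random.default_rng(1)
R, N = 2.0, 2_000_000

# two distinct circularly symmetric functions (arbitrary choices)
g1 = lambda x: np.exp(-np.sum(x**2, axis=-1))
g2 = lambda x: np.sum(x**2, axis=-1)

W = np.array([R, 0.0]) + rng.standard_normal((N, 2))  # W ~ N((R,0), I_2)
E1, E2 = g1(W).mean(), g2(W).mean()                   # estimated constants

g = lambda x: E1 * g2(x) - E2 * g1(x)                 # Proposition 1's combination

for theta in [0.5, 1.7, 3.0]:                         # other points on the circle
    mu = R * np.array([np.cos(theta), np.sin(theta)])
    Z = mu + rng.standard_normal((N, 2))
    print(f"theta = {theta}:  E[g(Z)] ≈ {g(Z).mean():+.4f}")  # ≈ 0
```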


Solution 2: A family of solutions for the general case of an ellipse

Now let $r_1,r_2>0$ be arbitrary, and consider the ellipse with equation $\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2} =1$ (note that, compared with the question, $r_i$ here plays the role of $r_i^2$).

We have already seen (in the question itself) that the function $$g_r(x)= \left(\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2}\right)- \left(\frac 1 {r_1} + \frac 1 {r_2} +1\right)$$ is a solution to the problem. We now construct a one-parameter family of solutions.

Proposition 2: For every $t>-1/2$, the function $$ g_{r,t}(x) = \left ((1+2t)^2 \left(\frac {x_1^2}{r_1} + \frac {x_2^2}{r_2}\right)- (1+2t)\left(\frac {1}{r_1} + \frac {1}{r_2}\right) -1\right) e^{-t|x|^2 }$$ is a solution to the problem.

Here is the idea: Gaussians behave well under multiplication with other Gaussians. Moreover, if a function can be written as $g(x)=h(x)e^{-t|x|^2}$ for some reasonably growing factor $h(x)$, then it automatically behaves mildly at infinity, as required at some point in the question. It therefore makes sense to look for solutions of this form. In this particular case, it turns out that the factor $h(x)$ can be derived from a solution of the problem associated with a smaller ellipse.

Remark 2: With Proposition 2, we see that there are infinitely many solutions bounded by a quadratic. In fact, there are many solutions that tend to zero at infinity: those corresponding to parameters $t>0$.
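Proposition 2 can likewise be checked by simulation (a sketch; $r_1, r_2, t$ and the test points are arbitrary, and recall that in this section the ellipse is $x_1^2/r_1 + x_2^2/r_2 = 1$, with semi-axes $\sqrt{r_1}, \sqrt{r_2}$):

```python
import numpy as np

rng = np.random.default_rng(2)
r1, r2, t = 2.0, 5.0, 0.7              # arbitrary; any t > -1/2 works

def g_rt(x):
    f = x[..., 0]**2 / r1 + x[..., 1]**2 / r2
    d = (1 + 2 * t) * (1 / r1 + 1 / r2) + 1
    return ((1 + 2 * t)**2 * f - d) * np.exp(-t * np.sum(x**2, axis=-1))

N = 4_000_000
for theta in np.linspace(0.0, np.pi, 4):
    m = np.array([np.sqrt(r1) * np.cos(theta), np.sqrt(r2) * np.sin(theta)])
    Z = m + rng.standard_normal((N, 2))                # Z ~ N(m, I_2), m on ellipse
    print(f"theta = {theta:.2f}:  E[g_rt(Z)] ≈ {g_rt(Z).mean():+.5f}")  # ≈ 0
```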


Solution 3: Inverse of the Gaussian convolution

Given an integrable function $h:\mathbb R^2\to\mathbb{C}$ we define, for every $m\in\mathbb R^2$, the shifted gaussian average operator $$ E_m[h] = \frac 1 {2\pi}\iint h(x) e^{-|x-m|^2/2} dx_1 dx_2 .$$

We also define, for every $x\in\mathbb R^2$, the dual operator $\hat E_x[\cdot]$ given by $$ \hat E_x[h] = \frac 1 {2\pi}\iint h(im) e^{-|m+ix|^2/2} dm_1 dm_2 .$$

Proposition 3: Let $f\in\mathbb C[m_1,m_2]$ be a polynomial function that vanishes identically on the ellipse. Then the function $g(x)=\hat E_x[f]$ is a solution to the problem.

Here is the idea: the operator $E_m[\cdot]$ corresponds to convolution with a standard Gaussian distribution centered at $m\in \mathbb R^2$. The dual operator $\hat E_x[\cdot]$ is essentially an inverse operation that "unwinds" the convolution.

Remark 3.1: The functions in Proposition 3 need not be polynomials, but I decided to restrict to this case, which is actually very natural to me.

Remark 3.2: I preferred to treat this case using the theory of Hermite polynomials (more details below, in the appropriate section), and I hope I have not made any mistakes. The main advantage is that this makes the theory practical for computations (see the next section). Another advantage is that the proofs follow directly from known properties of Hermite polynomials.


Examples for solution 3: Hermite polynomials

I strongly believe that abstract solutions should come with practical examples, whenever possible.

Example 1: Let $f(m) = m_1^2 /r_1 + m_2^2/r_2 - 1$. This function vanishes identically on the ellipse. Now let $g(x) = \hat E_x [f]$.

At the end of the post, I will show that the operator $\hat E_x[\cdot]$ acts explicitly on monomials as $$\hat E_x[m_1^pm_2^q] = He_p(x_1) He_q(x_2), $$ where $He_n(t)$ denote the (probabilist's) Hermite polynomials $$ He_0(t) = 1 ,$$ $$ He_1(t) = t ,$$ $$ He_2(t) = t^2 -1, $$ $$ He_3(t) = t^3 - 3t, $$ $$ He_4(t) = t^4 - 6t^2 + 3 , \quad \text{etc.}$$
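Incidentally, these $He_n$ are exactly NumPy's `hermite_e` ("HermiteE") basis, so they can be evaluated and cross-checked directly (a sketch; the degrees and the evaluation point are arbitrary):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

def He_n(n, t):
    # hermeval(t, coeffs) evaluates sum_k coeffs[k] * He_k(t);
    # a one-hot coefficient vector picks out He_n alone.
    return hermeval(t, np.eye(n + 1)[n])

t = 1.5
for n in range(5):
    print(n, He_n(n, t))
# e.g. He_2(1.5) = 1.5^2 - 1 = 1.25 and He_4(1.5) = 1.5^4 - 6*1.5^2 + 3 = -5.4375
```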

Therefore we have that $$ g(x) = He_2(x_1) / r_1 + He_2(x_2)/ r_2 - 1 $$ is the function $g_r$ already found by Boby in the question.

Example 2: Let $f(m) = \left( m_1^2 /r_1 + m_2^2/r_2 - 1\right)^2$. This function vanishes on the ellipse identically.

We have $$ f(m) = \frac {m_1^4} {r_1^2} + \frac {m_2^4} {r_2^2} + \frac {2m_1^2 m_2^2} {r_1r_2} - \frac {2m_1^2} {r_1} - \frac {2m_2^2} {r_2} + 1.$$ Therefore, if we let $g(x) = \hat E_x[f]$, then $$ g(x) = \frac {He_4(x_1)} {r_1^2} + \frac {He_4(x_2)} {r_2^2} + \frac {2He_2(x_1)He_2(x_2)} {r_1r_2} - \frac {2He_2(x_1)} {r_1} - \frac {2He_2(x_2)} {r_2} + 1$$ is another solution to the original problem.
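Once more, a Monte Carlo sanity check that this $g$ solves the problem (a sketch with arbitrary $r_1, r_2$; the printed averages should vanish, up to noise of order $10^{-2}$ at this sample size, at every point of the ellipse $m_1^2/r_1 + m_2^2/r_2 = 1$):

```python
import numpy as np
from numpy.polynomial.hermite_e import hermeval

rng = np.random.default_rng(3)
r1, r2 = 2.0, 5.0
He2 = lambda t: hermeval(t, [0, 0, 1])        # He_2(t) = t^2 - 1
He4 = lambda t: hermeval(t, [0, 0, 0, 0, 1])  # He_4(t) = t^4 - 6t^2 + 3

def g(x):
    x1, x2 = x[..., 0], x[..., 1]
    return (He4(x1) / r1**2 + He4(x2) / r2**2
            + 2 * He2(x1) * He2(x2) / (r1 * r2)
            - 2 * He2(x1) / r1 - 2 * He2(x2) / r2 + 1)

N = 4_000_000
for theta in [0.0, 0.8, 1.6, 2.4]:
    m = np.array([np.sqrt(r1) * np.cos(theta), np.sqrt(r2) * np.sin(theta)])
    Z = m + rng.standard_normal((N, 2))
    print(f"theta = {theta}:  E[g(Z)] ≈ {g(Z).mean():+.4f}")  # ≈ 0
```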


Proof of Proposition 1

Since the two functions $g_1, g_2$ have circular symmetry, computing their averages with respect to a Gaussian centered at any point of the circle gives the same results $E_1$ and $E_2$. To see this explicitly, write down the change of variables given by the rotation of the plane that sends a point $m$ of the circle to $(R,0)$. Now, since the averages of the $g_i$ are $E_i$, the expectation of the combination $g=E_1 g_2 - E_2 g_1$ is simply $E_1E_2-E_2E_1=0$. This shows that $g$ solves the problem posed. The key point is that the expectation of a function with circular symmetry does not change if we rotate the probability measure.


Proof of Proposition 2

Let $f_r(x) = \frac {x_1^2}{r_1} + \frac {x_2^2}{r_2} $ and $ c_r= \frac 1 {r_1} + \frac 1 {r_2} +1 $.

We make the $u$-substitution $$(u_1,u_2) = (\sqrt {1+2t} x_1,\sqrt {1+2t} x_2) ,$$ so that $e^{-t|x|^2}e^{-|x|^2/2} = e^{-|u|^2/2} $.

Then, given $m=(m_1,m_2)$, the expression $e^{-t|x|^2 }e^{-|x-m|^2/2}$ is equal to: $$ = e^{-\frac 1 2 \left((1+2t)(x_1^2+x_2^2) -2m_1x_1-2m_2x_2 + m_1^2+m_2^2\right)} $$ $$ = e^{-\frac {1} {2} |u-\frac m {\sqrt{1+2t}}|^2 } e^{ -\frac {1} {2} |m|^2+\frac {1} {2} \frac{|m|^2}{1+2t}} .$$

Now, let $$d_{r,t}=(1+2t)\left(\frac {1}{r_1} + \frac {1}{r_2}\right) +1.$$ The convolution of the gaussian with $g_{r,t}$ is $$ \frac 1{2\pi} \iint g_{r,t}(x) e^{-|x-m|^2/2} dx$$ $$ = \frac 1{2\pi} \iint \left( (1+2t)^2f_r(x) -d_{r,t}\right) e^{-t|x|^2}e^{-|x-m|^2/2} dx $$ $$ = \frac 1 {2\pi} \frac 1 {1+2t} e^{-\frac t {1+2t} |m|^2} \iint \left( (1+2t)^2f_r(u/\sqrt{1+2t}) - d_{r,t}\right) e^{-\frac 1 2 \left|u-\frac m{\sqrt {1+2t}}\right|^2} du $$ $$ = \frac 1 {2\pi} \frac 1 {1+2t} e^{-\frac t {1+2t} |m|^2} \iint \left( (1+2t)f_r(u) - d_{r,t}\right) e^{-\frac 1 2 \left|u-\frac m{\sqrt {1+2t}}\right|^2} du .$$

Note that $m/\sqrt{1+2t}$ belongs to the ellipse of equation $f_{r/(1+2t)} (x)= 1$, where $r/(1+2t)$ denotes the componentwise rescaling $(r_1/(1+2t), r_2/(1+2t))$. Moreover, we have $$(1+2t)f_r(u) - d_{r,t} = f_{r/(1+2t)}(u) - c_{r/(1+2t)}.$$ In fact, we designed the function $g_{r,t}$ precisely to get to this point.

Finally, we observe that the double integral in the last displayed equation is equal to zero, because it is an instance of the solution originally given by Boby in the question, for the shrunken ellipse of equation $f_{r/(1+2t)} (x)= 1$.


Proof of Proposition 3

First, let me recall that the normal moments in one variable are given by "dual Hermite polynomials" (as mentioned in an answer to this math.SE question): $$ E_m[x_1] = m_1, \quad \quad E_m[x_2] = m_2,$$ $$ E_m[x_1^2] = m_1^2+1, \quad \quad E_m[x_2^2] = m_2^2+1,$$ $$ E_m[x_1^3] = m_1^3+3m_1, \quad \quad E_m[x_2^3] = m_2^3+3m_2,$$ $$ E_m[x_1^4] = m_1^4+6m_1^2+3, \quad \quad E_m[x_2^4] = m_2^4+6m_2^2 +3.$$

Compare with the first few (probabilist's) Hermite polynomials $$ He_0(t) = 1 ,$$ $$ He_1(t) = t ,$$ $$ He_2(t) = t^2 -1, $$ $$ He_3(t) = t^3 - 3t, $$ $$ He_4(t) = t^4 - 6t^2 + 3 .$$

For the record, the Hermite polynomials are explicitly given by the formula (here the signs alternate) $$He_n(t) = n! \sum_{k=0} ^{\lfloor n/2\rfloor} \frac {(-1)^k t^{n-2k}}{2^kk!(n-2k)!},$$ and the mixed moments of the shifted standard Gaussian are $$ E_m[x_1^p x_2^q] = i^{-p-q} He_p(im_1) He_q(im_2),$$ where $i=\sqrt {-1}$ is a choice of imaginary unit.

On the other hand, the operator $\hat E_x[\cdot]$ is given on monomials by $$\hat E_x[m_1^pm_2^q] = E_{-ix}[(im_1)^p(im_2)^q] = He_p(x_1) He_q(x_2).$$

Finally, we have the following property of Hermite polynomials (compare with Remark 4 below): $$ E_m[He_n(x_1)] = m_1^n, \quad \quad E_m[He_n(x_2)] = m_2^n . $$

By linearity, this implies that $E_m$ and $\hat E_x$ are inverse operators, in the following sense: $$ E_m [\hat E_x [f(m_1,m_2)]] = f(m_1,m_2) $$ for every polynomial function $f$.
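The one-dimensional building block of this inversion, $E_m[He_n(x_1)] = m_1^n$, can be verified symbolically (a sketch using SymPy, whose built-in `hermite` is the physicists' $H_n$, related to ours by $He_n(t) = 2^{-n/2}H_n(t/\sqrt 2)$; the range of degrees is arbitrary):

```python
import sympy as sp

x, m = sp.symbols('x m', real=True)

def He(n, t):
    # probabilist's Hermite polynomial, built from the physicists' one
    return sp.expand(2**sp.Rational(-n, 2) * sp.hermite(n, t / sp.sqrt(2)))

density = sp.exp(-(x - m)**2 / 2) / sp.sqrt(2 * sp.pi)  # N(m, 1) pdf
for n in range(5):
    val = sp.integrate(He(n, x) * density, (x, -sp.oo, sp.oo))
    print(n, sp.simplify(val))  # prints m**n
```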

Remark 4: For the record, the Hermite polynomials form a basis of the set of polynomials. This is explicitly given by the dual summation formula (the signs are all positive here!) $$ t^n = n! \sum_{k=0} ^{\lfloor n/2\rfloor} \frac { He_{n-2k}(t)}{2^kk!(n-2k)!}.$$

Remark 5: To treat more general functions, one may use the following facts:

  1. The shifted Gaussians are generating functions of the Hermite functions: $$e^{-(x-y)^2/2} = e^{-x^2/2} \sum_{k=0} ^\infty He_k(x) \frac {y^k}{k!};$$

  2. The Hermite polynomials satisfy the following orthogonality relation: $$ \int_{-\infty} ^\infty He_k(t) He_n(t) \frac {e^{-t^2/2}}{n!\sqrt {2\pi}} dt = \delta_{k,n},$$

and so $$ \int_{-\infty} ^\infty He_n (x) \frac {e^{-(x-y)^2/2}}{\sqrt {2\pi}} dx = y^n.$$
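Both of these facts are easy to confirm by numerical quadrature (a sketch; `scipy.integrate.quad` over a truncated range, with small degrees and the point $y$ chosen arbitrarily):

```python
import numpy as np
from math import factorial, sqrt, pi, exp
from numpy.polynomial.hermite_e import hermeval
from scipy.integrate import quad

He_n = lambda n, t: hermeval(t, np.eye(n + 1)[n])

# orthogonality: int He_k(t) He_n(t) e^{-t^2/2} / (n! sqrt(2 pi)) dt = delta_{k,n}
for k in range(4):
    row = [quad(lambda t: He_n(k, t) * He_n(n, t) * exp(-t**2 / 2)
                / (factorial(n) * sqrt(2 * pi)), -12, 12)[0] for n in range(4)]
    print([round(v, 6) for v in row])          # identity matrix

# reproducing identity: int He_n(x) e^{-(x-y)^2/2} / sqrt(2 pi) dx = y^n
y = 0.7
for n in range(4):
    val = quad(lambda x: He_n(n, x) * exp(-(x - y)**2 / 2) / sqrt(2 * pi),
               y - 12, y + 12)[0]
    print(n, round(val, 6))                    # ≈ 0.7**n
```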

Luca Ghidelli
  • Can you explain why $E_1 E[g_2(Z)\mid M=\mu] - E_2 E[g_1(Z)\mid M=\mu] =0$ for all $\mu \in C$? – Boby Jun 06 '21 at 20:47
  • Since the two functions have circular symmetry, it means that if we compute the averages with respect to a Gaussian centered at any point of the circle, then we get the same results $E_1$ and $E_2$. If you want to see it explicitly, you can write down the change of variables given by the rotation of the plane that sends $\mu$ to $(1,0)$. Now, since the averages of $g_i$ are $E_i$, we see that the expectation of their combination is always $E_1 E_2 - E_2 E_1=0$. The key point here is that the expectation of a function with circular symmetry doesn't change if we rotate the probability measure. – Luca Ghidelli Jun 07 '21 at 05:50
  • Ok. I agree and I follow this now. But what about the general ellipse? On the ellipse, the phase and the magnitude cannot be decoupled. – Boby Jun 07 '21 at 18:35
  • Hi, sorry that I didn't answer earlier, I have been a bit busy lately. I can find many solutions for ellipses as well, for example using functions of the form "g=h*p", where h is some Gaussian centered at zero (just to make it satisfy the growth condition) and p is some polynomial (because polynomials are related to the nth moments of the Gaussian; in your example you used the second moment). These polynomials are in general not "elliptically symmetric"; it is something else... – Luca Ghidelli Jun 16 '21 at 13:09
  • I would like to see this topic in more detail, can we discuss about it in https://chat.stackexchange.com/rooms/126537/standard-gaussian-equiweightedness ? – Luca Ghidelli Jun 16 '21 at 13:09
  • Ok. Let me check this out. – Boby Jun 16 '21 at 13:11
  • @Boby: Would you mind checking the answer of Luca below? If it's okay, I will mark it as the accepted answer. – Paresseux Nguyen Jun 17 '21 at 06:46
  • @ParesseuxNguyen Sure. I will go over it. – Boby Jun 17 '21 at 14:19
  • @Boby: So Luca's answer is amazing, did I understand correctly? – Paresseux Nguyen Jun 18 '21 at 19:05
  • @ParesseuxNguyen Yes, it's a very nice answer. – Boby Jun 20 '21 at 19:09

This will not answer the question as stated, but whoever asks such a question may be interested in what I say here.

First, a random variable with a "Gaussian" or "normal" distribution need not have variance $1,$ so I would have stated what the variance is.

In standard terminology, the question is asking which unbiased estimators of zero this parameterized family of probability distributions has (where the ellipse is the parameter space).

A family of probability distributions that admits no unbiased estimators of zero is called "complete", so the question is how far this family is from being complete.

The family of all univariate Gaussian distributions, whose parameter space may be taken to be the half-plane $\{(\mu,\sigma) : \mu\in\mathbb R,\, \sigma>0\},$ is complete, and that fact is essentially the same thing as the one-to-one nature of the two-sided Laplace transform.

A corollary of the completeness of the family of all Gaussian distributions is that, with an i.i.d. sample, the sample mean has a smaller variance than every other unbiased estimator of the population mean. For example, think about how to prove that the sample median has a larger variance than the sample mean: with this result, you get that without computing the variance of the sample median.
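A quick simulation illustrates the mean-versus-median comparison (a sketch; the sample size $n=25$, the mean $3$, and the number of replications are arbitrary choices; asymptotically $\operatorname{var}(\text{median}) \approx \pi/(2n) > 1/n = \operatorname{var}(\text{mean})$ for unit-variance Gaussians):

```python
import numpy as np

rng = np.random.default_rng(4)
n, reps = 25, 200_000                       # arbitrary choices
X = 3.0 + rng.standard_normal((reps, n))    # i.i.d. N(3, 1) samples

print("var(sample mean)   ≈", X.mean(axis=1).var())        # ≈ 1/25  = 0.040
print("var(sample median) ≈", np.median(X, axis=1).var())  # ≈ pi/50 ≈ 0.063
```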

Michael Hardy
  • Two questions: 1) Do you know if there are "completeness" results for this specific family or a related family in $\mathbb{R}^2$? 2) Can you give a good reference for this? – Boby Jun 20 '21 at 19:09
  • Look at the theory-of-statistics textbooks by de Groot & Schervish and by Casella & Berger. The question points out that this family of distributions is not complete. The answer seems to show that it is not as far from complete as it might have been as far as the original question indicated. – Michael Hardy Jun 20 '21 at 22:40