I'm doing some thesis work on lattices, but probability theory is not my strong suit and I'm not sure how to solve this problem:

I have vectors $\mathbf{a}_i, \mathbf{x} \in \mathbb{R}^n$, and scalars $b_i \in \mathbb{R}$ for $0 < i \leq k$. Each scalar and each coordinate in each vector is drawn independently from an $N(0,\sigma^2)$ distribution.

Now I have $k$ random variables:

$$X_i = \mathbf{a}_i \cdot \mathbf{x} + b_i$$

where $\cdot$ is the scalar product. I am interested in

$$Y = \sqrt{ \sum_{i = 1}^k X_i^2 } .$$

Specifically, I would like to know the probability that $Y$ exceeds a certain bound $L$.

My thinking so far:

As far as I can tell, $X_i \sim N(0, n\sigma^4 + \sigma^2)$ approximately, since it is the sum of $n$ products of two Gaussian variables plus one Gaussian variable ($n$ is large, so we apply the central limit theorem). This led me to the chi-square distribution with a non-standard sigma, as in this question here, which appears to mean that $Y^2$ would follow a gamma distribution.

However, the $X_i$ are not independent because of the shared $\mathbf{x}$, are they? So I found this, which appears to be what I'm looking for (let me know if I'm wrong). Now I'm having trouble with the covariance matrix.

$$\mathrm{Cov}[X_i,X_j] = \mathrm{E}[X_i X_j] = \mathrm{E}[(\mathbf{a}_i \cdot \mathbf{x} + b_i)(\mathbf{a}_j \cdot \mathbf{x} + b_j)] $$

When I work out the sum I get a huge mess, but what stands out is that you get products between the components of $\mathbf{a}_i$, $\mathbf{a}_j$, and $\mathbf{x}$, and only the components of $\mathbf{x}$ can appear more than once (squared) in the same term. Since all these components are independent and have mean 0, is the expected value of each term without a squared component 0? And for $i \neq j$, even the terms with a squared component vanish:

$$ \mathrm{E}[a_{i1} a_{j1} x_1^2] = \mathrm{E}[a_{i1}] \mathrm{E}[a_{j1}]\mathrm{E}[x_1^2] = 0$$

right? So am I correct in thinking that the $X_i$:

  1. Can be seen as following a multivariate normal distribution?

  2. Are dependent but happen to have a covariance of 0, so that my covariance matrix $\Sigma$ is a diagonal matrix whose diagonal elements are all equal to $n\sigma^4 + \sigma^2$?
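For reference, this is how I expand the covariance. It is only a sketch, and it assumes that all components of $\mathbf{a}_i$ and $\mathbf{x}$ and all $b_i$ are mutually independent with mean 0 and variance $\sigma^2$. The indices $l, m$ and the Kronecker delta $\delta_{lm}$ are my own notation. The cross terms containing a single $b_i$ or $b_j$ drop out because each $b$ has mean 0 and is independent of everything else, which leaves

$$\mathrm{E}[X_i X_j] = \sum_{l=1}^n \sum_{m=1}^n \mathrm{E}[a_{il} a_{jm}] \, \mathrm{E}[x_l x_m] + \mathrm{E}[b_i b_j] .$$

For $i \neq j$, $\mathrm{E}[a_{il} a_{jm}] = \mathrm{E}[a_{il}] \mathrm{E}[a_{jm}] = 0$ and $\mathrm{E}[b_i b_j] = 0$, so $\mathrm{Cov}[X_i, X_j] = 0$. For $i = j$, $\mathrm{E}[a_{il} a_{im}] = \sigma^2 \delta_{lm}$ and $\mathrm{E}[x_l^2] = \sigma^2$, so

$$\mathrm{Var}[X_i] = \sum_{l=1}^n \sigma^2 \cdot \sigma^2 + \sigma^2 = n\sigma^4 + \sigma^2 .$$

Zero covariance only implies independence if the $X_i$ are jointly normal, which is exactly what point 1 asks.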

Now I'm unable to entirely follow the answer in the second link I gave. $\Sigma$ is already diagonal, so that should simplify things quite a bit. I can't tell what $\Lambda$ becomes. I think the diagonal elements become $\lambda_i = n\sigma^4 + \sigma^2$ (really not sure about this part)? So the quadratic form is a linear combination of $k$ independent chi-square variables with 1 degree of freedom. So for $X = (X_1, \ldots, X_k)$ and independent $Z_i \sim N(0,1)$:

$$Y^2 = Q(X) = \sum^k_{i = 1} \lambda_i Z_i^2 = (n\sigma^4 + \sigma^2) \sum^k_{i = 1} Z_i^2$$

This feels strange to me. Am I on the right path?
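One way to sanity-check this would be a small Monte Carlo experiment. The sketch below uses only the Python standard library. The function names `simulate_tail` and `scaled_chi2_tail` and the parameter values are hypothetical choices of mine, not anything from the linked answers. It draws $\mathbf{x}$ once per trial and fresh $\mathbf{a}_i, b_i$ for each $i$, estimates $P(Y > L)$, and compares it with the scaled chi-square guess above. To avoid extra dependencies, the comparison uses the closed-form chi-square tail that holds for even $k$ only.

```python
import math
import random


def simulate_tail(n, k, sigma, L, trials, seed=0):
    """Monte Carlo estimate of P(Y > L), Y = sqrt(sum_i (a_i . x + b_i)^2)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # x is drawn once per trial and shared by all k variables X_i.
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        y_sq = 0.0
        for _ in range(k):
            # a_i and b_i are drawn fresh for every i.
            dot = sum(rng.gauss(0.0, sigma) * xj for xj in x)
            y_sq += (dot + rng.gauss(0.0, sigma)) ** 2
        if y_sq > L * L:
            hits += 1
    return hits / trials


def scaled_chi2_tail(n, k, sigma, L):
    """P(s * chi2_k > L^2) with s = n*sigma^4 + sigma^2 (even k only)."""
    if k % 2 != 0:
        raise ValueError("the closed form used here needs even k")
    s = n * sigma**4 + sigma**2
    half = L * L / (2.0 * s)
    # For even k: P(chi2_k > t) = exp(-t/2) * sum_{j < k/2} (t/2)^j / j!
    return math.exp(-half) * sum(
        half**j / math.factorial(j) for j in range(k // 2)
    )


if __name__ == "__main__":
    # Illustrative parameters only (assumed, not from the question).
    n, k, sigma = 50, 4, 1.0
    L = 2.0 * math.sqrt(k * (n * sigma**4 + sigma**2))
    print("simulated :", simulate_tail(n, k, sigma, L, trials=2000))
    print("chi2 guess:", scaled_chi2_tail(n, k, sigma, L))
```

A persistent gap between the two numbers, beyond Monte Carlo noise, would suggest that the dependence through the shared $\mathbf{x}$ matters and that treating the $X_i$ as independent normals is not safe.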

  • Normal distributions can take on non-integer values, so it actually doesn't make any sense to say that your scalars and coordinates are drawn from a normal distribution. – Clarinetist Jun 15 '16 at 20:26
  • @Clarinetist Ah yes, you're right. In practice I'm using the discrete binomial distribution, but with large sample size. I'll edit the question to change Z to R. – bkjvbx Jun 15 '16 at 21:54
  • The difficulty here is working with the dot product $\mathbf{a}_i \cdot \mathbf{x}$. I would suggest looking at [this](http://math.stackexchange.com/a/397716/81560). When you add $b_i$ to this $\chi^2$ random variable, I'm not sure what that results in. – Clarinetist Jun 16 '16 at 11:48
  • Personally, I think this is a good situation for a simulation, but I imagine that this work is much more theoretically focused. – Clarinetist Jun 16 '16 at 11:48
  • @Clarinetist Hmm, I see what you mean. The use of the classical central limit theorem there is not valid, and I don't have the knowledge to evaluate the conditions for the other ones. I think you're right, a simulation is probably a good idea. For the sake of this problem, though, assuming $X_i$ is indeed normal, is the rest of the logic sound? – bkjvbx Jun 16 '16 at 12:49
  • Now there's where it gets - still - quite difficult. The $X_i$ aren't necessarily independent, so $\sum_i X_i^2$ isn't going to lead to anything very nice - if they were independent, the sum would be a $\chi^2_k$ random variable, and the problem would essentially be equivalent to finding the probability that $Y > L$, from which you would have to consider what happens in $(-1, 1)$ and elsewhere in $\mathbb{R}$. – Clarinetist Jun 16 '16 at 13:28
  • [This](http://math.stackexchange.com/questions/442472/sum-of-squares-of-dependent-gaussian-random-variables) *might* be helpful. – Clarinetist Jun 16 '16 at 13:31
  • @Clarinetist That's the second link I linked, and what I tried to follow in my calculations :) I ended up with 0 covariance between the $X_i$, which makes it a much simpler case of what's handled in the link. I end up with $Y^2 \sim \Gamma(k/2, 2n\sigma^4 + 2\sigma^2)$ (assuming $X_i$ is normal), which is the same result as when the $X_i$ are independent. But I'm not sure if the reasoning is valid. I'll probably just have to do a simulation like you suggested. In any case, thanks for your feedback! I'd upvote all your comments but unfortunately don't have the privileges. – bkjvbx Jun 17 '16 at 09:05
