36

I want to generate random numbers within a range (n to m, e.g. 100 to 150), but instead of purely random numbers I want the results to follow a normal distribution.

By this I mean that in general I want the numbers "clustered" around 125.

I've found this random number package that seems to have a lot of what I need: http://codeproject.com/KB/recipes/Random.aspx

It supports a variety of random generators (including the Mersenne Twister) and can apply the generator to a distribution.

But I'm confused: if I use a normal distribution generator, the random numbers range from roughly -6 to +8 (apparently the true range is float.min to float.max).

How do I scale that to my required range?

Peter O.
ConfusedAgain

6 Answers

29

A standard normal distribution has mean 0 and standard deviation 1; if you want a distribution with mean m and standard deviation s, simply multiply by s and then add m. Since the normal distribution is theoretically unbounded, you can't have a hard cap on your range (e.g. 100 to 150) without explicitly rejecting numbers that fall outside of it, but with an appropriate choice of standard deviation you can be assured that (e.g.) 99% of your numbers will be within the range.

About 99.7% of a population is within +/- 3 standard deviations, so if you pick yours to be about (25/3), it should work well.

So you want something like: (normal * 8.333) + 125
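
As a rough sketch of the scale-then-reject idea described above (not part of the linked library): the helper names `NextStandardNormal` and `NextNormalInRange` are made up for illustration, and the standard normal sample is produced with a Box-Muller step on `System.Random` as a stand-in for whatever generator you actually use. Note that redrawing out-of-range values technically gives a truncated normal rather than a true normal.

```csharp
using System;

// Hypothetical helper: one standard normal sample (mean 0, sd 1) via Box-Muller.
static double NextStandardNormal(Random rng)
{
    double u1 = 1.0 - rng.NextDouble(); // in (0, 1], so Math.Log never sees 0
    double u2 = rng.NextDouble();
    return Math.Sqrt(-2.0 * Math.Log(u1)) * Math.Cos(2.0 * Math.PI * u2);
}

// Hypothetical helper: centre on the midpoint of the range, put the limits
// 3 standard deviations out, and redraw anything that still lands outside.
static double NextNormalInRange(Random rng, double min, double max)
{
    double mean = (min + max) / 2.0; // 125 for 100..150
    double sd = (max - min) / 6.0;   // 25/3 for 100..150
    double value;
    do
    {
        value = NextStandardNormal(rng) * sd + mean;
    } while (value < min || value > max); // roughly 0.3% of draws are redrawn
    return value;
}

var random = new Random();
for (int i = 0; i < 5; i++)
{
    Console.WriteLine(NextNormalInRange(random, 100, 150));
}
```

If a hard limit is not needed, the loop can be dropped and the scaled value used directly.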

tzaman
14

For the sake of interest, it's pretty straightforward to generate normally distributed random numbers from a uniform RNG (though it must be done in pairs):

Random rng = new Random();
// 1 - NextDouble() lies in (0, 1], which avoids Math.Log(0)
double r = Math.Sqrt(-2 * Math.Log(1.0 - rng.NextDouble()));
double θ = 2 * Math.PI * rng.NextDouble();
double x = r * Math.Cos(θ);
double y = r * Math.Sin(θ);

x and y now contain two independent, normally distributed random numbers with mean 0 and variance 1. You can scale and translate them as necessary to get the range you want (as interjay explains).


Explanation:

This method is called the Box–Muller transform. It uses the property of the two-dimensional unit Gaussian that the density value itself, p = exp(-r^2/2), is uniformly distributed between 0 and 1 (normalisation constant removed for simplicity).

Since you can generate such a value easily using a uniform RNG, you end up with a circular contour of radius r = sqrt(-2 * log(p)). You can then generate a second uniform random variate between 0 and 2*pi to give you an angle θ that defines a unique point on your circular contour. Finally, you can generate two i.i.d. normal random variates by transforming from polar coordinates (r, θ) back into cartesian coordinates (x, y).

This property – that p is uniformly distributed – doesn't hold for other dimensionalities, which is why you have to generate exactly two normal variates at a time.
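
For readers who want the step behind that claim, here is a short derivation sketch. The symbols X, Y (the two standard normal variates), R and Θ (their polar coordinates) and t (a threshold between 0 and 1) are introduced here for illustration and do not appear in the answer above.

```latex
\begin{aligned}
f_{X,Y}(x,y) &= \frac{1}{2\pi}\, e^{-(x^2+y^2)/2} \\
f_{R,\Theta}(r,\theta) &= \frac{1}{2\pi}\, r\, e^{-r^2/2},
  \qquad r \ge 0,\ 0 \le \theta < 2\pi \\
P(R > r) &= \int_r^\infty s\, e^{-s^2/2}\, ds = e^{-r^2/2} \\
P\!\left(e^{-R^2/2} \le t\right) &= P\!\left(R \ge \sqrt{-2\ln t}\right) = t,
  \qquad 0 < t \le 1
\end{aligned}
```

The last line says that p = exp(-R^2/2) is uniform on (0, 1], and the polar density factorises into a part that depends only on r and a constant 1/(2π) in θ, so the angle is uniform and independent of the radius. In d dimensions the radial density is proportional to r^(d-1) exp(-r^2/2), and the tail integral only collapses to exp(-r^2/2) when d = 2.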

Will Vousden
  • Interesting. Does this method have a name? I'd like to read more about this. This is an approximation, right? – Drew Noakes Jan 02 '11 at 08:42
  • 1
    @Drew: It's called the Box-Muller transform: http://en.wikipedia.org/wiki/Box%E2%80%93Muller_transform – Will Vousden Jan 02 '11 at 09:16
  • 1
    Thanks. I noted here (http://stackoverflow.com/questions/2325472/generate-random-numbers-following-a-normal-distribution-in-c-c/2325531#2325531) the suggestion that you may hold onto `u2` and use it as `u1` for the following call, as an optimisation. There's no mention of this on the Wikipedia article. Can you comment as to whether this maintains randomness? – Drew Noakes Jan 03 '11 at 05:32
  • 1
    The implementation of Java's `Random.nextGaussian()` looks more like what the other answer was referring to http://download.oracle.com/javase/1.4.2/docs/api/java/util/Random.html#nextGaussian() -- it also claims to be a Box-Muller transform, but the code looks quite different to that in your answer. Are you showing the Cartesian approach? – Drew Noakes Jan 03 '11 at 05:40
  • Drew: From what I understand, the two Gaussian randoms generated are completely independent, but if you need to be sure, you should check out a better source. :-) – jvriesem Feb 06 '14 at 02:49
  • @jvriesem Yep, the Box-Muller transform maps a pair of independent uniformly distributed random numbers to a pair of independent normally distributed random numbers. – Will Vousden Feb 06 '14 at 09:43
  • @DrewNoakes This is a bit old now, but yes, randomness is maintained, since the samples are independent of one another. Basically, the transform uses the property of the two-dimensional Gaussian that the density value itself is uniformly distributed. This doesn't hold for other dimensionalities, which is why you have to generate exactly two values at a time. – Will Vousden Feb 06 '14 at 09:46
  • Note that the Box-Muller transform truncates the tails of the distribution. – Richard May 26 '16 at 15:16
4

tzaman's answer is correct, but when using the library you linked there is a simpler way than performing the calculation yourself: The NormalDistribution object has writable properties Mu (meaning the mean) and Sigma (standard deviation). So going by tzaman's numbers, set Mu to 125 and Sigma to 8.333.

interjay
2

This may be too simplistic for your needs, but a quick & cheap way to get a random number with a distribution that's weighted toward the center is to simply add 2 (or more) random numbers.

Think of when you roll two 6-sided dice and add them. The sum is most often 7, then 6 and 8, then 5 and 9, etc. and only rarely 2 or 12.
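
A minimal sketch of the same idea with uniform random numbers instead of dice, assuming `System.Random` as the source. The helper name `ApproxStandardNormal` is made up for illustration. The sum of n uniform(0, 1) values has mean n/2 and variance n/12, so subtracting and dividing by those gives something with mean 0 and standard deviation 1 that is only approximately normal.

```csharp
using System;

// Hypothetical helper: sum n uniform(0, 1) samples, then rescale the sum
// (mean n/2, variance n/12) to mean 0 and standard deviation 1.
static double ApproxStandardNormal(Random rng, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
    {
        sum += rng.NextDouble();
    }
    return (sum - n / 2.0) / Math.Sqrt(n / 12.0);
}

var random = new Random();
// With n = 12 the divisor is exactly 1 and the result always lies in [-6, 6].
double value = ApproxStandardNormal(random, 12) * (25.0 / 3.0) + 125.0;
Console.WriteLine(value);
```

Unlike a true normal, the result is bounded, but the bound (6 standard deviations for n = 12) is wider than the 100 to 150 range, so an explicit range check is still needed if the limits are hard.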

dkamins
  • Central Limit Theorem means that adding uniforms will approximate a normal distribution, but it's hackish, and hard to keep track of the variance. – tzaman May 01 '10 at 23:48
  • 1
    It also requires taking an arbitrarily large number of samples for a given approximation to a normal distribution. – Will Vousden May 02 '10 at 00:01
0

Here's another algorithm that doesn't need to calculate Sin/Cos, nor does it need to know Pi. Don't ask me about the theoretical background. I found it somewhere once and it's what I've been using since. I suspect it's some kind of normalisation of the same Box-Muller transform that @Will Vousden mentions. It also produces results in pairs.

The example is VBScript; easy enough to convert into any other language.

Sub calcRandomGauss (byref y1, byref y2)
    Dim x1, x2, w
    Do
        x1 = 2.0 * Rnd() - 1.0
        x2 = 2.0 * Rnd() - 1.0
        w = x1 * x1 + x2 * x2
    Loop While w >= 1.0 Or w = 0  'edited this line, thanks Richard

    w = Sqr((-2.0 * Log(w)) / w )
    y1 = x1 * w
    y2 = x2 * w
End Sub
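
For comparison, a possible C# port of the routine above, written as a sketch rather than a reference implementation. The name `NextGaussianPair` and the tuple return are choices made here; the loop condition matches the corrected VBScript line.

```csharp
using System;

// Sketch of the polar method: returns two independent standard normal samples.
static (double, double) NextGaussianPair(Random rng)
{
    double x1, x2, w;
    do
    {
        x1 = 2.0 * rng.NextDouble() - 1.0; // uniform in [-1, 1)
        x2 = 2.0 * rng.NextDouble() - 1.0;
        w = x1 * x1 + x2 * x2;
    } while (w >= 1.0 || w == 0.0); // reject points outside the unit circle and the origin

    double factor = Math.Sqrt(-2.0 * Math.Log(w) / w);
    return (x1 * factor, x2 * factor);
}

var random = new Random();
var (y1, y2) = NextGaussianPair(random);
Console.WriteLine($"{y1}, {y2}");
```

About 21% of candidate points fall outside the unit circle and are redrawn, which is the price paid for avoiding the Sin/Cos calls.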
mgr326639
  • 1
    "Don't ask me about the theoretical background. I've found it somewhere once". That's how incorrect knowledge propagates. This is a bad implementation of the Marsaglia polar method. You need to loop while `w>=1.0 OR w==0`. Otherwise, you risk taking `log(0)` and blowing your program up. – Richard May 26 '16 at 15:22
  • 1
    I'd say on the contrary. Being honest about not having any reference was meant to prevent incorrect knowledge from propagating. Thanks for the reference and the correction. – mgr326639 May 27 '16 at 22:17
0

A different approach to this problem uses the beta distribution (which does have a hard range, unlike the normal distribution) and involves choosing the appropriate parameters such that the distribution has the given mean and standard deviation (square root of variance). See this question.
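
One way to choose those parameters is the method of moments, sketched below under the assumption that the beta distribution is stretched linearly from [0, 1] onto [min, max]. The helper name `BetaShapeParameters` is made up for illustration, and the sketch only computes the two shape parameters; drawing the samples still needs a beta sampler from a statistics library, since `System.Random` does not provide one.

```csharp
using System;

// Hypothetical helper: method-of-moments shape parameters for a beta
// distribution stretched over [min, max] with the requested mean and sd.
static (double Alpha, double Beta) BetaShapeParameters(double min, double max, double mean, double sd)
{
    double width = max - min;
    double m = (mean - min) / width;        // mean rescaled to [0, 1]
    double v = (sd / width) * (sd / width); // variance rescaled to [0, 1]
    if (m <= 0.0 || m >= 1.0 || v <= 0.0 || v >= m * (1.0 - m))
    {
        throw new ArgumentException("No beta distribution matches this mean and standard deviation on the given range.");
    }
    double nu = m * (1.0 - m) / v - 1.0;
    return (m * nu, (1.0 - m) * nu);
}

var shape = BetaShapeParameters(100, 150, 125, 25.0 / 3.0);
Console.WriteLine($"alpha = {shape.Alpha}, beta = {shape.Beta}");
```

For the 100 to 150 example with mean 125 and standard deviation 25/3, both parameters come out at about 4. A sample x drawn from that beta distribution would then map to min + (max - min) * x.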

Peter O.