
I am trying to run some tests of convergence of Random.
The average of 0, 1, 2 is 1, so in the long run I would expect the sum of (value minus 1) to go to zero.
It is not going to zero, even with UInt16.MaxValue iterations.
I am off by about 300.

Am I not running a big enough sample? Or is Random off?
Or is something wrong with my program?

// private static System.Security.Cryptography.RNGCryptoServiceProvider rngCsp = new System.Security.Cryptography.RNGCryptoServiceProvider();
// this was not better than Random

Random ran = new Random();
public void RollEm()
{
    int sum = 0;
    int[] sums = new int[3];
    int val;
    int minVal = 0;
    int maxVal = 0;
    for (UInt32 i = 0; i < Int16.MaxValue; i++)
    {
        val = ran.Next(3);              // uniform draw from 0, 1 or 2
        //val = (int)RollDice(3) - 1;
        sums[val]++;                    // tally how many 0s, 1s and 2s were rolled
        sum += val;
        sum--;                          // subtract the expected value 1, so sum tracks the deviation
        if (sum < minVal)
            minVal = sum;
        if (sum > maxVal)
            maxVal = sum;

        if(i % 100 == 0)
            System.Diagnostics.Debug.WriteLine("i {0}  sum {1}  val {2}", i, sum, val);
        //System.Threading.Thread.Sleep(10);
    }
    System.Diagnostics.Debug.WriteLine("sum {0}   min {1}  max {2}", sum, minVal, maxVal);
    System.Diagnostics.Debug.WriteLine("sums {0}  {1}   {2}", sums[0], sums[1], sums[2]);
}

I agree with the comments and answers that the average, not the sum, should go to zero. I am using this for poker bankroll analysis, so the sum is what mattered to me.

paparazzo
  • Random generates Int32 numbers. `Next(3)` maps that to a range of 0-3 [using double arithmetic](http://referencesource.microsoft.com/#mscorlib/system/random.cs,186). You can get that discrepancy simply due to rounding errors – Panagiotis Kanavos Aug 10 '16 at 11:54
  • @PanagiotisKanavos Next(3) is supposed to map to 0,1,2 – paparazzo Aug 10 '16 at 11:56
  • Which it does by converting that Int32 to a double between 0..1 and multiplying by 3. That is guaranteed to generate scaling errors during division and multiplication *and* rounding errors when converting the double back to an int. I put the link to the source in the previous comment – Panagiotis Kanavos Aug 10 '16 at 11:58
  • @Paparazzi: What they are saying is that because it's expanding a floating-point number to that range, you have some values that are biased over the others, since the number of possible floating-point values is not evenly divisible by three. See, e.g. [this answer](http://stackoverflow.com/a/11758872/73070) for an explanation. So yes, it maps to [0, 3), but not uniformly so. – Joey Aug 10 '16 at 11:59
  • @RenéVogt Yes, I am getting a sum of the deviation from the expected value. That is my intent. – paparazzo Aug 10 '16 at 12:06
  • @Joey Refer to the documentation. Next(3) returns [0,1,2] – paparazzo Aug 10 '16 at 12:07
  • @Paparazzi ok, but that sum is not a value to indicate the accuracy of `Random`. The average is a better criterion, as is the [standard deviation](https://en.wikipedia.org/wiki/Standard_deviation) – René Vogt Aug 10 '16 at 12:09
  • The sum should grow like the square root of the number of samples (a sketch illustrating this follows these comments). – CodesInChaos Aug 11 '16 at 16:21
  • @CodesInChaos What is your basis for that? – paparazzo Aug 11 '16 at 16:23
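To illustrate the square-root claim in that comment, here is a minimal sketch (the sample sizes and variable names are my own, and it assumes it runs inside some method with `using System;` in place) that repeats the experiment at increasing sample sizes:

Random ran = new Random();
foreach (int n in new[] { 10000, 100000, 1000000, 10000000 })
{
    long sum = 0;
    for (int i = 0; i < n; i++)
        sum += ran.Next(3) - 1;   // deviation of one roll from the expected value 1

    // |sum| typically lands within a small multiple of sqrt(n); it does not converge to 0
    Console.WriteLine("n {0,10}  sum {1,6}  sqrt(n) {2,8:F0}", n, sum, Math.Sqrt(n));
}

A single run can land above or below, so averaging |sum| over several repetitions gives a cleaner picture, but the square-root-like growth is already visible.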

2 Answers


I ran some tests with that (`r` being a `Random` instance):

Console.WriteLine(Enumerable.Range(0, Int16.MaxValue).Average(i => r.Next(3) - 1));

Three consecutive calls gives this output:

-0,0075991088595233
0,00729392376476333
-0,00524918362987152

So you see the average is almost 0.

But three consecutive calls to

Console.WriteLine(Enumerable.Range(0, Int16.MaxValue).Sum(i => r.Next(3) - 1));

gives

38
133
146

This means that among Int16.MaxValue random numbers there were only, for example, 133 more +1s than -1s. That's why the average is not exactly 0 but a little off.

So the "mistake" in your code is that you are checking the sum instead of the average. You need to divide the 300 you are off by by Int16.MaxValue.

You probably thought that subtracting 1 would make the sum go to zero because the +1s and -1s should cancel each other out. But consider the unshifted case: if you worked with the values 1 to 3, you would expect the average to be 2, yet the sum would not be 2 but a really large value. The same holds after shifting by 1: it is the average, not the sum, that settles near the expected value.
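As a minimal sketch of that division (assuming `System.Linq` is in scope and `r` is the same `Random` instance as above), you can materialize one set of rolls so the sum and the average are computed over exactly the same data:

int n = Int16.MaxValue;
int[] rolls = Enumerable.Range(0, n).Select(i => r.Next(3) - 1).ToArray();

int sum = rolls.Sum();                 // typically within a few hundred of 0
double average = (double)sum / n;      // the same value Average() reports

Console.WriteLine("sum {0}  sum/n {1}  Average {2}", sum, average, rolls.Average());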

René Vogt
  • I hear you, but I would expect the sum to go to zero over a large number of runs since I subtract the average on every iteration (`sum--;`). Nice use of Enumerable. – paparazzo Aug 10 '16 at 12:01
  • As your code reads now, you are subtracting 1 from sum in each iteration, so you effectively subtract `Int16.MaxValue` from your sum. In fact, the value you are off by should increase with the number of iterations. If you had 1,000,000,000,000,000,000 iterations it would be very improbable that there are _exactly_ 1/3 `-1`, 1/3 `0` and 1/3 `+1`. The deviation would be much larger than only 300. – René Vogt Aug 10 '16 at 12:04
  • @Paparazzi 65K isn't a large number. I suggest you use one of the LINQ one-liners used in all these answers with the aggregate function you want (Count, average, stddev through Math.Net) and check with larger numbers - even 65M (ie 65K * 100) takes seconds; a sketch along these lines follows this comment thread. You'll see that the difference is small and constantly changing. The built-in Random algorithm isn't great but it's not that bad either – Panagiotis Kanavos Aug 10 '16 at 12:04
  • @PanagiotisKanavos I let it go up to Int32.MaxValue and it was not much better. – paparazzo Aug 10 '16 at 12:12
  • @Paparazzi it won't get better, that's basic stochastics. Only the average should go to zero, but not your sum. As I said, I would even expect that sum to increase with more iterations. – René Vogt Aug 10 '16 at 12:15
  • @RenéVogt OK, I agree that the average is a better way of looking at it. – paparazzo Aug 10 '16 at 12:19
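Along the lines of the comments above (judge `Random` by average and standard deviation, and use a larger sample), here is a hedged sketch with a sample size of my own choosing. For a fair draw from {-1, 0, +1} the theoretical mean is 0 and the theoretical standard deviation is sqrt(2/3), about 0.816:

Random r = new Random();
const int n = 10000000;            // 10 million draws, per the "use larger numbers" suggestion
long sum = 0, sumSq = 0;

for (int i = 0; i < n; i++)
{
    int d = r.Next(3) - 1;         // -1, 0 or +1
    sum += d;
    sumSq += d * d;
}

double mean = (double)sum / n;
double variance = (double)sumSq / n - mean * mean;

// Expect mean near 0 and standard deviation near sqrt(2/3), about 0.816
Console.WriteLine("mean {0}  stdDev {1}  expected stdDev {2}", mean, Math.Sqrt(variance), Math.Sqrt(2.0 / 3.0));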

No, Random is not off. It just means that over the whole 65536 numbers you summed, about 300 more were 2 than were 0. That's actually not that many compared to the total number of dice rolls you did there. The arithmetic mean of all the numbers you rolled is only guaranteed to hit exactly 1 in the limit of infinitely many dice rolls. You already have an average of 1 accurate to two decimal places, which is certainly good enough here.
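To put a rough number on "not that many" (this back-of-the-envelope check is mine, not part of the original answer): each shifted roll has variance 2/3, so the sum of n rolls has a standard deviation of sqrt(2n/3). For 65536 rolls that is about 209, so an offset around 300 is only about 1.4 standard deviations, which is unremarkable for a single run:

int n = 65536;                               // number of rolls discussed above
double sigma = Math.Sqrt(2.0 * n / 3.0);     // std dev of the sum of (roll - 1), about 209 here

double observedOffset = 300;                 // the "off by about 300" from the question
Console.WriteLine("sigma {0:F0}  offset in sigmas {1:F1}", sigma, observedOffset / sigma);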

Joey