
Consider the following code:

private static void Main(string[] args)
{
    var ar = new double[]
    {
        100
    };

    FillTo(ref ar, 5);
    Console.WriteLine(string.Join(",", ar.Select(a => a.ToString()).ToArray()));
}

public static void FillTo(ref double[] dd, int N)
{
    if (dd.Length >= N)
        return;

    double[] Old = dd;
    double d = double.NaN;
    if (Old.Length > 0)
        d = Old[0];

    dd = new double[N];

    for (int i = 0; i < Old.Length; i++)
    {
        dd[N - Old.Length + i] = Old[i];
    }
    for (int i = 0; i < N - Old.Length; i++)
        dd[i] = d;
}

The result in Debug mode is: 100,100,100,100,100. But in Release mode it is: 100,100,100,100,0.

What is happening?

This was tested on .NET Framework 4.7.1 and .NET Core 2.0.0.
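For reference, the result that the C# semantics call for can be pinned down with an independent sketch. The helper below is hypothetical (the name `PadLeft` and the use of `Array.Copy` are assumptions, not part of the original code); it left-pads the array with its first element, which is what `FillTo` is meant to do. It is not offered as a workaround for the behaviour described above.

```csharp
using System;

public static class FillToReference
{
    // Hypothetical helper: returns 'source' left-padded to length n,
    // using its first element (or NaN for an empty array) as the fill value.
    public static double[] PadLeft(double[] source, int n)
    {
        if (source.Length >= n)
            return source;

        double fill = source.Length > 0 ? source[0] : double.NaN;
        var result = new double[n];
        int pad = n - source.Length;

        // Fill the leading 'pad' slots.
        for (int i = 0; i < pad; i++)
            result[i] = fill;

        // Copy the original elements to the tail.
        Array.Copy(source, 0, result, pad, source.Length);
        return result;
    }

    public static void Main()
    {
        double[] padded = PadLeft(new double[] { 100 }, 5);
        Console.WriteLine(string.Join(",", padded)); // prints 100,100,100,100,100
    }
}
```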

Ashkan Nourzadeh
  • Which version of Visual Studio (or the compiler) do you use? – Styxxy Dec 01 '17 at 11:01
  • Repro; adding a `Console.WriteLine(i);` into the final loop (`dd[i] = d;`) "fixes" it, which suggests a compiler bug or JIT bug; looking into the IL... – Marc Gravell Dec 01 '17 at 11:03
  • @Styxxy, tested on vs2015, 2017 and targeted every .net framework >= 4.5 – Ashkan Nourzadeh Dec 01 '17 at 11:03
  • Definitely a bug. It also disappears if you remove `if (dd.Length >= N) return;`, which may be a simpler repro. – Jeroen Mostert Dec 01 '17 at 11:04
  • @MarcGravell even adding Thread.Sleep(1) fixes it in release mode. – Ashkan Nourzadeh Dec 01 '17 at 11:04
  • And I for some reason cannot reproduce - always outputs "100,100,100,100,100" in release mode for me. – Evk Dec 01 '17 at 11:05
  • @Evk: what platform and .NET version? This is likely a jitter bug and thus heavily dependent on what bits you've got. – Jeroen Mostert Dec 01 '17 at 11:06
  • @Evk on this pc tested on vs2015, 2017 and targeted every .net framework >= 4.5, but i got different behavior on other machine by targeting .net framework >= 4.7 – Ashkan Nourzadeh Dec 01 '17 at 11:07
  • @JeroenMostert I have .NET 4.7 installed (not 4.7.1 or 4.7.2) and tried to target 4.5,4.6,4.7. – Evk Dec 01 '17 at 11:08
  • No repro; I have .NET 4.7 installed on my machine. Both Debug and Release mode output same values (5 times 100). Same like @Evk – Styxxy Dec 01 '17 at 11:09
  • Version targeted isn't usually relevant (they'll all use the same JIT compiler), only the installed framework version is. I have 4.7.1, so this may be a regression. – Jeroen Mostert Dec 01 '17 at 11:11
  • Repro using .NET Core 2. – Styxxy Dec 01 '17 at 11:11
  • Only reproducible for me if targeted as x64 – Alex K. Dec 01 '17 at 11:12
  • Unable to reproduce on net47, AnyCPU or x86 or x64 ConsoleApp VS2017 – nvoigt Dec 01 '17 at 11:13
  • It is not surprising that once the comparison is apples-to-apples, x64 codegen for .Net Framework and .Net Core has similar performance, since (by default) it is essentially the same JIT generating code. It would be interesting to compare the performance of .Net Framework x86 codegen with .Net Core's x86 codegen (which has been using RyuJit since 2.0). There are still cases out there where the older JIT (aka Jit32) knows a few tricks that RyuJit does not. And if you find any such cases, please make sure to open issues for them over on the CoreCLR repo. – Andy Ayers May 07 '18 at 02:32

2 Answers


This appears to be a JIT bug; I've tested with:

// ... existing code unchanged
for (int i = 0; i < N - Old.Length; i++)
{
    // Console.WriteLine(i); // <== comment/uncomment this line
    dd[i] = d;
}

and adding the Console.WriteLine(i) fixes it. The only IL change is:

// ...
L_0040: ldc.i4.0 
L_0041: stloc.3 
L_0042: br.s L_004d
L_0044: ldarg.0 
L_0045: ldind.ref 
L_0046: ldloc.3 
L_0047: ldloc.1 
L_0048: stelem.r8 
L_0049: ldloc.3 
L_004a: ldc.i4.1 
L_004b: add 
L_004c: stloc.3 
L_004d: ldloc.3 
L_004e: ldarg.1 
L_004f: ldloc.0 
L_0050: ldlen 
L_0051: conv.i4 
L_0052: sub 
L_0053: blt.s L_0044
L_0055: ret 

vs

// ...
L_0040: ldc.i4.0 
L_0041: stloc.3 
L_0042: br.s L_0053
L_0044: ldloc.3 
L_0045: call void [System.Console]System.Console::WriteLine(int32)
L_004a: ldarg.0 
L_004b: ldind.ref 
L_004c: ldloc.3 
L_004d: ldloc.1 
L_004e: stelem.r8 
L_004f: ldloc.3 
L_0050: ldc.i4.1 
L_0051: add 
L_0052: stloc.3 
L_0053: ldloc.3 
L_0054: ldarg.1 
L_0055: ldloc.0 
L_0056: ldlen 
L_0057: conv.i4 
L_0058: sub 
L_0059: blt.s L_0044
L_005b: ret 

which looks exactly right (the only difference is the extra ldloc.3 and call void [System.Console]System.Console::WriteLine(int32), and a different but equivalent target for br.s).

It'll need a JIT fix, I suspect.
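A further check, sketched here under the assumption that the optimizing JIT is what changes the behaviour, is to compile only this method without optimizations via `MethodImplOptions.NoOptimization` and compare the Release output with and without the attribute. The class and method names (`JitCheck`, `FillToUnoptimized`) are hypothetical; the body is the code from the question.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class JitCheck
{
    // Ask the JIT not to optimize or inline this one method, even in a Release build.
    [MethodImpl(MethodImplOptions.NoOptimization | MethodImplOptions.NoInlining)]
    public static void FillToUnoptimized(ref double[] dd, int N)
    {
        if (dd.Length >= N)
            return;

        double[] Old = dd;
        double d = double.NaN;
        if (Old.Length > 0)
            d = Old[0];

        dd = new double[N];

        // Move the original elements to the tail of the new array.
        for (int i = 0; i < Old.Length; i++)
            dd[N - Old.Length + i] = Old[i];

        // Fill the leading slots with the first original element.
        for (int i = 0; i < N - Old.Length; i++)
            dd[i] = d;
    }

    public static void Main()
    {
        var ar = new double[] { 100 };
        FillToUnoptimized(ref ar, 5);
        Console.WriteLine(string.Join(",", ar));
    }
}
```

If the output differs only when the attribute is removed, that points at the optimizer rather than at the C# compiler, which matches the unchanged IL above.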

Environment:

  • Environment.Version: 4.0.30319.42000
  • <TargetFramework>netcoreapp2.0</TargetFramework>
  • VS: 15.5.0 Preview 5.0
  • dotnet --version: 2.1.1
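Since the comments suggest the repro depends on the installed runtime and on running as x64, it can help to print those details from the test program itself. This is a minimal sketch; the class name `ReproEnvironment` is hypothetical, and `RuntimeInformation.FrameworkDescription` is assumed to be available (it ships with .NET Core and with .NET Framework 4.7.1 and later).

```csharp
using System;
using System.Runtime.InteropServices;

public static class ReproEnvironment
{
    // Collects the runtime details that the comments identify as relevant to the repro.
    public static string Describe()
    {
        return string.Format(
            "Environment.Version: {0}; 64-bit process: {1}; Framework: {2}",
            Environment.Version,                       // CLR version, e.g. 4.0.30319.42000
            Environment.Is64BitProcess,                // true when running as x64
            RuntimeInformation.FrameworkDescription);  // descriptive runtime name and version
    }

    public static void Main()
    {
        Console.WriteLine(Describe());
    }
}
```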
Marc Gravell

The problem is indeed visible in the generated machine code. This is x64, .NET 4.7.1, Release build.

Disassembly:

            for(int i = 0; i < N - Old.Length; i++)
00007FF942690ADD  xor         eax,eax  
            for(int i = 0; i < N - Old.Length; i++)
00007FF942690ADF  mov         ebx,esi  
00007FF942690AE1  sub         ebx,ebp  
00007FF942690AE3  test        ebx,ebx  
00007FF942690AE5  jle         00007FF942690AFF  
                dd[i] = d;
00007FF942690AE7  mov         rdx,qword ptr [rdi]  
00007FF942690AEA  cmp         eax,dword ptr [rdx+8]  
00007FF942690AED  jae         00007FF942690B11  
00007FF942690AEF  movsxd      rcx,eax  
00007FF942690AF2  vmovsd      qword ptr [rdx+rcx*8+10h],xmm6  
            for(int i = 0; i < N - Old.Length; i++)
00007FF942690AF9  inc         eax  
00007FF942690AFB  cmp         ebx,eax  
00007FF942690AFD  jg          00007FF942690AE7  
00007FF942690AFF  vmovaps     xmm6,xmmword ptr [rsp+20h]  
00007FF942690B06  add         rsp,30h  
00007FF942690B0A  pop         rbx  
00007FF942690B0B  pop         rbp  
00007FF942690B0C  pop         rsi  
00007FF942690B0D  pop         rdi  
00007FF942690B0E  pop         r14  
00007FF942690B10  ret  

The issue is at address 00007FF942690AFD, the jg 00007FF942690AE7. It jumps back to the loop body only while ebx (which contains 4, the loop end value) is greater than eax, which holds i. That test fails once i is 4, so the last element in the array is not written.

It fails because the code increments the register holding i (eax, at 00007FF942690AF9) and then compares it with 4, while it still has to write that value. It's a bit hard to pinpoint exactly where the issue is located. It looks like it might be the result of optimizing (N - Old.Length): the Debug build evaluates that expression in the loop, whereas the Release build precalculates it. So that's for the JIT people to fix ;)
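As a reading aid, the control flow of the listing can be modelled in C# as a guarded do/while: the limit is computed once, tested once up front (`test ebx,ebx` / `jle`), and then re-tested after each increment (`inc eax` / `cmp ebx,eax` / `jg`). The sketch below only records which indices that loop shape visits for a limit of 4; it does not reproduce the bug or say where in the JIT it originates. The names `LoopModel` and `VisitedIndices` are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class LoopModel
{
    // Mirrors the shape of the disassembled loop: guard, body, increment, compare, jump back.
    public static List<int> VisitedIndices(int limit)
    {
        var visited = new List<int>();
        int i = 0;                 // xor eax,eax
        if (limit > 0)             // test ebx,ebx / jle (skip the loop)
        {
            do
            {
                visited.Add(i);    // stands in for the store dd[i] = d
                i++;               // inc eax
            } while (limit > i);   // cmp ebx,eax / jg (back to the body)
        }
        return visited;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", VisitedIndices(4))); // prints 0,1,2,3
    }
}
```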

Frans Bouma
  • One of these days I need to carve out some time to learn assembly / CPU opcodes. Perhaps naively I keep thinking "meh, I can read and write IL - I should be able to grok it" - but I just never get around to it :) – Marc Gravell Dec 06 '17 at 15:35
  • x64/x86 isn't the greatest assembly language to start with, though ;) It has so many opcodes, I once read there's no one alive who knows them all. Not sure if that's true, but it's not that easy to read at first. It does use a few simple conventions, like the [], the destination-before-source order, and what the registers mean (al is the 8-bit part of rax, eax is the 32-bit part of rax, etc.). You can step through it in VS, which should teach you the essentials. I'm sure you'll pick it up quickly as you already know IL opcodes ;) – Frans Bouma Dec 06 '17 at 15:41