5

I'm profiling some C# code. The method below is one of the most expensive ones. For the purpose of this question, assume that micro-optimization is the right thing to do. Is there an approach to improve performance of this method?

Changing the input parameter p to ulong[] would create a macro inefficiency.

static ulong Fetch64(byte[] p, int ofs = 0)
{
    unchecked
    {
        ulong result = p[0 + ofs] + 
            ((ulong) p[1 + ofs] <<  8) + 
            ((ulong) p[2 + ofs] << 16) + 
            ((ulong) p[3 + ofs] << 24) + 
            ((ulong) p[4 + ofs] << 32) + 
            ((ulong) p[5 + ofs] << 40) + 
            ((ulong) p[6 + ofs] << 48) + 
            ((ulong) p[7 + ofs] << 56);
        return result;
    }
}
casperOne
Eric J.
  • Looks like BitConverter.ToInt64 - http://msdn.microsoft.com/en-us/library/system.bitconverter.toint64.aspx ? – Alexei Levenkov Nov 12 '11 at 01:43
  • Read and shift a few bytes - is that honestly expensive? I'm sure you're calling it lots but I'd be surprised if the compiler could go much wrong with that – Rup Nov 12 '11 at 01:44
  • @Alexei ToUInt64, but yeah. If you meant use that instead, post that as an answer? (Or does Eric want to optimise BitConverter even?) – Rup Nov 12 '11 at 01:45
  • @Rup: BitConverter was *much* faster than my code. – Eric J. Jan 16 '13 at 15:30

4 Answers

5

Why not use BitConverter? I've got to believe Microsoft has spent some time tuning that code. Plus it deals with endian issues.

Here's how BitConverter turns a byte[] into a long/ulong (ToUInt64 calls ToInt64 and casts the signed result to unsigned):

[SecuritySafeCritical]
public static unsafe long ToInt64(byte[] value, int startIndex)
{
  if (value == null)
  {
    ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
  }
  if (((uint) startIndex) >= value.Length)
  {
    ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startIndex, ExceptionResource.ArgumentOutOfRange_Index);
  }
  if (startIndex > (value.Length - 8))
  {
    ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_ArrayPlusOffTooSmall);
  }
  fixed (byte* numRef = &(value[startIndex]))
  {
    if ((startIndex % 8) == 0)
    {
      return *(((long*) numRef));
    }
    if (IsLittleEndian)
    {
      int num  = ((numRef[0] | (numRef[1] << 8)) | (numRef[2] << 0x10)) | (numRef[3] << 0x18);
      int num2 = ((numRef[4] | (numRef[5] << 8)) | (numRef[6] << 0x10)) | (numRef[7] << 0x18);
      return (((long) ((uint) num)) | (((long) num2) << 0x20));
    }
    int num3 = (((numRef[0] << 0x18) | (numRef[1] << 0x10)) | (numRef[2] << 8)) | numRef[3];
    int num4 = (((numRef[4] << 0x18) | (numRef[5] << 0x10)) | (numRef[6] << 8)) | numRef[7];
    return (((long) ((uint) num4)) | (((long) num3) << 0x20));
  }
}

I suspect that doing the conversion one 32-bit word at a time is for 32-bit efficiency. With no 64-bit registers on a 32-bit CPU, dealing with 64-bit ints is a lot more expensive.

If you know for sure you're targeting 64-bit hardware, it might be faster to do the conversion in one fell swoop.
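
As a rough illustration of the suggestion above, the question's Fetch64 could simply delegate to the BCL. This is a minimal sketch, not a measured result: the class name BclFetch is made up for the example, and it assumes little-endian hardware, because BitConverter reads in the machine's native byte order while the original shift-based code always assembles the bytes as little-endian.

```csharp
using System;

static class BclFetch
{
    // Hypothetical drop-in replacement for the question's Fetch64.
    // BitConverter.ToUInt64 validates its arguments and then reads
    // 8 bytes starting at ofs in the machine's native byte order.
    public static ulong Fetch64(byte[] p, int ofs = 0)
    {
        return BitConverter.ToUInt64(p, ofs);
    }
}
```

On a big-endian machine this would return a different value from the hand-written version, so a check of BitConverter.IsLittleEndian is worth keeping nearby if the byte order matters (as it does for a hash function).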

Nicholas Carey
  • It's also worth noting that this is using unsafe code to improve performance; seems like the BCL method is the best bet. – Dan Bryant Nov 12 '11 at 03:04
  • D'oh! Used BitConverter in a different context today and didn't think of that. BTW, this optimization improved the overall performance of my C# port of CityHash by 30% (making it now 28% faster than the C++ version I ported from). – Eric J. Nov 12 '11 at 03:46
2

Try using a for loop instead of unrolling it by hand. You may be able to save time on bounds checks.

Or try BitConverter.ToUInt64 - http://msdn.microsoft.com/en-us/library/system.bitconverter.touint64.aspx if that is what you are looking for.
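
A loop-based version of the question's method might look like the sketch below. The names LoopFetch and Fetch64Loop are invented for the example. Whether the JIT actually removes any bounds checks for the p[ofs + i] access pattern is an assumption that should be measured rather than taken for granted.

```csharp
static class LoopFetch
{
    // Hypothetical loop form of the question's Fetch64.
    // Bytes are consumed from the most significant (ofs + 7) down to
    // the least significant (ofs), so each step shifts the partial
    // result left by 8 bits and ORs in the next byte.
    public static ulong Fetch64Loop(byte[] p, int ofs = 0)
    {
        ulong result = 0;
        for (int i = 7; i >= 0; i--)
        {
            result = (result << 8) | p[ofs + i];
        }
        return result;
    }
}
```

Like the original, this always interprets the bytes as little-endian, regardless of the hardware it runs on.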

Alexei Levenkov
1

For reference, Microsoft's .NET 4.0 BitConverter.ToInt64 (Shared Source Initiative at http://referencesource.microsoft.com/netframework.aspx):

    // Converts an array of bytes into a long.
    [System.Security.SecuritySafeCritical]  // auto-generated 
    public static unsafe long ToInt64 (byte[] value, int startIndex) {
        if( value == null)  {
            ThrowHelper.ThrowArgumentNullException(ExceptionArgument.value);
        } 

        if ((uint) startIndex >= value.Length) { 
            ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.startIndex, ExceptionResource.ArgumentOutOfRange_Index); 
        }

        if (startIndex > value.Length -8) {
            ThrowHelper.ThrowArgumentException(ExceptionResource.Arg_ArrayPlusOffTooSmall);
        }

        fixed( byte * pbyte = &value[startIndex]) {
            if( startIndex % 8 == 0) { // data is aligned 
                return *((long *) pbyte); 
            }
            else { 
                if( IsLittleEndian) {
                    int i1 = (*pbyte) | (*(pbyte + 1) << 8)  | (*(pbyte + 2) << 16) | (*(pbyte + 3) << 24);
                    int i2  = (*(pbyte+4)) | (*(pbyte + 5) << 8)  | (*(pbyte + 6) << 16) | (*(pbyte + 7) << 24);
                    return (uint)i1 | ((long)i2 << 32); 
                }
                else { 
                    int i1 = (*pbyte << 24) | (*(pbyte + 1) << 16)  | (*(pbyte + 2) << 8) | (*(pbyte + 3)); 
                    int i2  = (*(pbyte+4) << 24) | (*(pbyte + 5) << 16)  | (*(pbyte + 6) << 8) | (*(pbyte + 7));
                    return (uint)i2 | ((long)i1 << 32); 
                }
            }
        }
    } 
AMissico
1

Why not go unsafe?

unsafe static ulong Fetch64(byte[] p, int ofs = 0)
{
  fixed (byte* bp = p)
  {
    return *((ulong*)(bp + ofs));
  }
}
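
One caveat with the snippet above is that it performs no argument validation, so a bad offset reads memory outside the array. A guarded variant could look like the following sketch. The names UnsafeFetch and Fetch64Checked are made up here, the code must be compiled with unsafe code enabled, and it assumes a little-endian platform that tolerates unaligned 8-byte reads.

```csharp
using System;

static class UnsafeFetch
{
    // Hypothetical bounds-checked form of the unsafe Fetch64.
    public static unsafe ulong Fetch64Checked(byte[] p, int ofs = 0)
    {
        if (p == null)
        {
            throw new ArgumentNullException(nameof(p));
        }
        // Reject offsets that would read before the start or past the
        // end of the array; this also covers arrays shorter than 8 bytes.
        if (ofs < 0 || ofs > p.Length - sizeof(ulong))
        {
            throw new ArgumentOutOfRangeException(nameof(ofs));
        }
        fixed (byte* bp = p)
        {
            // Single 8-byte read in the machine's native byte order.
            return *((ulong*)(bp + ofs));
        }
    }
}
```

The two checks cost a little time, so in a hot loop where the caller already guarantees a valid range, the unchecked version may still be the faster choice.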
leppie