I was looking at the assembly of a code snippet I wrote and noticed that the indicated movsxd op only appears if the ret variable is 32-bit. If ret is 64-bit, it is used directly: mov BYTE PTR [rdi+rbp+86], al.

; 861  :        _BitScanForward(&depth, subject);

    movsx   edx, dx

; 862  :        qry_args->lo_refs[++ret] = (BYTE)depth;

    inc     ebp                         ; ret is in ebp
    bsf     ecx, edx
    movsxd  rax, ebp                    ; sign-extend 32-bit ebp into 64-bit rax

; 865  :        subject ^= (1 << depth);
; 866  :        nulls_mask.lo |= (1 << depth);

    movsx   r9d, r9w
    btc     edx, ecx
    bts     r9d, ecx
    mov     BYTE PTR [rax+rbx+86], cl   ; 64-bit rax used by mov

Since the mov op addresses memory through 64-bit registers in 64-bit mode, it makes sense to me that any variables used to reference memory (such as array referencers) should ideally be 64-bit.

However, I know it's common to simply use int in a loop that is not going to exceed 2^31 iterations. Should we in fact be using long long (int64) as a best practice for 64-bit code? Any comments on this?

I haven't gone to the trouble of testing this beyond what is shown here.

P.S. This isn't a micro-optimization question. It's a question of form. To me, it makes sense to use the type the compiler ends up using anyway.

Info: I'm compiling with VS 2016 with max optimizations on.

IamIC
  • "_variables used to reference memory (such as array referencers)_" are by your writing pointers and so must by definition be 64 bit in a 64-bit memory organization. `int` is the native size of machine registers. – Paul Ogilvie Oct 03 '16 at 11:23
  • `int` is 4 bytes wide. https://msdn.microsoft.com/en-us/library/9c3yd98k.aspx https://www.google.com.tw/webhp#q=size%20of%20int%20in%20c – IamIC Oct 03 '16 at 11:32

2 Answers

Use the size_t type for array indexes. It is an unsigned type that is guaranteed to be able to represent the size of any object, and therefore any valid array index. It is typically 64 bits wide on 64-bit platforms and 32 bits wide on 32-bit platforms.

See https://stackoverflow.com/a/2550799/909655
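As a rough illustration of the advice above, here is a minimal sketch of a bit-scanning loop whose array index is a size_t. The function name `collect_set_bits` and its parameters are hypothetical and are not taken from the question's code; a portable loop stands in for the `_BitScanForward` intrinsic.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the position of every set bit in `mask`
 * into `out` (lowest bit first) and returns how many were written.
 * At most `capacity` positions are stored. */
size_t collect_set_bits(uint16_t mask, uint8_t *out, size_t capacity)
{
    size_t count = 0;   /* index is already pointer-width: no widening needed */

    for (unsigned bit = 0; bit < 16 && count < capacity; ++bit) {
        if (mask & (1u << bit)) {
            out[count++] = (uint8_t)bit;
        }
    }
    return count;
}
```

Because `count` has the same width as a pointer on typical 64-bit targets, the compiler can use it directly in the address calculation. With a 32-bit signed index it may have to emit a sign extension such as the movsxd shown in the question, although optimizers can often remove it.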

Mats
  • As a point of interest, correcting the code to use size_t reduced the number of ASM ops by slightly over 1%. That's quite telling. – IamIC Oct 03 '16 at 13:30
It is generally not a good idea to use long long in your case. The next developer to read your code will think that either the code needs to handle large numbers or that the original programmer didn't know what he was doing.

Better to use either size_t, indicating that the variable should be able to handle any array size, or int, indicating that it is a general purpose variable with normal range requirements.

What integer type should I choose?

int is used for normal integer variables. This is the type you should use unless there is a reason to choose another type. Its size has been chosen by the platform developer because it is a good size to use (for whatever reason, but typically as a good tradeoff between range, memory consumption and performance on that platform).

char is used for strings and binary data. If you plan to use bitwise operators (especially shift operators) on the data, you should use unsigned char.

size_t is used for array/memory sizes, array indexes, etc.

Other int sizes (short, long, long long, fixed size) are used as needed. Fixed size is typically used for data that is interchanged between different systems. long/short are typically used when the return value of a standard function is of the respective size. long long is used when you need to store large numbers, but for really big integers you need a BigInt library.
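The guidelines above could be sketched roughly as follows. All names here (`record_header`, `count_high_bit_bytes`, `clamp_percent`) are made up for illustration and do not come from the question.

```c
#include <stddef.h>
#include <stdint.h>

/* Fixed-size types: a hypothetical record layout exchanged between systems,
 * so each field's width must not depend on the platform. */
struct record_header {
    uint32_t payload_len;
    uint16_t flags;
};

/* unsigned char for binary data that is shifted; size_t for the length
 * and the index. Counts the bytes whose top bit is set. */
size_t count_high_bit_bytes(const unsigned char *data, size_t len)
{
    size_t hits = 0;
    for (size_t i = 0; i < len; ++i) {
        if (data[i] >> 7) {   /* well-defined: the operand is never negative */
            ++hits;
        }
    }
    return hits;
}

/* int for an ordinary value with a small, well-known range. */
int clamp_percent(int value)
{
    if (value < 0)   return 0;
    if (value > 100) return 100;
    return value;
}
```

The point of the sketch is that each type documents intent: a reader seeing size_t expects a size or index, and a reader seeing int expects an ordinary number.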

Klas Lindbäck
  • As a point of interest, correcting the code to use size_t reduced the number of ASM ops by slightly over 1%. That's quite telling. – IamIC Oct 03 '16 at 13:30
  • @IamIC I usually optimize for code readability first, not performance. Performance optimization is usually only needed if you deal with performance-critical code. https://en.wikipedia.org/wiki/Program_optimization#When_to_optimize – Klas Lindbäck Oct 03 '16 at 14:05
  • I wanted to write code that follows best practices. Doing so (i.e. using `size_t`) just happened to speed things up slightly. Having said that, this code is extremely performance critical, so I'm not complaining. – IamIC Oct 03 '16 at 15:58