Why do the upper 32 bits of a uint64_t become one whilst performing a specific bitwise operation?

Question

Can someone please explain to me why the upper 32 bits of a uint64_t are set to one in case number #2:

uint64_t ret = 0;
ret = (((uint64_t)0x00000000000000FF) << 24);
printf("#1 [%016llX]\n", ret);

ret = (0x00000000000000FF << 24);
printf("#2 [%016llX]\n", ret);

uint32_t ret2 = 0;
ret2 = (((uint32_t)0x000000FF) << 8);
printf("#3 [%08X]\n", ret2);

ret2 = (0x000000FF << 8);
printf("#4 [%08X]\n", ret2);

Output:

#1 [00000000FF000000]
#2 [FFFFFFFFFF000000]
#3 [0000FF00]
#4 [0000FF00]

https://ideone.com/xKUaTe

You'll notice I've given an "equivalent" 32bit version (cases #3 and #4) which doesn't show the same behaviour...

I heard once, somewhere, that casting int64s causes misbehavior in a small percentage of 32-bit scenarios. Unity is (or was?) known for this; it is the reason that Kerbal Space Program can't use more than 3.5 GB of RAM. It is also the reason I quit KSP. — x13, Oct 14 '15 at 14:02
Suffix your numeric constants with u or ul or ull: `0xffull` (constants have a type too, numeric constants have the type *signed* int by default, unless a decimal point is present, etc) — wildplasser, Oct 14 '15 at 14:28
@x13 it's unrelated here. The OP just didn't know about C's rules for integer promotion and the type of integer literals. There's no hardware or compiler bug in the above code — phuclv, Sep 28 '18 at 04:04
It is incorrect to use `llX` for printing `uint64_t`, you must use the *macro* `PRIX64`. Likewise for `uint32_t` — Antti Haapala, Apr 07 '19 at 05:11

phuclv · Answer 1 · 2020-10-21T14:20:18.987

By default integer literals without a suffix will have type int if they fit in an int.

The type of the integer constant

The type of the integer constant is the first type in which the value can fit, from the list of types which depends on which numeric base and which integer-suffix was used.

no suffix, decimal bases: int, long int, unsigned long int (until C99), long long int (since C99)

no suffix, binary, octal, or hexadecimal bases: int, unsigned int, long int, unsigned long int, long long int (since C99), unsigned long long int (since C99)

...

As a result, 0x00000000000000FF is an int regardless of how many leading zeros you put in. You can check that by printing sizeof 0x00000000000000FF.

Therefore, 0x00000000000000FF << 24 is evaluated in int and yields the bit pattern 0xFF000000, which is a negative value¹ in a 32-bit int. Converting that negative int to uint64_t sign-extends it, filling the top 32 bits with ones.

Casting helps, as you can see in ((uint64_t)0x00000000000000FF) << 24, because now the shift operates on a uint64_t value instead of an int. You can also use a suffix:

0x00000000000000FFU << 24
0x00000000000000FFULL << 24

The first line above does the shift in unsigned int and then zero-extends the result when it is converted to uint64_t. The second one does the operation in unsigned long long directly.

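0x000000FF << 8 doesn't show the same behavior because the result is 0xFF00, which doesn't have the sign bit of an int set. It would if you converted the result to a 16-bit signed type first, as in (int16_t)(0x000000FF << 8): on typical implementations that value is negative, so it is sign-extended when converted to a wider type.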


¹ Technically, shifting into the sign bit results in undefined behavior, but in your case the compiler has chosen to produce the same bits as when you shift an unsigned value: 0xFFU << 24 = 0xFF000000U, which produces a negative value when converted to signed.


@AnttiHaapala I knew that but I wanted to keep it short. I've added the note now — phuclv, Apr 07 '19 at 07:36
