Imagine the following situation:

An array of 4 elements of type uint8_t is given; together they represent a 32-bit integer, byte by byte. The goal is to access the whole array as a 32-bit integer.

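#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

int main( void )
{
    uint8_t array[4] = { 0, 0, 0, 12 };
    uint32_t * ptr = ( uint32_t * )array;  /* the cast in question */
    printf("%" PRIu32, *ptr);              /* PRIu32 matches uint32_t; "%d" does not */
    return 0;
}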

Forget about endianness for now; I do not think it is relevant to the question.

Now, the C standard says that converting a pointer to a pointer type with stricter alignment is undefined behavior if the resulting pointer is not correctly aligned for the referenced type (C11 6.3.2.3p7).

Some examples of compliant and non-compliant code are given here: https://wiki.sei.cmu.edu/confluence/display/c/EXP36-C.+Do+not+cast+pointers+into+more+strictly+aligned+pointer+types

The code above compiles and gives the expected result on the latest GCC and IAR compilers I've tried.

The question is: is this code safe in general?

Consider an architecture where integer types are self-aligned. My reasoning is that, since an array inherits the alignment requirement of its element type (here a character-sized type, with the weakest alignment), it can be placed anywhere in memory. For example:

Memory | Value | integer can start here
....           
0x20   |       | yes
0x21   | 0     | no   <- array begins; uint32_t * ptr points here
0x22   | 0     | no
0x23   | 0     | no
0x24   | 12    | yes
....

In this scenario if we dereference a uint32_t pointer we may potentially crash on some architectures.
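The scenario in the table can be checked at run time. The following is a minimal sketch, not part of the original question: `is_aligned_for_u32` and `demo_buf` are hypothetical names, and the check assumes that `uintptr_t` exists and that converting a pointer to it yields a plain address, which is implementation-defined.

```c
#include <stdalign.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if p is suitably aligned for a uint32_t.
   Assumes the pointer-to-uintptr_t conversion yields a flat address
   (implementation-defined, but true on mainstream platforms). */
bool is_aligned_for_u32(const void *p)
{
    return (uintptr_t)p % alignof(uint32_t) == 0;
}

/* Hypothetical demo buffer, explicitly given uint32_t alignment (C11). */
alignas(uint32_t) uint8_t demo_buf[8];
```

Here `is_aligned_for_u32(demo_buf)` holds by construction, while `demo_buf + 1` fails the check on any platform where `alignof(uint32_t)` is greater than 1. Passing the check only addresses alignment; it does not make the cast valid under the aliasing rules.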

Did I miss something here? Obviously this code works on major compilers, and I would imagine that the cases where it fails are very specific and mostly related to legacy architectures and compilers. Nevertheless, is such code safe and portable according to the C standard?

If I made incorrect assumptions or interpreted something incorrectly, please let me know.

keyermoond
  • This code also violates the strict aliasing rule. If the compiler attempts to optimize code that violates strict aliasing, the code may not work as expected. – user3386109 Oct 04 '18 at 17:05
  • Another note is that your assumption that `uint8_t` is identical to `unsigned char` is not portable. That's a property of your system as well. – Frank Oct 04 '18 at 17:06
  • This is undefined behavior and does not necessarily work on gcc. It is known to create bugs for code like this, when optimization is enabled. Embedded systems compilers generally don't optimize based on strict aliasing, so IAR will work even though there's no guarantee by the C standard. – Lundin Oct 04 '18 at 17:09
  • @chux you are right, fixed, thank you for pointing this out – keyermoond Oct 04 '18 at 17:25
  • To be clear, this code has 2 problems: 1) violates the strict aliasing rule 2) Alignment not necessarily correct. Both are UB. A larger question is why even attempt this vs. other better ways to achieve "goal is to address the whole array as a 32 bit integer."? That should have been the title question. Research `union`. – chux - Reinstate Monica Oct 04 '18 at 17:35
  • @Frank `uint8_t` *is* identical in representation to `unsigned char`. – Antti Haapala Oct 04 '18 at 17:36
  • @Antti Haapala only if `CHAR_BIT == 8` – Frank Oct 04 '18 at 17:37
  • @Frank and if `CHAR_BIT != 8` the code does not compile because there is no `uint8_t`... – Antti Haapala Oct 04 '18 at 17:38
  • @chux this question is a result of an argument I had with co-workers, the goal was to analyze this particular piece of code – keyermoond Oct 04 '18 at 17:42
  • In that case, it may be worthwhile to point out that the code is also not portable due to endian issues. On a big endian system, the code would print 12. On a little endian system, the code would print 201326592. (That's assuming that we're ignoring the alignment and strict aliasing problems.) – user3386109 Oct 04 '18 at 17:53
  • @user3386109 *If the compiler attempts to optimize code that violates strict aliasing, the code may not work as expected* It's worse than that, because in general, no optimization is required for the posted code to fail. – Andrew Henle Oct 04 '18 at 18:54
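
The endianness point raised in the comments can be sidestepped by assembling the value from the bytes with shifts, which fixes the byte order in the source code instead of inheriting it from the machine. A minimal sketch, with `load_u32_be` and `load_u32_le` as hypothetical helper names that do not appear in the question:

```c
#include <stdint.h>

/* Hypothetical helper: interpret 4 bytes as a big-endian 32-bit value
   (b[0] is the most significant byte). */
uint32_t load_u32_be(const uint8_t b[4])
{
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical helper: interpret 4 bytes as a little-endian 32-bit value
   (b[0] is the least significant byte). */
uint32_t load_u32_le(const uint8_t b[4])
{
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

For the array `{ 0, 0, 0, 12 }` from the question, `load_u32_be` yields 12 and `load_u32_le` yields 201326592, matching the two values mentioned in the comment above. Because only individual `uint8_t` elements are read, neither alignment nor strict aliasing is an issue.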

1 Answer

> In this scenario if we dereference a uint32_t pointer we may potentially crash on some architectures.

This is correct. No, this code is not safe and it is not portable for exactly the reasons you described.
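
One commonly used alternative that avoids both the alignment and the aliasing problem is to copy the bytes into a real `uint32_t` object with `memcpy`. The following is a sketch under that assumption, not something stated in this answer; `load_u32_native` is a hypothetical name, and the result still depends on the machine's byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 4 bytes into a uint32_t object.
   memcpy has no alignment requirement on its source, and the value is
   read back through an object whose declared type is uint32_t. */
uint32_t load_u32_native(const uint8_t bytes[4])
{
    uint32_t value;
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

This works even when `bytes` points to an odd address. Compilers typically turn the fixed-size `memcpy` into a single load where the target allows it, though that is an optimization and not a guarantee.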

melpomene
  • An array will not likely be allocated at a misaligned address. It's just when you do `(uint32_t*)&arr[i]` that alignment is a problem. The main issue here is the strict aliasing rule. – Lundin Oct 04 '18 at 17:07
  • @Lundin "likely" is not good enough when portable code is the goal. – Frank Oct 04 '18 at 17:11
  • @Frank There is no incentive for the compiler to allocate a character array misaligned, so why would it? – Lundin Oct 04 '18 at 17:13
  • @Lundin, sure there is. Packed structs are a trivial example of that. Architectures with very shallow stacks may want to do that as well. There are plenty of reasons. – Frank Oct 04 '18 at 17:14
  • @Frank I'm speaking of the code in the question, which does not mention structs. Structs are different since they are guaranteed to be allocated at an aligned address, although their members may not be aligned. Architectures with shallow stacks (i.e. 8-bitters) don't even have alignment. – Lundin Oct 04 '18 at 17:20
  • @Lundin [Here's a counterexample](https://tio.run/##S9ZNT07@/185My85pzQlVcGmuCQlM18vw46LKzOvRCE3MTNPoyw/M0VToZpLAQiSMxKLFJIVbBVMjKzBAqV5xZnpeakpEJnEoqLEyuhYoHy1goEOFBkaKdRCFBcUAQ1N01BSLYjJU9JRAJusoKUJ1qVpzVX7/z8A). The array is at `0x7ffdbbcdf44b`, which is an odd address. – melpomene Oct 04 '18 at 17:33
  • @melpomene Compile with optimizations enabled. – Lundin Oct 05 '18 at 06:27
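
Whether a bare character array happens to land on an aligned address stops mattering if the bytes are stored in a union that also has a `uint32_t` member, along the lines of the `union` hint in the comments on the question. A minimal sketch, with `u32_bytes` and `word_from_bytes` as hypothetical names; reading a union member other than the one last written is permitted in C (the bytes are reinterpreted), but not in C++.

```c
#include <stdint.h>

/* Hypothetical type: the union is aligned at least as strictly as its
   uint32_t member, so the byte array inside it is too. */
typedef union {
    uint8_t  bytes[4];
    uint32_t word;
} u32_bytes;

/* Hypothetical helper: store 4 bytes, then read them back as one word.
   The result depends on the machine's byte order. */
uint32_t word_from_bytes(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    u32_bytes u = { .bytes = { b0, b1, b2, b3 } };
    return u.word;
}
```

With the question's bytes, `word_from_bytes(0, 0, 0, 12)` gives 12 on a big-endian machine and 201326592 on a little-endian one.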