Load of misaligned address and UBsan finding

Question

This question is not about the definition of unaligned data accesses, but why memcpy silences the UBsan findings whereas type casting does not, despite generating the same assembly code.

I have some example code to parse a protocol that sends a byte array segmented into groups of six bytes.

void f(u8 *ba) {
    // I know this array's length is a multiple of 6
    u8 *p = ba;
    u32 a = *(u32 *)p;
    printf("a = %d\n", a);
    p += 4;
    u16 b = *(u16 *)p;
    printf("b = %d\n", b);

    p += 2;
    a = *(u32 *)p;
    printf("a = %d\n", a);
    p += 4;
    b = *(u16 *)p;
    printf("b = %d\n", b);
}

After incrementing my pointer by 6 and doing another 32-bit read, the UBSan reports an error about a misaligned load. I suppress this error using memcpy instead of type-punning, but I don't have a good understanding why. To be clear, here is the same routine without UBSan errors,

void f(u8 *ba) {
    // I know this array's length is a multiple of 6 (
    u8 *p = ba;
    u32 a;
    memcpy(&a, p, 4);
    printf("a = %d\n", a);
    p += 4;
    memcpy(&b, p, 2);
    printf("b = %d\n", b);

    p += 2;
    memcpy(&a, p, 4);
    printf("a = %d\n", a);
    p += 4;
    memcpy(&b, p, 2);
    printf("b = %d\n", b);
}

Both routines compile to identical assembly code (using movl for the 32-bit read and movzwl for the 16-bit read), so why is one undefined behaviour when the other is not? Does memcpy have some special properties that guarantee something?

I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.

Possible duplicate of [What is data alignment? Why and when should I be worried when typecasting pointers in C?](https://stackoverflow.com/questions/38875369/what-is-data-alignment-why-and-when-should-i-be-worried-when-typecasting-pointe) — melpomene, Dec 03 '17 at 15:40
`memcpy` is correct and the cast-dereference version is undefined behaviour. UB means *anything* can happen, which includes generating the same assembly as correct code — M.M, Dec 03 '17 at 21:11
... I'm not sure your second code snippet would compile because `b` is not declared before using. — Leedehai, Feb 28 '19 at 21:48
[C undefined behavior. Strict aliasing rule, or incorrect alignment?](https://stackoverflow.com/q/46790550/608639), [Is a misaligned load due to a cast undefined behavior?](https://stackoverflow.com/q/28893303/608639), [What is a misaligned pointer?](https://stackoverflow.com/q/20183094/608639), etc. — jww, Jun 07 '19 at 18:13

Antti Haapala · Accepted Answer · 2019-08-25T18:17:41.910

14

UB sanitizer is used to detect that the code is not strictly-conforming and depends, in fact, on undefined behaviour that is not guaranteed.

Actually the C standard says that the behaviour is undefined as soon as you cast a pointer to a type for which the address is not suitably aligned. C11 (draft, n1570) 6.3.2.3p7:

A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned 68) for the referenced type, the behavior is undefined.

I.e.

u8 *p = ba;
u32 *a = (u32 *)p; // undefined behaviour if misaligned. No dereference required

The presence of this cast allows a compiler to presume that ba was aligned to 4-byte boundary (on a platform where u32 is required to be thus aligned, which many compilers will do on x86), after which it can generate code that assumes the alignment.

Even on x86 platform, there are instructions that fail spectacularly: innocent-looking code can be compiled into machine code that will cause an abort at runtime. UBSan is supposed to catch this in code that would otherwise look sane and behave "as expected" when you run it, but then fail if compiled with another set of options or different optimization level.

The compiler can generate the correct code for memcpy - and often will, but it is just because the compiler will know that the unaligned access would work and perform well enough on the target platform.

Lastly:

I don't want to use memcpy here because I can't rely on compilers doing a good enough job optimising it.

What you're saying here is: "I want my code to work reliably only whenever compiled by garbage or two-decades-old compilers that generate slow code. Definitely not when compiled with the ones that could optimize it to run fast."

edited Aug 25 '19 at 18:17

answered Dec 03 '17 at 19:26

Antti Haapala

117,318
21
243
279

Thanks, with this explanation and @Dietrich Epp's commentary I see my misunderstanding now. – Charles Dec 03 '17 at 21:58
"C standard says that the behaviour is undefined" - only on architectures which don't support unaligned accesses. On other it may cause large or [small](https://lemire.me/blog/2012/05/31/data-alignment-for-speed-myth-or-reality/) (x86) overhead or no overhead at all (e.g. on DSPs). – yugr Dec 04 '17 at 09:50
2

@yugr thanks for reading my answer. Oh wait. **The behaviour is undefined, regardless of the platform, when it comes to standard**. Of course an implementation may define behaviour where it is undefined. When did you check the manuals of your compiler?! No, GCC, Clang, MSVC, none of these do define it. – Antti Haapala Dec 04 '17 at 09:56
If I'm reading `6.3.2.2-7` right standard allows UB only if resulting type is not aligned correctly but does not mandate that alignment of int must be strictly larger than that of char. As for the manuals, at least GCC on ARM has official `-munaligned-access` flag (which makes a lot of sense because embedded algorithms often need to operate on unaligned data). – yugr Dec 04 '17 at 12:12
As a side note, it always help to cite relevant parts of the standard to prevent further questions. – yugr Dec 04 '17 at 12:14
1

@yugr fixed :D thanks, cited. Even with `-munaligned-access`, the result is that the access is not undefined, but it doesn't change the alignment requirements. It defines a behaviour beyond that required by the standard. My linked-to answer that I am not going to repeat here shows that a crash can occur on x86, just by addressing unaligned shorts in a loop. I didn't even see unaligned access option available for x86 platform on my GCC. – Antti Haapala Dec 04 '17 at 12:54
@yugr It is important to consider the definition of "undefined behaviour" (or as it is written in the standard, with no 'u'), which is found at [C11/3.4.3p1](http://www.iso-9899.info/n1570.html#3.4.3p1): *"behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements"*... That is to say, **the behaviour is undefined because the standard doesn't define the behaviour**. Whether or not an individual implementation *defines* the behaviour is irrelevant, here. We deal in writing *portable* code with the [c] tag – autistic Jun 14 '18 at 11:53
@user1989425 Implementation-defined behavior isn't specified by the Standard either but is perfectly legal. In particular Standard does not define "correct alignment" for `char` and `int` and they are subject to funny compiler options like `-mno-strict-align`. So UB happens only on platforms where `int`s can not be aligned at 1 byte (which x86 is not). – yugr Jun 15 '18 at 08:02
@yugr while it is true that the alignment of `int` need not be strictly different than `char`, could you please be so kind and disclose the option which does cause `printf("%zu", _Alignof(int));` to print 1 on *your* implementation. – Antti Haapala Jun 15 '18 at 09:35

autistic · Answer 2 · 2017-12-03T16:09:29.637

2

The original type of your object would best be u32, an array of u32... Otherwise, you're handling this sensibly by using memcpy. This isn't likely to be a significant bottleneck on modern systems; I wouldn't worry about that.

On some platforms, an integer can't exist at every possible address. Consider the maximum address for your system, we could just postulate upon 0xFFFFFFFFFFFFFFFF. A four-byte integer couldn't possibly exist here, right?

Sometimes optimisations are performed at the hardware to align the bus (the series of wires leading from the CPU to various peripherals, memory and what-not) based on this, and one of those is to assume addresses for various types only occur in multiples of their sizes, for example. A misaligned access on such a platform is likely to cause a trap (segfault).

Hence, UBSan is correctly warning you about this non-portable and difficult to debug problem.

Not only does this issue cause some systems to fail to work entirely, but you'll find your system which permits you to access out of alignment requires a second fetch across the bus to retrieve the second portion of the integer, anyway.

There are a few other problems in this code.

printf("a = %d\n", a);

If you wish to print an int, you should use %d. However, your argument is a u32.Don't mismatch your arguments like this; that's also undefined behaviour. I don't know for certain how u32 is defined for you, but I guess the closest standard-compliant feature is probably uint32_t (from <stdint.h>). You should use "%"PRIu32 as your format string in any place you want to print a uint32_t. The PRIu32 (from <inttypes.h>) symbol provides an implementation-defined sequence of characters that will be recognised by the implementations printf function.

Note that this problem is repeated elsewhere, where you're using the u16 type instead:

printf("b = %d\n", b);

"%"PRIu16 will probably suffice there.

edited Dec 03 '17 at 16:09

answered Dec 03 '17 at 15:50

autistic

14,356
2
33
74

Thanks for the comment. I don't understand why it would be better for this to be a u32 array (rather than u8), the function is parsing some structured data so it's nice to use pointer arithmetic to step over varying numbers of records at the byte level, but maybe I'm running into a trap. I understand what technically an alignment fault is, but I don't know why the memcpy version avoids said problems when it produces identical assembly code, and the prints are sloppy just because it's throw-away code :) – Charles Dec 03 '17 at 16:16
2

@dune.rocks: The `memcpy` doesn't result in undefined behavior because the C standard says that it's the same as copying a series of `u8`. The fact that the resulting assembly is the same or different is completely irrelevant for figuring out if behavior is undefined... for that, you have to look at the C standard. – Dietrich Epp Dec 03 '17 at 19:21
@DietrichEpp: If an implementation specifies that converting a misaligned address to `uint32_t*` may fail in arbitrary fashion, then the implementation may assume that a `memcpy` operand of type `uint32_t*` will hold a word-aligned address. If, however, an implementation specifies that a round-trip conversion between any two pointer types will yield a pointer equivalent to the original, and that a direct conversion of any pointer type to `void*` will yield the same result as converting it to some other type and then to `void*`, that would in turn define the behavior of code which... – supercat Dec 03 '17 at 19:37
...casts a misaligned pointer to `uint32_t*` and then passes that to `memcpy`. I would suggest that an implementation which is going to assume that a `uint32_t*` which is passed to `memcpy` will be aligned should explicitly document that pointers which are not aligned for their respective types may behave as trap representations. That would justify the generally-useful `memcpy` behavior. Unfortunately, many compiler writers specify many things but fail to specify which natural inferences do or do not follow. – supercat Dec 03 '17 at 19:39
1

@supercat: "If, ... a round-trip conversion between any two pointer types will yield a pointer equivalent to the original," is the part that is false here, see e.g. n1548 6.3.2.3 paragraph 7 which states that conversions are undefined if the resulting pointer would not be correctly aligned, and only otherwise the result shall be equal to the original pointer. The issue of passing `uint32_t *` to `memcpy()` is moot because the `uint32_t *` is correctly aligned in the first place. – Dietrich Epp Dec 03 '17 at 19:50
@supercat if an implementation specifies that 5 divided by 0 equals 42 then so be it. Unless any particular implementation is specified in the question, we kind of assume the question is about the C standard here... – Antti Haapala Dec 03 '17 at 21:02
@dune.rocks I can see you managed to get a better understanding from the other answers, however there is *other* undefined behaviour here I wonder if you've understood yet. Can you tell me what type you should pass corresponding to a `%d` format directive, just to confirm that you understand that (latter) part of my answer which the other answer doesn't cover? – autistic Dec 04 '17 at 00:39
@DietrichEpp: Implementations are free to define or not define the behavior of an unaligned `uint8_t*` to `uint32_t*` as they see fit. Some do, some don't. On *those implementations that choose to define a means by which a misaligned `uint32_t*` pointer can be brought into existence*, such an address should behave like any other address if converted to `void*` and passed to `memcpy`. – supercat Dec 04 '17 at 02:44
@supercat: Implementations are indeed free to disregard the standard, and they do, but the standard is quite clear that this is undefined behavior, and that's exactly what `ubsan` is designed to detect with its diagnostics. – Dietrich Epp Dec 04 '17 at 14:20
@DietrichEpp: An implementation that offers stronger guarantees than what the Standard requires is hardly "disregarding the Standard". Likewise, the only thing the Standard says about most programs that rely upon behaviors not mandated by the Standard is that they are not Strictly Conforming. Further, quality tools to detect portability or reliability problems should in many cases squawk at constructs that don't involve UB, but have *different* defined behavior on different platforms. Given `uint32_t x=1,y=2, z=(x-y) > 5;`, a conforming implementation must set `z` to 0 on some... – supercat Dec 04 '17 at 16:51
...platforms at 1 on others, but a platform which has type `uint32_t` must do one or the other without side-effects. A portability-checking implementation, however, should probably squawk in that case *even though such behavior would be non-conforming*. – supercat Dec 04 '17 at 16:52
@supercat: "Implementation-defined behavior" is the term for what you are describing, and it is distinct from "undefined behavior", which is what ubsan is designed to detect. I'm not sure how a conforming implementation could set `z` to anything other than 1, could you explain how that happens? – Dietrich Epp Dec 04 '17 at 18:16
@DietrichEpp: From the point of view of the Standard, the only difference between Implementation-Defined behavior and Undefined Behavior is that conforming implementations would be required to define the former even when it would be impractical to do so, while implementations are not required to define behaviors in the latter case. The lack of a requirement does not imply any judgment that quality implementations shouldn't define behaviors when practical; reading the published Rationale, it's clear the authors expected that implementations for common platforms would define behaviors... – supercat Dec 04 '17 at 18:46
...even in cases where those targeting less-common platforms might not. Read, e.g. the rationale for saying that unsigned types smaller than "int" should promote to signed "int". Would their reasoning make sense if they didn't expect that most implementations, including the vast majority of future ones, would process `uint1=ushort1*ushort2;` identically to `uint1=(unsigned)ushort1*ushort2;` even in cases where the product would exceed `INT_MAX`? – supercat Dec 04 '17 at 18:50
@supercat: I see how the example works, and yet ubsan is designed to detect undefined behavior and it seems to be working correctly in this case. The standard says that the behavior is undefined, and the sanitizer emits an error, exactly like it says on the box. – Dietrich Epp Dec 04 '17 at 18:59
@DietrichEpp: Behavior would be undefined on an implementation where `uint32_t*` values with the lower bits are set are trap representations. If one wants to validate that code will work such implementations, a sanitizer that regards such values as trap representations may be useful. I would suggest that a quality implementation which generally documents the bit patterns associated with a type should specify whether any bit patterns will be treated as trap representations, but failure to do so may be a Quality-of-Implementation rather than conformance issue. – supercat Dec 04 '17 at 20:08
Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/160460/discussion-between-dietrich-epp-and-supercat). – Dietrich Epp Dec 04 '17 at 20:25

Load of misaligned address and UBsan finding

2 Answers2