
Look at this code:

struct A {
    short s;
    int i;
};
struct B {
    short s;
    int i;
};

union U {
    A a;
    B b;
};

int fn() {
    U u;
    u.a.i = 1;
    return u.b.i;
}

Is it guaranteed that fn() returns 1?

Note: this is a follow-up question to this.


1 Answer


Yes, this is defined behavior. First, let's see what the standard has to say about A and B. [class.prop]/3 says:

A class S is a standard-layout class if it:

  • has no non-static data members of type non-standard-layout class (or array of such types) or reference,
  • has no virtual functions and no virtual base classes,
  • has the same access control for all non-static data members,
  • has no non-standard-layout base classes,
  • has at most one base class subobject of any given type,
  • has all non-static data members and bit-fields in the class and its base classes first declared in the same class, and
  • [...] (nothing said here has any bearing in this case)
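For the A and B from the question, these conditions can also be checked at compile time. The following is only a sketch: it asks the compiler via `std::is_standard_layout`, and the `Mixed` struct is a hypothetical counterexample (not part of the question) that breaks the "same access control" bullet.

```cpp
#include <type_traits>

// The two structs from the question.
struct A { short s; int i; };
struct B { short s; int i; };

// Both satisfy every bullet quoted above.
static_assert(std::is_standard_layout<A>::value, "A should be standard-layout");
static_assert(std::is_standard_layout<B>::value, "B should be standard-layout");

// Hypothetical counterexample: the non-static data members do not all
// have the same access control, so this is not a standard-layout class.
struct Mixed {
    short s;
protected:
    int i;
};
static_assert(!std::is_standard_layout<Mixed>::value,
              "Mixed should not be standard-layout");
```

If any of these assertions failed, the translation unit would simply not compile.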

So A and B are both standard-layout types. If we look at [class.mem]/23

Two standard-layout struct types are layout-compatible classes if their common initial sequence comprises all members and bit-fields of both classes ([basic.types]).

and [class.mem]/22

The common initial sequence of two standard-layout struct types is the longest sequence of non-static data members and bit-fields in declaration order, starting with the first such entity in each of the structs, such that corresponding entities have layout-compatible types, either both entities are declared with the no_unique_address attribute ([dcl.attr.nouniqueaddr]) or neither is, and either both entities are bit-fields with the same width or neither is a bit-field.

and [class.mem]/25

In a standard-layout union with an active member of struct type T1, it is permitted to read a non-static data member m of another union member of struct type T2 provided m is part of the common initial sequence of T1 and T2; the behavior is as if the corresponding member of T1 were nominated. [ Example:

struct T1 { int a, b; };
struct T2 { int c; double d; };
union U { T1 t1; T2 t2; };
int f() {
  U u = { { 1, 2 } };   // active member is t1
  return u.t2.c;        // OK, as if u.t1.a were nominated
}

— end example ] [ Note: Reading a volatile object through a glvalue of non-volatile type has undefined behavior ([dcl.type.cv]). — end note ]

Then we have that the classes have the same common initial sequence (it comprises all members of both, so they are layout-compatible), are laid out the same, and accessing a member of the non-active type is treated as if the corresponding member of the active type were accessed. So fn() returns 1.
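As a sanity check on a particular implementation, the layout can be inspected with `offsetof`, which is well-defined for standard-layout classes. This is only a sketch: the assertions show what the compiler at hand does, they are not a substitute for the standard wording quoted above, and the messages in them are my own.

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Types from the question.
struct A { short s; int i; };
struct B { short s; int i; };
union U { A a; B b; };

// The union itself must be standard-layout for [class.mem]/25 to apply.
static_assert(std::is_standard_layout<U>::value, "U should be standard-layout");

// Implementation check: both structs have the same size and place
// their members at the same offsets.
static_assert(sizeof(A) == sizeof(B), "sizes are expected to match");
static_assert(offsetof(A, s) == offsetof(B, s), "offsets of s are expected to match");
static_assert(offsetof(A, i) == offsetof(B, i), "offsets of i are expected to match");

int fn() {
    U u;
    u.a.i = 1;      // the assignment makes u.a the active member
    return u.b.i;   // read through the common initial sequence
}
```

Calling `fn()` here yields 1, in line with the reasoning above.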

  • Is it guaranteed that `offsetof` of `A::i` and `B::i` are the same? The text just says "OK, ... nominated". Or what guarantees that 1 will be returned? It seems logical of course, I just fail to see the guarantee of it. Even in the presence of common initial sequence. I don't see guaranteed that the common initial sequence laid out the same. Or is it guaranteed somewhere? – geza Oct 29 '18 at 18:00
  • @geza It is in the text above the code: *the behavior is as if the corresponding member of T1 were nominated.* – NathanOliver Oct 29 '18 at 18:02
  • @geza Let me add a little more to show the classes have to be laid out the same – NathanOliver Oct 29 '18 at 18:03
  • @NathanOliver does `u` have an active member in the OP's question? – Jans Oct 29 '18 at 18:17
  • @Jans From my reading `u.a.i = 1;` makes it active: https://timsong-cpp.github.io/cppwp/class.union#5 – NathanOliver Oct 29 '18 at 18:36
  • @geza I've updated the answer. Let me know if anything is still unclear to you. – NathanOliver Oct 29 '18 at 18:36
  • @NathanOliver: thanks for the answer! This is still not clear to me: "Then we have that the classes have the same initial common sequence, are laided out the same". Why? Where is it written that a common initial sequence is laid out the same? – geza Oct 29 '18 at 18:50
  • @geza [class.mem]/23: *Two standard-layout struct types are layout-compatible classes if their common initial sequence comprises all members and bit-fields of both classes ([basic.types]).* – NathanOliver Oct 29 '18 at 18:55
  • @NathanOliver: where does this statement say this? – geza Oct 29 '18 at 18:59
  • @geza They are *layout-compatible* so they are laid out the same, at least for the initial common sequence. – NathanOliver Oct 29 '18 at 19:02
  • So, are you saying that the word *layout-compatible* implies this? I'm not sure that I agree with this. Usually, this kind of thing is specified explicitly. – geza Oct 29 '18 at 19:04
  • @geza I think I need to add [\[basic.types\]/11](https://timsong-cpp.github.io/cppwp/basic.types#11): *Two types cv1 T1 and cv2 T2 are layout-compatible types if T1 and T2 are the same type, layout-compatible enumerations, or layout-compatible standard-layout class types.* Does that make more sense to you? – NathanOliver Oct 29 '18 at 19:08
  • Unfortunately, no. I don't see why *layout-compatible* means "members of layout-compatible types are laid out the same". Of course, it is the only sane behavior, but it is still not specified this way. At least, I don't see where it is specified like this. – geza Oct 29 '18 at 19:24
  • @NathanOliver: If a compiler meaningfully upholds the Common Initial Sequence guarantee, taking the address of `someUnion.struct1.memberX` should yield a pointer that can be used to view the live state of `someUnion.struct2.memberX` if both struct members are matching parts of a Common Initial Sequence. The easiest way to guarantee that would be to place the fields at matching offsets, though I don't know if the Standard would forbid an implementation from using some other weird and wacky means instead. – supercat Oct 31 '18 at 20:34