-2

My first question wasn't well formulated, so here it is again, this time better asked and explained.

I want to hide the members of a struct while still being able to create the struct on the stack, without dynamic allocation. Most solutions out there use the opaque pointer idiom and dynamic memory allocation, which isn't always desired.

The idea for this example came from the following post:

https://www.reddit.com/r/C_Programming/comments/aimgei/opaque_types_and_static_allocation/

I know that this is probably UB, but I believe it should work fine on most consumer architectures, whether 32-bit or 64-bit.

Now you may tell me that size_t may sometimes be bigger than void *, and that relying on the void * member to force the union's alignment to that of void * may be wrong. That is rarely the case, though; it can happen, but I see it as the exception, not the rule.

Most compilers add padding so that a struct's size is a multiple of 4 or 8, depending on the architecture, and sizeof returns the size including that padding, so sizeof(Vector) and sizeof(RealVector) should be the same. Since Vector and RealVector also have the same alignment, that should be fine too.

If this is UB, how can I create a sort of scratchpad structure in C in a safe manner? In C++ we have alignas, alignof and placement new, which help make this ordeal a lot safer.

If that's not possible in C99, will it be safer in C11 with alignas and alignof?

#include <stdint.h>
#include <stdio.h>

/* In .h */

typedef union Vector {
    uint8_t data[sizeof(void *) + 2 * sizeof(size_t)];
    /* this is here to force the alignment of the union to that of void * */
    void * alignment;
} Vector;

void vector_initialize_version_a(Vector *);
void vector_initialize_version_b(Vector *);
void vector_debug(Vector const *);

/* In .c */

typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

void
vector_initialize_version_a(Vector * const t) {
    RealVector * const v = (RealVector *)t;
    v->data = NULL;
    v->length = 0;
    v->capacity = 8;
}

void
vector_initialize_version_b(Vector * const t) {
    *(RealVector *)t = (RealVector) {
        .data = NULL,
        .length = 0,
        .capacity = 16,
    };
}

void
vector_debug(Vector const * const t) {
    /* keep the const qualifier: this function only reads */
    RealVector const * const v = (RealVector const *)t;
    printf("Length: %zu\n", v->length);
    printf("Capacity: %zu\n", v->capacity);
}

/* In main.c */

int
main(void) {
    /*
    Compiled with:
    clang -std=c99 -O3 -Wall -Werror -Wextra -Wpedantic test.c -o main.exe
    */

    printf("%zu == %zu\n", sizeof(Vector), sizeof(RealVector));

    Vector vector;

    vector_initialize_version_a(&vector);
    vector_debug(&vector);

    vector_initialize_version_b(&vector);
    vector_debug(&vector);

    return 0;
}
João Pires
  • You will need to allocate memory for the `data`. How are you going to do so? – 0___________ Feb 22 '21 at 11:52
  • 1
    Well, I tried to answer your first question. Then as I was typing, you changed the code/goalposts to something else. Then I started to rewrite my answer. And then you deleted the question before I could post it. So I'm done helping you. – Lundin Feb 22 '21 at 11:56
  • Why is this better than a comment in the header that says "Don't use these fields"? – stark Feb 22 '21 at 11:58
  • To me it looks like an X-Y problem – 0___________ Feb 22 '21 at 12:08
  • @Lundin, sorry, I sometimes get a bit frustrated here. I deleted my old question because I thought it wasn't clear enough what my question / problem was. You can rest assured that this is the final version of my question. – João Pires Feb 22 '21 at 12:10
  • 2
    @JoãoPires If people ask you questions in the comments, do not **ignore** them; answer them if you want help. – 0___________ Feb 22 '21 at 12:12

3 Answers

0

Why not simply do this? It avoids the pointer punning.

typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

typedef struct Vector {
    uint8_t data[sizeof(RealVector)];
} Vector;

typedef union
{
    Vector      v;
    RealVector rv;
} RealVector_union;

void vector_initialize_version_a(void * const t) {
    RealVector_union * const v = t;
    v -> rv.data = NULL;
    v -> rv.length = 0;
    v -> rv.capacity = 8;
}

And

0___________
  • Because `RealVector` isn't available publicly, it's an incomplete type. – João Pires Feb 22 '21 at 12:08
  • 2
    @JoãoPires IMO it is an X-Y, non-existent problem. – 0___________ Feb 22 '21 at 12:11
  • Is there something missing after the `/* this is here to force the alignment ... */` comment? – Ian Abbott Feb 22 '21 at 12:12
  • @IanAbbott forgot to delete it. It was OPs comment from the original code – 0___________ Feb 22 '21 at 12:13
  • 1
    But doesn't that make `Vector` unaligned? – Ian Abbott Feb 22 '21 at 12:14
  • @IanAbbott Vector will be correctly aligned. If the default alignment is not good, there are other ways to force the alignment – 0___________ Feb 22 '21 at 12:25
  • 1
    Surely you would want the alignment of `Vector` to be at least as great as that of `RealVector` to be any use? On my machine, `_Alignof(Vector)` is 1, but `_Alignof(RealVector)` is 8. Remember, the caller has a variable `Vector v;` and is calling `vector_initialize_version_a(&v);`. – Ian Abbott Feb 22 '21 at 12:34
  • @IanAbbott IMO it does not matter, as the alignment of the union has to fit all the members, like in these examples: https://godbolt.org/z/nMf5rj https://godbolt.org/z/jYbTj6. That is the reason for having the union. – 0___________ Feb 22 '21 at 13:00
  • @0___________ That only works if the client code is passing you the address of a `RealVector_union` or `RealVector` object, but my understanding is that the client code is passing you the address of a `Vector` object, which you have defined with an alignment of 1. I.e. the client code knows nothing about the `RealVector_union` or `RealVector` types. – Ian Abbott Feb 23 '21 at 14:45
  • Then the safest method is: memcpy – 0___________ Feb 23 '21 at 15:36
0

I'll post my answer from the previous question, which I didn't have time to post :)

Am I safe doing this?

No, you are not. But instead of finding a way to make it safe, just make the build fail when it's not safe:

#include <assert.h>
#include <stdalign.h>
static_assert(sizeof(Vector) == sizeof(RealVector), "");
static_assert(alignof(Vector) == alignof(RealVector), "");

With checks written this way, you will know beforehand when there's going to be a problem, and you can then fix it for the specific environment. And if the checks don't fire, you will know it's fine.

how can I create a sort of scratchpad structure in C in a safe manner?

The only correct way to do it really safely would be a two-step process:

  • first compile a test executable that would output the size and alignment of struct RealVector
  • then generate the header file with the proper structure definition: struct Vector { alignas(REAL_VECTOR_ALIGNMENT) unsigned char data[REAL_VECTOR_SIZE]; };
  • and then continue to compiling the final executable
  • Compilation of test and final executables has to be done using the same compiler options, version and settings and environment.
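
The first step of that process could be sketched roughly as follows. This is only an illustration: `write_vector_config` is a hypothetical helper name, and the macro names are the ones used in the structure definition above. The build would compile this with the same compiler and flags as the library, call the function from a small `main`, and redirect the output into the generated header.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Private layout, copied from the question. */
typedef struct RealVector {
    uint64_t *data;
    size_t length;
    size_t capacity;
} RealVector;

/* Hypothetical helper: writes the two macros that the public header needs.
   Returns 0 on success and -1 on an output error. */
int write_vector_config(FILE *out) {
    assert(out != NULL);
    int n = fprintf(out,
                    "#define REAL_VECTOR_SIZE %zu\n"
                    "#define REAL_VECTOR_ALIGNMENT %zu\n",
                    sizeof(RealVector), alignof(RealVector));
    return n < 0 ? -1 : 0;
}
```

The values it prints depend on the target and compiler options, which is why the test and final builds have to use identical settings.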

Notes:

  • Instead of a union, use a struct with alignas
  • uint8_t is an 8-bit integer. Use char, or better unsigned char, to represent a "byte".
  • sizeof(void*) is not guaranteed to be sizeof(uint64_t*)
  • Regarding the assumption that the maximum alignment is either 4 or 8: on x86_64, alignof(long double) is typically 16.
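
Putting these notes together, a single-file sketch might look like the block below. It assumes C11. The size expression is a guess that RealVector has no padding, and the static_assert lines are there to stop the build if that guess is wrong. The function names `vector_initialize` and `vector_capacity` are hypothetical, and the memcpy approach follows the suggestion made in the comments on another answer; as far as I can tell it sidesteps the pointer-cast question because the opaque bytes are only ever copied.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Public side (vector.h): only raw, suitably aligned bytes are visible. */
typedef struct Vector {
    /* With several alignas specifiers, the strictest one applies. */
    alignas(uint64_t *) alignas(size_t)
    unsigned char data[sizeof(uint64_t *) + 2 * sizeof(size_t)];
} Vector;

/* Private side (vector.c): the real layout, copied from the question. */
typedef struct RealVector {
    uint64_t *data;
    size_t length;
    size_t capacity;
} RealVector;

/* Stop the build if the size or alignment guess above is wrong. */
static_assert(sizeof(Vector) == sizeof(RealVector), "size mismatch");
static_assert(alignof(Vector) == alignof(RealVector), "alignment mismatch");

/* Hypothetical API: copy a RealVector into the opaque storage. */
void vector_initialize(Vector *t) {
    RealVector v = { .data = NULL, .length = 0, .capacity = 8 };
    memcpy(t->data, &v, sizeof v);
}

/* Hypothetical API: copy the RealVector back out to read one field. */
size_t vector_capacity(Vector const *t) {
    RealVector v;
    memcpy(&v, t->data, sizeof v);
    return v.capacity;
}
```

A caller would declare `Vector v;` on the stack, call `vector_initialize(&v);`, and never see RealVector.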
KamilCuk
  • How can `sizeof(void *)` not be guaranteed to be `sizeof(uint64_t *)`? Aren't they both pointers? That's so weird... – João Pires Feb 22 '21 at 12:05
  • There is no _guarantee_. But I doubt we'll ever find such an architecture. POSIX requires all pointers to be of the same size, and the gcc compiler also does that. As you seem to want to target x86 architectures specifically, it will be true. I think there was an architecture that had something along the lines of `sizeof(int*) != sizeof(char*)`, but I can't find it on Google. – KamilCuk Feb 22 '21 at 12:12
0

One possibility is to define Vector as follows in the .h file:

/* In vector.h file */
struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
};

typedef union Vector {
    char data[sizeof(struct RealVector)];
    /* these are here to force the alignment of the union */
    uint64_t * alignment1_;
    size_t alignment2_;
} Vector;

That also defines struct RealVector for use in the vector implementation .c file:

/* In vector.c file */
typedef struct RealVector RealVector;

This has the advantage that the binary contents of Vector actually consist of a RealVector and are correctly aligned. The disadvantage is that a sneaky user could easily manipulate the contents of a Vector via pointer type casting.

A not so legitimate alternative is to remove struct RealVector from the .h file and replace it with an anonymous struct type of the same shape:

/* In vector.h file */
typedef union Vector {
    char data[sizeof(struct { uint64_t * a; size_t b; size_t c; })];
    /* these are here to force the alignment of the union */
    uint64_t * alignment1_;
    size_t alignment2_;
} Vector;

Then struct RealVector needs to be fully defined in the vector implementation .c file:

/* In vector.c file */
typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

This has the advantage that a sneaky user cannot easily manipulate the contents of a Vector without first defining another struct type of the same shape as the anonymous struct type. The disadvantage is that the anonymous struct type that forms the binary representation of Vector is not technically compatible with the RealVector type used in the vector implementation .c file because the tags and member names are different.
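
One way to at least catch drift between the two definitions is to add compile-time checks to the implementation file. The sketch below assumes C11 and borrows the static_assert idea from another answer; it does not make the two types compatible, it only verifies that the size and alignment still match.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* From vector.h: the anonymous-shape variant shown above. */
typedef union Vector {
    char data[sizeof(struct { uint64_t * a; size_t b; size_t c; })];
    /* these are here to force the alignment of the union */
    uint64_t * alignment1_;
    size_t alignment2_;
} Vector;

/* From vector.c: the real definition. */
typedef struct RealVector {
    uint64_t * data;
    size_t length;
    size_t capacity;
} RealVector;

/* Fail the build if someone changes one definition but not the other. */
static_assert(sizeof(Vector) == sizeof(RealVector),
              "Vector and RealVector differ in size");
static_assert(alignof(Vector) == alignof(RealVector),
              "Vector and RealVector differ in alignment");
```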

Ian Abbott