Complete encapsulation without malloc

Question

I was experimenting with C11 and VLAs, trying to declare a struct variable on the stack with only an incomplete declaration. The objective is to provide a mechanism to create a variable of some struct type without showing the internals (like the PIMPL idiom) but without the need to create the variable on the heap and return a pointer to it. Also, if the struct layout changes, I don't want to recompile every file that uses the struct.

I have managed to program the following:

private.h:

#ifndef PRIVATE_H_
#define PRIVATE_H_

typedef struct A{
    int value;
}A;

#endif /* PRIVATE_H_ */

public.h:

#ifndef PUBLIC_H_
#define PUBLIC_H_

#include <stddef.h> /* size_t */

typedef struct A A;

size_t A_getSizeOf(void);

void A_setValue(A * a, int value);

void A_printValue(A * a);

#endif /* PUBLIC_H_ */

implementation.c:

#include "private.h"
#include <stdio.h>

size_t A_getSizeOf(void)
{
    return sizeof(A);
}

void A_setValue(A * a, int value)
{
    a->value = value;
}

void A_printValue(A * a)
{
    printf("%d\n", a->value);
}

main.c:

#include <stdalign.h>
#include <stddef.h>

#include "public.h"

#define createOnStack(type, variable) \
    alignas(max_align_t) char variable ## _stack[type ## _getSizeOf()]; \
    type * variable = (type *)&variable ## _stack

int main(int argc, char *argv[]) {
    createOnStack(A, var);

    A_setValue(var, 5335);
    A_printValue(var);
}

I have tested this code and it seems to work. However I'm not sure if I'm overlooking something (like aliasing, alignment or something like that) that could be dangerous or unportable, or could hurt performance. Also I want to know if there are better (portable) solutions to this problem in C.

You cannot sensibly do this without recompiling when the struct layout changes as sizeof will be optimized to a compile time constant if you use a VLA or alloca — Vality, Aug 28 '14 at 00:32
@Vality: look again at the code - that would be a link-time optimization; Mabus' reasoning should be sound — Christoph, Aug 28 '14 at 00:34
@Mabus thanks for posting this. I hadn't thought of using a VLA of chars to provide storage for other types. — Jon Chesterfield, Jun 25 '15 at 23:50

Christoph · Answer 1 · 2014-08-28T07:27:35.083

4

This of course violates the effective typing rules (aka strict aliasing) because the C language does not allow an object of type char [] to be accessed through a pointer that does not have that type (or a compatible one).

You could disable strict aliasing analysis via compiler flags like -fno-strict-aliasing or attributes like

#ifdef __GNUC__
#define MAY_ALIAS __attribute__((__may_alias__))
#else
#define MAY_ALIAS
#endif

(thanks go to R.. for pointing out the latter), but even if you do not do so, in practice everything should work just fine as long as you only ever use the variable's proper name to initialize the typed pointer.

Personally, I'd simplify your declarations to something along the lines of

#define stackbuffer(NAME, SIZE) \
    _Alignas (max_align_t) char NAME[SIZE]

typedef struct Foo Foo;
extern const size_t SIZEOF_FOO;

stackbuffer(buffer, SIZEOF_FOO);
Foo *foo = (void *)buffer;

The alternative would be using the non-standard alloca(), but that 'function' comes with its own set of issues.
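As a minimal single-file sketch of the fragment above: the accessors `Foo_set` and `Foo_get`, the `demo_stackbuffer` function and the layout of `struct Foo` are hypothetical, added only so the example compiles on its own. In a real project the "private side" would sit in a separate translation unit or shared library. The effective-type caveat discussed above still applies; the sketch only follows the advice of using the buffer's name solely to initialise the typed pointer.

```c
#include <stdalign.h>
#include <stddef.h>

/* Public side: what client code is allowed to see. */
typedef struct Foo Foo;
extern const size_t SIZEOF_FOO;
void Foo_set(Foo *foo, int value);   /* hypothetical accessor */
int Foo_get(const Foo *foo);         /* hypothetical accessor */

#define stackbuffer(NAME, SIZE) \
    alignas(max_align_t) char NAME[SIZE]

/* Client code: only the incomplete type and the size object are visible. */
int demo_stackbuffer(int value)
{
    stackbuffer(buffer, SIZEOF_FOO);  /* a VLA: the size is not a constant expression */
    Foo *foo = (void *)buffer;        /* buffer is used only to initialise foo */

    Foo_set(foo, value);
    return Foo_get(foo);
}

/* Private side: would normally live in its own translation unit. */
struct Foo {
    int value;
};

const size_t SIZEOF_FOO = sizeof(struct Foo);

void Foo_set(Foo *foo, int value) { foo->value = value; }
int Foo_get(const Foo *foo) { return foo->value; }
```

Calling `demo_stackbuffer(5335)` stores the value through the opaque pointer and reads it back. Because `SIZEOF_FOO` is an object rather than a macro, client code does not need recompiling when the private struct grows, only relinking.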

edited Aug 28 '14 at 07:27

answered Aug 28 '14 at 00:24

Christoph



If you're willing to assume a GNU-C-compatible compiler with `-fno-strict-aliasing`, then rather than ruin optimization of the whole program with that flag, you should put `__attribute__((__may_alias__))` on the type `Foo`. This should achieve the results just for the one type. – R.. GitHub STOP HELPING ICE Aug 28 '14 at 02:00
Why does it violate strict aliasing? I thought that a pointer to char can alias with any other pointer without problems. – Mabus Aug 28 '14 at 06:54
@Mabus: a pointer to char may alias anything, but this is the opposite case - a pointer to `Foo` aliasing a character array; it's better to think in terms of effective types: the memory block `buffer` has effective type `char[]`, but you're accessing it as `Foo` – Christoph Aug 28 '14 at 07:00
@Christoph I thought that aliasing was commutative. But even if that's not the case, since I'm only going to access the memory through the pointer, can I ignore aliasing? – Mabus Aug 28 '14 at 07:04

@Mabus: aliasing is symmetric, effective typing is not - the rules are that all objects (think memory locations) have a real type, and accessing them through an expression with incompatible type is UB; anyway, if you only use `buffer` to initialize the `Foo*` and not to read or modify the data, you should be fine – Christoph Aug 28 '14 at 07:08
It appears OP wants to extend this to use VLA such that `A_getSizeOf(void)` would not always return the same value in a given executable. This extension appears to not work with the alternative idea of a global `extern const size_t SIZEOF_FOO;`. BTW: nice Q & A. – chux - Reinstate Monica Aug 29 '14 at 14:43
@chux: as I understood it, he only wants compatibility at link-time when releasing a new version of his code; a given version only ever comes with a single definition of `struct Foo` and thus a constant return value of `A_getSizeOf(void)` – Christoph Aug 29 '14 at 15:59

"a given version only ever comes with a single definition of struct Foo" is true, but it is an incomplete definition of `struct Foo` that does not change with new releases. `typedef struct A A;` is the global part and only _pointers_ to type `A` are used globally. The not-"implementation.c" code never sees the inner workings of `A` nor `sizeof(A)`. The size of `A` is hidden and _could_ be dynamic, hence the global function `A_getSizeOf()`. OP's scheme looks intriguing as it appears OP can get away with it. – chux - Reinstate Monica Aug 29 '14 at 16:29
A_getSizeOf(void) must return the same value in the same executable if implementation.c is statically linked (but in that case I expect that I can change the implementation, recompile only the implementation and relink). However, if implementation.c is part of a shared library, I think that changing the implementation won't break the ABI (I haven't tested that yet). AFAIK the same would happen if A_getSizeOf() is a external variable defined in implementation.c, but I'm not 100% sure. – Mabus Aug 29 '14 at 17:16
In fact, as @chux says, using a function is more flexible, and allows the size to change, but that wasn't planned and I can't imagine a possible use of that (maybe a daemon that can be updated on the fly, with some locking for changing libraries? I don't know if that's possible XD). – Mabus Aug 29 '14 at 17:22
@Mabus I now see your "VLA" discussion was only with `variable ## _stack[type ## _getSizeOf()]`. I was thinking you were prepping for a VLA also in `typedef struct A{ int value; char varray[0]; }A;`. – chux - Reinstate Monica Aug 29 '14 at 19:23
That's a "flexible array member", not a true VLA. – Mabus Aug 29 '14 at 20:09
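To make the distinction in the last two comments concrete, here is a small sketch. The names `with_fam`, `fam_struct_size` and `vla_size` are hypothetical, and the member is written with the C99 spelling `[]` rather than the `[0]` form quoted above. A flexible array member is part of a type and contributes no storage to `sizeof`, whereas a VLA is an ordinary automatic object whose size is computed at run time.

```c
#include <stddef.h>

/* C99 flexible array member: part of a type, with no storage of its own. */
struct with_fam {
    int value;
    char varray[];
};

/* sizeof here is a compile-time constant that leaves out the trailing array. */
size_t fam_struct_size(void)
{
    return sizeof(struct with_fam);
}

/* VLA: an automatic object whose length is chosen at run time.
 * n must be greater than zero. */
size_t vla_size(size_t n)
{
    char vla[n];
    vla[0] = 0;
    return sizeof vla;   /* evaluated at run time, yields n */
}
```

The macro in the question relies on the second form: `sizeof` of the buffer is only known once `A_getSizeOf()` has been called.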

score 1 · Answer 2 · answered Jun 25 '15 at 23:44

I am considering adopting a strategy similar to the following to solve essentially the same problem. Perhaps it will be of interest despite being a year late.

I wish to prevent clients of a struct from accessing the fields directly, in order to make it easier to reason about their state and easier to write reliable design contracts. I'd also prefer to avoid allocating small structures on the heap. But I can't afford a C11 public interface - much of the joy of C is that almost any code knows how to talk to C89.

To that end, consider the following application code:

#include "opaque.h"
int main(void)
{
  opaque on_the_stack = create_opaque(42,3.14); /* constructor */
  print_opaque(&on_the_stack);
  delete_opaque(&on_the_stack); /* destructor */
  return 0;
}

The opaque header is fairly nasty, but not completely absurd. Providing both create and delete functions is mostly for the sake of consistency with structs where calling the destructor actually matters.

/* opaque.h */
#ifndef OPAQUE_H
#define OPAQUE_H

/* max_align_t is not reliably available in stddef, esp. in c89 */
typedef union
{
  int foo;
  long long _longlong;
  unsigned long long _ulonglong;
  double _double;
  void * _voidptr;
  void (*_voidfuncptr)(void);
  /* I believe the above types are sufficient */
} alignment_hack;

#define sizeof_opaque 16 /* Tedious to keep up to date */
typedef struct
{
  union
  {
    char state [sizeof_opaque];
    alignment_hack hack;
  } private;
} opaque;
#undef sizeof_opaque /* minimise the scope of the macro */

void print_opaque(opaque * o);
opaque create_opaque(int foo, double bar);
void delete_opaque(opaque *);
#endif
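Whether the union's members really are sufficient can be checked from a C11 translation unit. The sketch below repeats the union and compares it with `max_align_t`; the helper `alignment_shortfall` and the extended union `alignment_hack_ld` are hypothetical names. On some ABIs (for example x86-64 with GCC, where `long double` is 16-byte aligned) the original union is less strictly aligned than `max_align_t`, which only matters if the hidden struct ever contains such a member.

```c
#include <stdalign.h>
#include <stddef.h>

/* Same members as the union in opaque.h. */
typedef union {
    int foo;
    long long _longlong;
    unsigned long long _ulonglong;
    double _double;
    void *_voidptr;
    void (*_voidfuncptr)(void);
} alignment_hack;

/* Hypothetical variant that also covers long double. */
typedef union {
    alignment_hack base;
    long double _longdouble;
} alignment_hack_ld;

/* How far the C89-friendly union falls short of max_align_t (0 if it does not). */
size_t alignment_shortfall(void)
{
    size_t want = alignof(max_align_t);
    size_t have = alignof(alignment_hack);
    return want > have ? want - have : 0;
}
```

A non-zero result from `alignment_shortfall()` does not make the header wrong for a struct holding only an `int` and a `double`; it just marks the limit of what the union guarantees.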

Finally an implementation, which is welcome to use C11 as it's not the interface. _Static_assert(alignof...) is particularly reassuring. Several layers of static functions are used to indicate the obvious refinement of generating the wrap/unwrap layers. Pretty much the entire mess is amenable to code gen.

#include "opaque.h"

#include <stdalign.h>
#include <stdio.h>

typedef struct
{
  int foo;
  double bar;
} opaque_impl;

/* Zero tolerance approach to letting the sizes drift */
_Static_assert(sizeof (opaque) == sizeof (opaque_impl), "Opaque size incorrect");
_Static_assert(alignof (opaque) == alignof (opaque_impl), "Opaque alignment incorrect");

static void print_opaque_impl(opaque_impl *o)
{
  printf("Foo = %d and Bar = %g\n",o->foo,o->bar);
}

static void create_opaque_impl(opaque_impl * o, int foo, double bar)
{
  o->foo = foo;
  o->bar = bar;
}

static void create_opaque_hack(opaque * o, int foo, double bar)
{
   opaque_impl * ptr = (opaque_impl*)o;
   create_opaque_impl(ptr,foo,bar);
}

static void delete_opaque_impl(opaque_impl *o)
{
  o->foo = 0;
  o->bar = 0;
}

static void delete_opaque_hack(opaque * o)
{
   opaque_impl * ptr = (opaque_impl*)o;
   delete_opaque_impl(ptr);
}

void print_opaque(opaque * o)
{
  print_opaque_impl((opaque_impl*)o);
}

opaque create_opaque(int foo, double bar)
{
  opaque tmp;
  size_t i;
  /* Useful to zero out padding */
  for (i=0; i < sizeof (opaque_impl); i++)
    {
      tmp.private.state[i] = 0;
    }
  create_opaque_hack(&tmp,foo,bar);
  return tmp;
}

void delete_opaque(opaque *o)
{
  delete_opaque_hack(o);
}

The drawbacks I can see myself:

Changing the size define manually would be irritating
The casting should hinder optimisation (I haven't checked this yet)
This may violate strict pointer aliasing. Need to re-read the spec.
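One way to sidestep the aliasing question, sketched here on the assumption that copying a small struct on every access is acceptable: move the data in and out of the opaque buffer with `memcpy` instead of casting the pointer. `memcpy` accesses the bytes as characters, which is permitted for any object. The names `wrap`, `unwrap`, `create_opaque_by_copy`, `opaque_get_foo` and `opaque_get_bar` are hypothetical, and the types are repeated in simplified form so the block stands alone.

```c
#include <string.h>

typedef struct {
    union {
        char state[16];      /* same hand-maintained size as in opaque.h */
        long long align_ll;  /* alignment-only members */
        double align_d;
        void *align_p;
    } private;
} opaque;

typedef struct {
    int foo;
    double bar;
} opaque_impl;

_Static_assert(sizeof(opaque) >= sizeof(opaque_impl), "opaque buffer too small");

/* Copy the real struct into the opaque byte buffer. */
static opaque wrap(const opaque_impl *impl)
{
    opaque o;
    memset(&o, 0, sizeof o);                      /* also zeroes any padding */
    memcpy(o.private.state, impl, sizeof *impl);
    return o;
}

/* Copy the bytes back out into a properly typed object. */
static opaque_impl unwrap(const opaque *o)
{
    opaque_impl impl;
    memcpy(&impl, o->private.state, sizeof impl);
    return impl;
}

opaque create_opaque_by_copy(int foo, double bar)
{
    opaque_impl impl;
    impl.foo = foo;
    impl.bar = bar;
    return wrap(&impl);
}

int opaque_get_foo(const opaque *o) { return unwrap(o).foo; }
double opaque_get_bar(const opaque *o) { return unwrap(o).bar; }
```

For structs this small, compilers usually turn the fixed-size `memcpy` calls into plain loads and stores, though that is worth confirming in the generated code before relying on it.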

I am concerned about accidentally invoking undefined behaviour. I would also be interested in general feedback on the above, or whether it looks like a credible alternative to the inventive VLA technique in the question.

The most obvious problem I see is that changing sizeof_opaque breaks binary compatibility. With the VLA implementation, you can change the size of the struct and already-compiled programs still work (assuming you are writing a shared library). Also note that GCC and other compilers have an attribute similar to alignas (aligned), if you prefer to write C99 code with extensions. Finally, in my case, embedding one of these structs inside another struct outside the library requires malloc, because the size isn't known at compile time. — Mabus, Jun 26 '15 at 07:15
@Mabus That's true, changing the size would break binary compatibility. VLA or malloc are probably the only workarounds. On the bright side, composition of opaque structures works nicely when the size is fixed, without the heap. Always tradeoffs! Binary compatibility of libraries is something I need to give (much) more thought to. — Jon Chesterfield, Jun 26 '15 at 09:13
