Heap-free pimpl. Incorrect or superstition?

Question

Edit: This question dates from before C++17. These days std::launder or equivalent should be added to the line noise. I don't have time to update the code to match right now.

I am aspiring to separate interface from implementation. This is primarily to protect code using a library from changes in the implementation of said library, though reduced compilation times are certainly welcome.

The standard solution to this is the pointer to implementation idiom, most likely to be implemented by using a unique_ptr and carefully defining the class destructor out of line, with the implementation.

Inevitably this raises concerns about heap allocation. I am familiar with "make it work, then make it fast", "profile then optimise" and such wisdom. There are also articles online, e.g. gotw, which declare the obvious workaround to be brittle and non-portable. I have a library which currently contains no heap allocations whatsoever - and I'd like to keep it that way - so let's have some code anyway.

#ifndef PIMPL_HPP
#define PIMPL_HPP
#include <cstddef>

namespace detail
{
// Keeping these up to date is unfortunate
// More hassle when supporting various platforms
// with different ideas about these values.
const std::size_t capacity = 24;
const std::size_t alignment = 8;
}

class example final
{
 public:
  // Constructors
  example();
  example(int);

  // Some methods
  void first_method(int);
  int second_method();

  // Set of standard operations
  ~example();
  example(const example &);
  example &operator=(const example &);
  example(example &&);
  example &operator=(example &&);

  // No public state available (it's all in the implementation)
 private:
  // No private functions (they're also in the implementation)
  unsigned char state alignas(detail::alignment)[detail::capacity];
};

#endif

This doesn't look too bad to me. Alignment and size can be statically asserted in the implementation. I can choose between overestimating both (inefficient) or recompiling everything if they change (tedious) - but neither option is terrible.

I'm not certain this sort of hackery will work in the presence of inheritance, but as I don't much like inheritance in interfaces I don't mind too much.

If we boldly assume that I've written the implementation correctly (I'll append it to this post, but it's an untested proof of concept at this point so that's not a given), and both size and alignment are greater than or equal to that of the implementation, then does the code exhibit implementation defined, or undefined, behaviour?

#include "pimpl.hpp"
#include <cassert>
#include <vector>

// Usually a class that has behaviour we care about
// In this example, it's arbitrary
class example_impl
{
 public:
  example_impl(int x = 0) { insert(x); }

  void insert(int x) { local_state.push_back(3 * x); }

  int retrieve() { return local_state.back(); }

 private:
  // Potentially exotic local state
  // For example, maybe we don't want std::vector in the header
  std::vector<int> local_state;
};

static_assert(sizeof(example_impl) == detail::capacity,
              "example capacity has diverged");

static_assert(alignof(example_impl) == detail::alignment,
              "example alignment has diverged");

// Forwarding methods - free to vary the names relative to the api
void example::first_method(int x)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));

  impl.insert(x);
}

int example::second_method()
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(this->state)));

  return impl.retrieve();
}

// A whole lot of boilerplate forwarding the standard operations
// This is (believe it or not...) written for clarity, so none call each other

example::example() { new (&state) example_impl{}; }
example::example(int x) { new (&state) example_impl{x}; }

example::~example()
{
  (reinterpret_cast<example_impl*>(&state))->~example_impl();
}

example::example(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  new (&state) example_impl(impl);
}

example& example::operator=(const example& other)
{
  const example_impl& impl =
      *(reinterpret_cast<const example_impl*>(&(other.state)));
  if (&other != this)
    {
      (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
      new (&state) example_impl(impl);
    }
  return *this;
}

example::example(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  new (&state) example_impl(std::move(impl));
}

example& example::operator=(example&& other)
{
  example_impl& impl = *(reinterpret_cast<example_impl*>(&(other.state)));
  assert(this != &other); // could be persuaded to use an if() here
  (reinterpret_cast<example_impl*>(&(this->state)))->~example_impl();
  new (&state) example_impl(std::move(impl));
  return *this;
}

#if 0 // Clearer assignment functions due to MikeMB
example &example::operator=(const example &other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
      *(reinterpret_cast<const example_impl *>(&(other.state)));
  return *this;
}   
example &example::operator=(example &&other) 
{
  *(reinterpret_cast<example_impl *>(&(this->state))) =
          std::move(*(reinterpret_cast<example_impl *>(&(other.state))));
  return *this;
}
#endif

int main()
{
  example an_example;
  example another_example{3};

  example copied(an_example);
  example moved(std::move(another_example));

  return 0;
}

I know that's pretty horrible. I don't mind using code generators though, so it's not something I'll have to type out repeatedly.

To state the crux of this over-long question explicitly, are the following conditions sufficient to avoid UB|IDB?

Size of state matches size of impl instance
Alignment of state matches alignment of impl instance
All five standard operations implemented in terms of those of the impl
Placement new used correctly
Explicit destructor calls used correctly

If they are, I'll write enough tests for Valgrind to flush out the several bugs in the demo. Thank you to any who get this far!

I generally agree with you: techniques like this are crucial for performance, especially in multi-threaded applications (the heap is typically guarded by some variety of mutual exclusion, and potentially many other threads: keeping stuff on the stack with this technique can mean the difference between a multithreaded app scaling and not). — rts1, Nov 17 '15 at 23:49
Avoiding all heap allocations means avoiding **all standard library containers**. including `std::string`, plus avoiding using standard exceptions. That's pretty restrictive. But if it's important enough for that, then I think most certainly it's a good idea to use placement new. There is a [GOTW](http://www.gotw.ca/gotw/028.htm) about precisely this, however. Very much worth looking up. — Cheers and hth. - Alf, Nov 17 '15 at 23:55
What is the point of all those reinterprete casts? You can just forward declare èxample_impl` and store the pointer, that was returned from placement new. Aside from this, im pretty sure your construct is leagle - although i can't cite the relevant part in the standard — MikeMB, Nov 17 '15 at 23:56
Might be relevant: http://stackoverflow.com/questions/20984808/c-strict-aliasing-when-not-using-pointer-returned-by-placement-new — MikeMB, Nov 18 '15 at 00:01
@Cheersandhth.-Alf No, avoiding all heap allocations means avoiding `std::allocator`. You can pass your own allocator to standard library containers that allocate on the stack. — orlp, Nov 18 '15 at 00:15
@orlp: Creating a standard allocator that allocates "on the stack" would be a really bad idea. At the point where the allocator gets to call e.g. "alloca", it's in some nested stack frame. Returning up then yields serious UB for use of the allocated chunk. But with knowledge of the allocation statistics in a given application it can be possible to do better than the default allocator. In particular one may use free lists for certain allocation sizes, or more general a small objects allocator as described in e.g. Alexandrescu's "Modern C++ Design". This is still heap allocation though. — Cheers and hth. - Alf, Nov 18 '15 at 00:23
@Cheersandhth.-Alf No, no UB involved. See for example my implementation here: https://github.com/orlp/libop/blob/master/bits/memory.h#L45. You don't use `alloca`, you simply preallocate a maximum on the stack using `char storage[N];`. — orlp, Nov 18 '15 at 00:26
@orlp: A preallocated area to sub-allocate out of is hardly allocation "on the stack", regardless of whether it's on the machine stack or not (it's 100% irrelevant where that area resides). That's still heap allocation. It's your custom heap allocation. — Cheers and hth. - Alf, Nov 18 '15 at 00:33
Hm, strange idea. `pimpl` used to hide data, because of hide of code is simple via `static` and `anonymous namespaces`, so in your case `pimpl` can only protect from shrinking of new `pimpl` implementation, what is unlikely happened, because of data have tendency to grow. — fghj, Nov 18 '15 at 01:15
@MikeMB the reinterp casts are definitely ugly. Difficult to hide in a free function too, since state is private. Forward declaring `example_impl` means leaking its existence in the header though. Storing a pointer to it in `example` would work, but means carrying around state at runtime instead of resolving at compile time. — Jon Chesterfield, Nov 19 '15 at 13:01
@user1034749 the idea is to present a stable ABI. Depending on intended usage you might even present larger-than-required value for `capacity` to allow for future growth. — M.M, Nov 22 '15 at 21:09

MikeMB · Accepted Answer · 2015-11-22T23:50:05.843

5

Yes, this is perfectly safe and portable code.

However, there is no need to use placement new and explicit destruction in your assignment operators. Aside from it being exception safe and more efficient, I'd argue it's also much cleaner to just use the assignment operator of example_impl:

//wrapping the casts
const example_impl& castToImpl(const unsigned char* mem) { return *reinterpret_cast<const example_impl*>(mem);  }
      example_impl& castToImpl(      unsigned char* mem) { return *reinterpret_cast<      example_impl*>(mem);  }


example& example::operator=(const example& other)
{
    castToImpl(this->state) = castToImpl(other.state);
    return *this;
}

example& example::operator=(example&& other)
{
    castToImpl(this->state) = std::move(castToImpl(other.state));
    return *this;
}

Personally, I also would use std::aligned_storage instead of an manually aligned char array, but I guess thats a matter of taste.

edited Nov 22 '15 at 23:50

answered Nov 22 '15 at 20:56

MikeMB

17,569
7
51
93

Usage of `aligned_storage` is **NOT** matter of taste, with `reinterpret_cast` and not properly align storage you can get crash even on such align tolerance platforms as `x86/amd64`, not mention others like `sparc`, `arm` etc. On `x86/amd64` the problem may lay in usage of `double` in of method of `example_impl`, and emissions by compiler `sse2` instructions – fghj Nov 22 '15 at 23:36
2

@user1034749: So why is this a problem? The alternative in the question is to use a properly aligned unsigned char array via the alignas operator - both solutions are correct. – MikeMB Nov 22 '15 at 23:37
@user1034749 maybe you overlooked `alignas(detail::alignment)` in OP's code? `std::aligned_storage` is a syntactic-sugar wrapper for aligned unsigned char array or similar, it doesn't provide any functionality that the language didn't already offer. – M.M Nov 23 '15 at 00:33
I think I'd rather leave the casts exposed than route them via an unsigned char *. Calling the impl's assignment operator is uncontroversially better though, I shall do so. Thank you. – Jon Chesterfield Dec 03 '15 at 18:54
@JonChesterfield: In the end, I'd assume that in both cases the function calls / local variables would be optimized away, so do whatever you prefer – MikeMB Dec 04 '15 at 00:11

Heap-free pimpl. Incorrect or superstition?

1 Answers1

Linked