Thread-safe initialization of function-local static const objects

Question

This question made me question a practice I had been following for years.

For thread-safe initialization of function-local static const objects I protect the actual construction of the object, but not the initialization of the function-local reference referring to it. Something like this:

namespace {
  const some_type& create_const_thingy()
  {
     lock my_lock(some_mutex);
     static const some_type the_const_thingy;
     return the_const_thingy;
  }
}

void use_const_thingy()
{
  static const some_type& the_const_thingy = create_const_thingy();

  // use the_const_thingy

}

The idea is that locking takes time, and if the reference is overwritten by several threads, it won't matter.

I'd be interested if this is

safe enough in practice?
safe according to The Rules? (I know, the current standard doesn't even know what "concurrency" is, but what about trampling over an already initialized reference? And do other standards, like POSIX, have something to say that's relevant to this?)

The reason I want to know this is that I want to know whether I can leave the code as it is or whether I need to go back and fix this.

For the inquiring minds:

Many such function-local static const objects I used are maps which are initialized from const arrays upon first use and used for lookup. For example, I have a few XML parsers where tag name strings are mapped to enum values, so I could later switch over the tags' enum values.

Since I got some answers as to what to do instead, but haven't got an answer to my actual questions (see 1. and 2. above), I'll start a bounty on this. Again:
I am not interested in what I could do instead, I do really want to know about this.

I don't see how your question is significantly different from the one you reference. And hasn't this been asked many times before in one form or another? For example, http://stackoverflow.com/questions/1270927/are-function-static-variables-thread-safe-in-gcc. — , Jun 02 '10 at 08:09
@Neil: I am not asking about double-checked locking etc. in general, but specifically about not protecting the assignment of a simple address. I haven't found anything regarding this, but I'd be gladly referred to it, if it exists here. — sbi, Jun 02 '10 at 08:15

score 14 · Accepted Answer · answered Jun 22 '10 at 06:23

This is my second attempt at an answer. I'll only answer the first of your questions:

safe enough in practice?

No. As you're stating yourself you're only ensuring that the object creation is protected, not the initialization of the reference to the object.

In absence of a C++98 memory model and no explicit statements from the compiler vendor, there are no guarantees that writing to the memory representing the actual reference and the writing to the memory that holds the value of the initialization flag (if that is how it is implemented) for the reference are seen in the same order from multiple threads.

As you also say, overwriting the reference several times with the same value should make no semantic difference (even in the presence of word tearing, which is generally unlikely and perhaps even impossible on your processor architecture) but there's one case where it matters: When more than one thread races to call the function for the first time during program execution. In this case it is possible for one or more of these threads to see the initialization flag being set before the actual reference is initialized.

You have a latent bug in your program and you need to fix it. As for optimizations I'm sure there are many besides using the double-checked locking pattern.

Are you really suggesting there might be implementations out there which first set the initialization flag and then assign the pointer? __Edit:__ Oh wait, right. The underlying platform could re-order the writes if it fells like it! That is indeed right. I hadn't thought of this at all, but this really blows it. If nobody comes along and shows the error in this argument, I'll accept it. — sbi, Jun 22 '10 at 07:04
@sbi: I don't even think that the writes will have to be reordered by the hardware; the reference and the flag might be separated in memory enough to reside on different cache lines. — rjnilsson, Jun 22 '10 at 07:39
Yes, that, too, could be a problem. Thanks for answering my question! — sbi, Jun 23 '10 at 11:20

Nikko · Answer 2 · 2010-06-02T09:40:39.707

5

Here is my take (if really you can't initialize it before threads are launched):

I've seen (and used) something like this to protect static initialization, using boost::once

#include <boost/thread/once.hpp>

boost::once_flag flag;

// get thingy
const Thingy & get()
{
    static Thingy thingy;

    return thingy;
}

// create function
void create()
{
     get();
}

void use()
{
    // Ensure only one thread get to create first before all other
    boost::call_once( &create, flag );

    // get a constructed thingy
    const Thingy & thingy = get(); 

    // use it
    thingy.etc..()          
}

In my understanding, this way all threads wait on boost::call_once except one that will create the static variable. It will be created only once and then will never be called again. And then you have no lock any more.

edited Jun 02 '10 at 09:40

answered Jun 02 '10 at 09:14

Nikko

3,792
1
23
42

I have the same understanding, looks very good, though I wonder if it would be possible to wrap it somehow so that `call_once` is not exposed and ensure it cannot be forgotten. – Matthieu M. Jun 02 '10 at 12:06
I know that there are ways to do this properly, and maybe `boost::once` is a cheap way of doing this. However, this doesn't answer my question whether the practice I have been using in the past is good enough for me to not to go back to the old code and fix all that. – sbi Jun 08 '10 at 13:29
To me it's ok, it works, that was just a solution to avoid the lock overhead – Nikko Jun 08 '10 at 13:55

R Samuel Klatchko · Answer 3 · 2010-06-22T07:08:35.007

So, the relevant part of the spec is 6.7/4:

An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization.

Assuming the second part holds (object is initialized the first time control passes through its declaration), your code can be considered thread safe.

Reading through 3.6.2, it appears the early initialization permitted is converting dynamic-initialization to static-initialization. Since static-initialization must happen before any dynamic-initialization and since I can't think of any way to create a thread until you get to dynamic-initialization, such an early initialization would also guarantee the constructor would get called a single time.

Update

So, in respect to calling the some_type constructor for the_const_thingy, your code is correct according to the rules.

This leaves the issue about overwriting the reference which is definitely not covered by the spec. That said, if you are willing to assume that references are implemented via pointers (which I believe is the most common way to do that), then all you are going to do is overwrite a pointer with the value that it already holds. So my take is that this should be safe in practice.

@Christopher: I don't think so. The current standard doesn't even acknowledge the existence of threads, so I don't think that phrase can be interpreted that way. It specifically doesn't say anything about trampling over an already initialized reference. — sbi, Jun 21 '10 at 12:12
If the standard doesn't acknowledge the existence of threads, you can't really make any assumptions about safety at all... — bdonlan, Jun 21 '10 at 15:32
@bdonlan: I wrote about that in my question. Did you even read it properly? — sbi, Jun 22 '10 at 07:28

score 0 · Answer 4 · answered Jun 02 '10 at 09:29

0

I am not standardista...

But for the use you mention, why not simply initialize them before any thread is created ? Many Singletons issues are caused because people use the idiomatic "single thread" lazy initialization while they could simply instantiate the value when the library is loaded (like a typical global).

The lazy fashion only makes sense if you use this value from another 'global'.

On the other hand, another method I've seen was to use some kind of coordination:

'Singleton' to be register their initialization method in a 'GlobalInitializer' object during library load time
'GlobalInitializer' to be called in 'main' before any thread is launched

though I may not be describing it accurately.

answered Jun 02 '10 at 09:29

Matthieu M.

251,718
39
369
642

I also agree, if you don't need the lazy-init, just initialize it before any thread is launched – Nikko Jun 02 '10 at 09:38
Matthieu: This was a several MLoC code base which was mostly developed not thread-aware and needed to be taken to MT. I went through and fixed dozens of function-local static objects in the way I described, because some code already was written this way and I preferred the code to be consistent. It seemed to work, but I am afraid it might break some time in the future, which is why I was asking. Initializing dozens of such function-local statics at app start might work (if I could be sure that all were caught), but would not be wanted. (For most runs, not even half of those were initialized.) – sbi Jun 08 '10 at 13:33

rjnilsson · Answer 5 · 2010-06-22T07:00:31.467

In brief, I think that:

The object initialization is thread-safe, assuming that "some_mutex" is fully constructed when entering "create_const_thingy".
The initialization of the object reference inside "use_const_thingy" is not guaranteed to be thread-safe; it might (as you say) be subject of getting initialized multiple times (which is less of a problem), but it might also be subject to word tearing which could result in undefined behaviour.

[I assume that the C++ reference is implemented as a reference to the actual object using a pointer value, which could in theory be read when partially written to].

So, to try and answer your question:

Safe enough in practice: Very likely, but ultimately depends on pointer size, processor architecture and code generated by the compiler. The crux here is likely to be whether a pointer-sized write/read is atomic or not.
Safe according to the rule: Well, there are no such rules in C++98, sorry (but you knew that already).

Update: After posting this answer I realized that it only focuses on a small, esoteric part of the real problem, and because of this decided to post another answer instead of editing the contents. I'm leaving the contents "as-is" as it has some relevance to the question (and also to humble myself, reminding me to think through things a bit more before answering).

score 0 · Answer 6 · answered Jun 13 '18 at 03:23

I've programmed enough interprocess sockets to have nightmares. In order to make anything thread-safe on a CPU with DDR RAM, you have to cache-line-align the data structure and pack up all of your global variables contiguously into as few cache lines as possible.

The problem with unaligned interprocess data and loosely packed globals are that it causes aliasing from cache misses. In CPUs that use DDR RAM, there are a (usually) a bunch of 64-byte cache lines. When you load a cache line, the DDR RAM will automatically load up a bunch more cache lines, but the first cache line is always the hottest. What happens with interrupts that occur at high speeds is that the cache page will act as a low-pass filter, just like in analog signals, and will filter out the interrupt data leading to COMPLETELY baffling bugs if you're not aware of whats going on. That same thing goes for global variables that are not packed up tightly; if it takes up multiple cache lines it will get out of sync unless you to take a snapshot of the critical interprocess variables and pass them through on the stack and the registers to ensure the data is synced up right.

The .bss section (i.e. where the global variables are stored, will get initialized to all zeros, but the compiler will not cache-line-align the data for you, you will have to do that yourself, which may also a good place to use the C++ Construct in Place. To learn the math behind fastest way to align pointers read this article; I'm trying to figure out if I came up with that trick. Here is what the code will look like:

inline char* AlignCacheLine (char* buffer) {
  uintptr_t offset = ((~reinterpret_cast<uintptr_t> (buffer)) + 1) & (63);
  return buffer + offset;
}

char SomeTypeInit (char* buffer, int param_1, int param_2, int param_3) {
  SomeType type = SomeType<AlignCacheLine (buffer)> (1, 2, 3);
  return 0xff;
}

const SomeType* create_const_thingy () {
  static char interprocess_socket[sizeof (SomeType) + 63],
              dead_byte = SomeTypeInit (interprocess_socket, 1, 2, 3);
  return reinterpret_cast<SomeType*> (AlignCacheLine (interprocess_socket));
}

In my experience, you will have to use a pointer, not a reference.

score -1 · Answer 7 · answered Jun 12 '12 at 20:10

-1

This seems to be the easiest/cleanest approach I can think of without needing all of the mutex shananigans:

static My_object My_object_instance()
{
    static My_object  object;
    return object;
}

// Ensures that the instance is created before main starts and creates any threads
// thereby guaranteeing serialization of static instance creation.
__attribute__((constructor))
void construct_my_object()
{
    My_object_instance();
}

answered Jun 12 '12 at 20:10

Chappelle

65
9

1

But that's C++11, which wasn't released when I wrote this question. – sbi Jun 12 '12 at 20:54
Regardless, if I was reading this thread today I would want to know the best/current solution to the problem. – Chappelle Jun 13 '12 at 13:37
However, I specifically did not ask for a "solution". I asked whether my old code is safe or needs to be fixed. – sbi Jan 01 '13 at 01:04

score -1 · Answer 8 · answered Jun 21 '10 at 07:44

-1

Just call the function before you start creating threads, thus guaranteeing the reference and the object. Alternatively, don't use such a truly terrible design pattern. I mean, why on earth have a static reference to a static object? Why even have static objects? There's no benefit to this. Singletons are a terrible idea.

answered Jun 21 '10 at 07:44

Puppy

138,897
33
232
446

`` What's so hard to understand about __"I am not interested in what I could do instead, I do really want to know about this"?__ I even made it bold and I also gave the exact reason I want to know this... – sbi Jun 21 '10 at 12:06
@sbi: Whoops. I didn't expect that the opening poster was actually still reading replies, since this topic is like, two weeks old. – Puppy Jun 21 '10 at 12:55
1

Just go ahead and _carefully_ read the question. It says that the poster started a bounty on it. (Also, whoever posts a question gets informed of any new answer anyway.) – sbi Jun 21 '10 at 14:24

Thread-safe initialization of function-local static const objects

8 Answers8

Linked