11

Static locals are guaranteed to be instantiated at first use by the C++ standard. However, I'm wondering what happens if I access a static local object while it is beeing constructed. I assume that this is UB. But what are the best practices to avoid this in the following situation?

A problem situation

The Meyers Singleton pattern uses a static local in a static getInstance() method to construct the object on the first use. Now if the constructor (directly or indireclty) calls getInstance() again, we face a situation where the static initialization is not yet completed. Here is a minimal example, that illustrates the problem situation:

class StaticLocal {
private:
    StaticLocal() {
        // Indirectly calls getInstance()
        parseConfig();
    }
    StaticLocal(const StaticLocal&) = delete;
    StaticLocal &operator=(const StaticLocal &) = delete;

    void parseConfig() {
        int d = StaticLocal::getInstance()->getData();
    }
    int getData() {
        return 1;
    }

public:
    static StaticLocal *getInstance() {
        static StaticLocal inst_;
        return &inst_;
    }

    void doIt() {};
};

int main()
{
    StaticLocal::getInstance()->doIt();
    return 0;
}

In VS2010, this works without problems, but VS2015 deadlocks.

For this simple, reduced situation, the obvious solution is to direclty call getData(), without calling getInstance() again. However, in more complex scenarios (as my actual situation), this solution is not feasible.

Attempting a solution

If we change the getInstance() method to work on a static local pointer like this (and thus abandon the Meyers Singleton pattern):

static StaticLocal *getInstance() {
    static StaticLocal *inst_ = nullptr;
    if (!inst_) inst_ = new StaticLocal;
    return inst_;
}

It is clear that we get an endless recursion. inst_ is nullptr on the first invokation, so we call the constructor with new StaticLocal. At this point, inst_ is still nullptr as it will only get assigned when the constructor finishes. However, the constructor will call getInstance() again, finding a nullptr in inst_, and thus call the constructor again. And again, and again, ...

A possible solution is to move the constructor's body into the getInstance():

StaticLocal() { /* do nothing */ }

static StaticLocal *getInstance() {
    static StaticLocal *inst_ = nullptr;
    if (!inst_) {
        inst_ = new StaticLocal;
        inst_->parseConfig();
    }
    return inst_;
}

This will work. However, I'm not happy with this situation, as a constructor should, well, construct a complete object. It is debateable if this situation can be made an exception, as it is a singleton. However, I dislike it.

But what's more, what if the class has a non-trivial destructor?

~StaticLocal() { /* Important Cleanup */ }

In the above situation, the destructor is never called. We loose RAII and thus one important distinguishing feature of C++! We are in a world like Java or C#...

So we could wrap our singleton in some sort of smart pointer:

static StaticLocal *getInstance() {
    static std::unique_ptr<StaticLocal> inst_;
    if (!inst_) {
        inst_.reset(new StaticLocal);
        inst_->parseConfig();
    }
    return inst_.get();
}

This will correctly call the destructor on program exit. But it forces us to make the destructor public.

At this point, I feel I'm doing the compiler's job...

Back to the original question

Is this situation really undefined behaviour? Or is it a compiler bug in VS2015?

What is the best solution to such a situation, prefably without dropping a full constructor, and RAII?

king_nak
  • 10,729
  • 30
  • 55
  • Might be this: http://stackoverflow.com/questions/32079095/vs2015-c-static-initialization-crash-possible-bug?rq=1 – H. Guijt Mar 09 '16 at 10:51
  • Why do you want your getInstance function to return a pointer? working with references is okay IMO. – iFarbod Mar 09 '16 at 11:19
  • 1
    The parseConfig is a member function, you can write `int d = getData();`. – Marian Spanik Mar 09 '16 at 12:40
  • @H.Guijt That question has to do with wrong CLR/Subsystem settings causing to crash before main is called. In my problem, that's not an issue (Debug, x86, Console App) – king_nak Mar 09 '16 at 12:51
  • @iFarbod You're right. I inherited that code, and that's how it's written. But returning pointer or reference is not an issue here – king_nak Mar 09 '16 at 12:53
  • @MarianSpanik You are right. But I metioned in my question, my actual case is more complex, where other classes used in `parseConfig` call `getData` (and other methods). I could rewrite the code to pass a `StaticLocal` pointer/reference to that classes's constructors, but that would be quite a rework... – king_nak Mar 09 '16 at 12:54
  • 2
    So someone tries to access an object which is not yet constructed, while it is being constructed. What do you *want* to happen? – n. 'pronouns' m. Mar 14 '16 at 15:01

7 Answers7

14

This leads to undefined behaviour by c++ 11 standard. The relevant section is 6.7:

If control enters the declaration concurrently while the variable is being initialized, the concurrent execution shall wait for completion of the initialization. If control re-enters the declaration recursively while the variable is being initialized, the behavior is undefined.

The example from the standard is bellow:

int foo(int i) {
    static int s = foo(2*i); // recursive call - undefined
    return i+1;
}

You are facing the deadlock since MSVC inserts mutex lock/unlock to make static variable initialization thread safe. Once you call it recursively, you are locking the same mutex two times in the same thread, what leads to dead lock.

This is how static initialization is implemented internally in llvm compiler.

The best solution IMO is to do not use singletons at all. Significant group of developers tend to think that singleton is anti-pattern. The issues like you mentioned is really hard to debug, because it occurs before main. Because order of globals initialization is undefined. Also, multiple translation units might be involved, so compiler won't catch this types of errors. So, when I faced the same problem in production code, I was obliged to remove all of the singletons.

If you still think that singleton is the right way to go, then you need to re-structurize your code somehow when your singleton object owns (holds them as members, for example) all the classes that calls GetInstance during the singleton initialization. Think of your classes like ownership tree, where the singleton is the root. Pass reference to parent, when you create a child, if child needs it.

Community
  • 1
  • 1
ivaigult
  • 4,619
  • 5
  • 27
  • 48
7

The problem is that inside the class, you should be using "this" instead of calling getInstance, in particular:

void parseConfig() {
    int d = StaticLocal::getInstance()->getData();
}

Should simply be:

void parseConfig() {
    int d = getData();
}

The object is a singleton because the constructor is private and thus the user cannot construct an arbitrary number of objects. It's bad design to write the whole class assuming there will be only one instance of the object ever. At some point somebody may stretch the concept of a singleton like this:

static StaticLocal *getInstance(int idx) {
    static StaticLocal inst_[3];
    if (idx < 0 || idx >= 3)
      throw // some error;
    return &inst_[idx];
}

When that happens, it's much easier to update the code if there aren't calls to getInstance() throughout the class.

Why do changes like this happen? Imagine you were writing a class 20 years ago to represent the CPU. Of course there will only ever be one CPU in the system, so you make it a singleton. Then, suddenly, multi-core systems become commonplace. You still want only as many instances of the CPU class as there are cores in the system, but you won't know until the program is run how many cores are actually on a given system.

Moral of the story: Using the this pointer not only avoids recursively calling getInstance(), but future proofs your code as well.

Tom Penny
  • 195
  • 3
2

Actually this code in its current form is stuck into 3-way infinite recursion. Hence it will never work.

getInstance() --> StaticLocal()
 ^                    |  
 |                    |  
 ----parseConfig() <---

To let it work, anyone of the above 3 methods has to compromise and come out of the vicious circle. You judged it right, parseConfig() is the best candidate.

Let's assume that all the recursive content of constructor is put into parseConfig() and non-recursive contents are retained in constructor. Then you may do following (only relevant code):

    static StaticLocal *s_inst_ /* = nullptr */;  // <--- introduce a pointer

public:
    static StaticLocal *getInstance() {
      if(s_inst_ == nullptr)
      {   
        static StaticLocal inst_;  // <--- RAII
        s_inst_ = &inst_;  // <--- never `delete s_inst_`!
        s_inst_->parseConfig();  // <--- moved from constructor to here
      }   
      return s_inst_;
    }   

This works fine.

iammilind
  • 62,239
  • 27
  • 150
  • 297
  • 1
    This gives the best balance for the concerns for me. Big plus is that the destructor can remain private, and RAII is used to destruct the singleton – king_nak Mar 22 '16 at 11:16
1

One straight forward way of solving this is to separate the responsibilities, in this case "whatever StaticLocal is supposed to do" and "reading the configuration data"

class StaticLocal;

class StaticLocalData
{
private:
  friend StaticLocal;
  StaticLocalData()
  {
  }
  StaticLocalData(const StaticLocalData&) = delete;
  StaticLocalData& operator=(const StaticLocalData&) = delete;

  int getData()
  {
    return 1;
  }

public:
  static StaticLocalData* getInstance()
  {
    static StaticLocalData inst_;
    return &inst_;
  }
};

class StaticLocal
{
private:
  StaticLocal()
  {
    // Indirectly calls getInstance()
    parseConfig();
  }
  StaticLocal(const StaticLocal&) = delete;
  StaticLocal& operator=(const StaticLocal&) = delete;

  void parseConfig()
  {
    int d = StaticLocalData::getInstance()->getData();
  }

public:
  static StaticLocal* getInstance()
  {
    static StaticLocal inst_;
    return &inst_;
  }

  void doIt(){};
};

int main()
{
  StaticLocal::getInstance()->doIt();
  return 0;
}

This way, StaticLocal does not call itself, the circle is broken.

Also, you have cleaner classes. If you move the implementation of StaticLocal into a separate compile unit, users of static local won't even know that the StaticLocalData thingy exists.

There is a good chance that you will find that you do not need the functionality of StaticLocalData to be wrapped into a Singleton.

Rumburak
  • 3,246
  • 12
  • 25
1

All versions of the C++ standard have a paragraph that makes this undefined behaviour. In C++98, Section 6.7 para 4.

An implementation is permitted to perform early initialization of other local objects with static storage duration under the same conditions that an implementation is permitted to statically initialize an object with static storage duration in namespace scope (3.6.2). Otherwise such an object is initialized the first time control passes through its declaration; such an object is considered initialized upon the completion of its initialization. If the initialization exits by throwing an exception, the initialization is not complete, so it will be tried again the next time control enters the declaration. If control reenters the declaration (recursively) while the object is being initialized, the behavior is undefined.

All subsequent standards have essentially the same paragraph (only differences are inconsequential - such as section numbering for cross-referencing, etc).

What you have done is implement the constructor of your singleton so it calls the function which constructs it. getInstance() creates the object, the constructor (indirectly) calls getInstance(). Hence it runs afoul of the last sentence in the quote above, and introduces undefined behaviour.

The solution, as with anything recursive, is to either reimplement so recursion does not occur, or to prevent interference between the first call and any recursive calls.

There are three ways to achieve this.

The first, which you have said you don't want, is to construct an object and then parse data to initialise it (two-stage construction).

The second is to parse the data first, and only construct the object if the parsed data is valid (i.e. suitable for use in constructing the object).

The third is for the constructor to handle the parsing (which you are trying to do) but, if parsed data is invalid, to force the constructor to fail (which your code does not do).

An example of the third is to leave the getInstance() alone, and restructure the constructor so it never calls getInstance().

static StaticLocalData* getInstance()
{
    static StaticLocalData inst_;
    return &inst_;
}

StaticLocalData::StaticLocalData()
{
    parseConfig();
}

void StaticLocalData::parseConfig()
{
     int data = getData();    // data can be any type you like

     if (IsValid(data))
     {
          //   this function is called from constructor so simply initialise
          //    members of the current object using data
     }
     else
     {
           //   okay, we're in the process of constructing our object, but
           //     the data is invalid.  The constructor needs to fail

           throw std::invalid_argument("Construction of static local data failed");
     }
}

In the above, IsValid() represents a function or expression that checks if the parsed data is valid.

This approach actually exploits the second last sentences in the paragraph I quoted above from the standard. It has the effect of ensuring that calling staticLocal::getInstance() repeatedly will keep resulting in an exception until the parsing succeeds. Once the parsing has succeeded, the object will exist, and no further attempt will be made to it (it's address will simply be returned instead).

If the caller does not catch the exception, the effect is simple - the program will terminate(). If the caller does catch the exception, it should not try to use the pointer.

 try
 {
       StaticLocal *thing = StaticLocal::getInstance();

       //  code using thing here will never be reached if an exception is thrown

 }
 catch (std::invalid_argument &e)
 {
       // thing does not exist here, so can't be used
       //     Worry about recovery, not trying to use thing
 }

So, yes, your approach introduces undefined behaviour. But same part of the standard that makes the behaviour undefined also provides the basis for a solution.

Peter
  • 32,539
  • 3
  • 27
  • 63
-1

see How to implement multithread safe singleton in C++11 without using <mutex>

singleton declaration in c++11 is thread safe by standard. In VS2015 it may be implemented by mutex.

So, you last solution is fully applicable

StaticLocal() { /* do nothing */ }

static StaticLocal *getInstance() {
   static StaticLocal inst_; 
   std::call_once(once_flag, [&inst_]() {inst_.parseConfig(); return &inst_;});
   return &inst_;
}

about destructor: you can register you singleton destructor by using int atexit(void (*function)(void));. This applied in Linux and may be exist in Win too, as function from standard library.

Community
  • 1
  • 1
Sergey_Ivanov
  • 237
  • 1
  • 8
  • This isn't thread-safe - after `inst_` is (safely) initialized to `nullptr`, multiple threads could enter the if block and race to construct "the" singleton. – dhaffey Mar 19 '16 at 21:53
  • yes, you right, sorry for my haste. fix for my answer: `static StaticLocal inst_; std::call_once(once_flag, [&inst_]() {inst_.parseConfig(); return &inst_;})` – Sergey_Ivanov Mar 21 '16 at 06:21
-1

In terms of dtor, I think you don't have to worry about it. Once you define it, then it will be automatically called after main() exits.

robotician
  • 42
  • 2