
When I try this:

#include <functional>
#include <iostream>
#include <memory>
#include <set>

template<class T>
struct MyAlloc
{
    typedef std::allocator<T> Base;
    typedef typename Base::value_type value_type;
    typedef typename Base::pointer pointer;
    typedef typename Base::const_pointer const_pointer;
    typedef typename Base::reference reference;
    typedef typename Base::const_reference const_reference;
    typedef typename Base::size_type size_type;
    typedef typename Base::difference_type difference_type;
    Base a;
    MyAlloc() { }
    template<class U> MyAlloc(MyAlloc<U> const &) { }
    template<class U> struct rebind { typedef MyAlloc<U> other; };
    pointer allocate(size_type n, void const * = NULL)
    {
        std::cout << "Allocating " << n << " objects" << std::endl;
        return this->a.allocate(n);
    }
    void deallocate(pointer p, size_type n) { return this->a.deallocate(p, n); }
};

int main(int argc, char *argv[])
{
    std::set<int, std::less<int>, MyAlloc<int> > set;
}

I see `Allocating 1 objects` printed.

But I don't understand -- why is this heap allocation necessary? Stack memory is sufficient for default-constructing other containers (like `std::vector`), so why do `set` and `map` require heap allocation?

user541686
  • Are you using MS VC++? – Maxim Egorushkin Dec 29 '13 at 21:39
  • @MaximYegorushkin: Yes. – user541686 Dec 29 '13 at 21:41
  • Good question. I don't think it's "necessary". I see no dynamic allocations when I run this on my machine (OS X with clang). –  Dec 29 '13 at 21:44
  • @H2CO3: Interesting... if it wasn't necessary I feel like it'd be such an obvious optimization that Dinkumware would put it in. Makes me wonder why it's not there, I feel like it might break some code but I'm not sure what it would be. – user541686 Dec 29 '13 at 21:46
  • I recall there once was a trick of a `static` global node that somehow marked empty or end instead of `nullptr`, so implementations could avoid extra branches in some cases: if so, maybe this is the fix to that (it had issues) with a per-`map` node instead of a global one... Not sure why it would need to be on the heap, however? – Yakk - Adam Nevraumont Dec 29 '13 at 22:05
  • @Yakk: Actually I think I figured it out! Will post soon... – user541686 Dec 29 '13 at 22:22
  • IIRC, MS VC++/Dinkumware empty collections still allocated memory. Not sure if this is still true. – Maxim Egorushkin Dec 29 '13 at 22:23

2 Answers


The C++ standard certainly doesn't mandate that memory be allocated for a default-constructed object. I don't know why an implementation might choose to allocate in the default constructor of a `std::map<...>`. One reason could be to avoid embedding a potentially large allocator into the stack-allocated object. You'd need to look at the implementation to see why it allocates memory on the heap.

Dietmar Kühl

I think I figured it out myself. Visual C++ seems to be correct, and Clang and GCC seem to be wrong.

It's because `swap` should not invalidate iterators for `std::set` or `std::map`. Consider this code:

#include <set>
#include <iostream>
int main()
{
    std::set<int> a, b;
    std::set<int>::iterator end = a.end();
    a.swap(b);
    b.insert(end, 1);
    std::cout << b.size() << std::endl;
    return 0;
}

If the head node of the tree were stored on the stack (inside the `std::set` object itself), then `end` would be invalidated by the swap.

Visual C++ handles it just fine, but GCC and Clang loop infinitely (at least on my versions).

Edit: The above may have been the reason until C++03 due to an ambiguity, but it is no longer the case since C++11 -- please see the comments below.

user541686
  • Could you cite the corresponding paragraph of the standard mandating that swap should not invalidate iterators for std::set or std::map, please? Will you submit a bug report or shall I do it? – Ali Dec 29 '13 at 22:32
  • Nitpick: Clang and GCC are not C++ library implementations; they can both use libstdc++, and Clang prefers its own libc++. MSVC, in contrast, can use only its own torn-apart Dinkumware implementation. I say torn apart because there are a ton of compiler-bug workarounds present in MSVC's library code. Also, you are wrong (see note 2 [here](http://stackoverflow.com/a/6442829/256138)). – rubenvb Dec 29 '13 at 22:36
  • See the answer above the one I linked to, note 1, for C++03. Also, typing these things on a phone, while possible, is hard and time-consuming ;-) – rubenvb Dec 29 '13 at 22:39
  • @rubenvb: Nice to see that question referenced :) – Lightness Races in Orbit Dec 29 '13 at 23:31
  • @rubenvb: Wow, thanks for the pointer. I guess the explanation is probably that they were trying to avoid breaking programs that relied on `end` iterators staying valid. Thanks! – user541686 Dec 29 '13 at 23:41
  • @Mehrdad that sounds like something Microsoft would do, yes `;-)`. – rubenvb Dec 30 '13 at 12:23