I have the following code to test memory deallocation using a std::list container:

#include <iostream>
#include <list>
#include <string>

#include <boost/bind.hpp>

/* count of elements to put into the container
 */
static const unsigned long SIZE = 50000000;

/* element use for test
 */
class Element
{
public:
    Element()
    : mId(0)
    {}
    Element( long id )
    : mId(id)
    {}
    virtual ~Element()
    {
    }
    inline long getId() const
    {
        return this->mId;
    }

    inline bool operator<( const Element & rightOperand ) const
    {
        return this->mId < rightOperand.mId;
    }

    inline bool isEven() const
    {
        return 0 == ( this->mId & 1 );
    }

private:
    long mId;
};

typedef std::list< Element > Elements;

int main( int argc, char * argv[] )
{
    std::string dummy;
    {
        Elements elements;

        std::cout << "Inserting "<< SIZE << " elements in container" << std::endl;
        std::cout << "Please wait..." << std::endl;

        /* inserting elements
         */
        for( unsigned long i = 0; i < SIZE; ++i )
        {
            elements.push_back( i );
        }

        std::cout << "Size is " << elements.size() << std::endl;
        std::getline( std::cin, dummy); // wait for the user to press Enter

        /* remove even elements
         */
        elements.remove_if( boost::bind( & Element::isEven, _1 ) );

        std::cout << "Size is " << elements.size() << std::endl;
        std::getline( std::cin, dummy);
    }

    std::getline( std::cin, dummy);

    return 0;
}

Running this code gives me the following memory profile:

(memory profile graph)

It looks like gcc is deferring deallocation; at the end of my test program it has no choice and deallocates the memory before returning to the command line.

Why does deallocation happen so late?

I've tried a vector as another container; there the shrink-to-fit trick works and deallocates the freed memory when I expect it to.

gcc 4.5.0, linux 2.6.34

Nelstaar

4 Answers

Most operating systems (including Linux) only hand out memory to processes in quite large chunks, not very small ones; even where smaller system allocations are possible, making many small ones is much more expensive than making a few large ones. Generally, the C++ library will acquire large chunks from the operating system and use its own heap manager to allocate small pieces of them to the program. Once the large chunks have been divided up like that, they will usually not be returned to the operating system; they remain allocated to the process and are reused for future allocations.

list allocates memory in small chunks (one per node), so the allocated memory usually won't be released until the program exits. vector might get its memory as a single large allocation directly from the operating system, in which case it will be released as soon as the vector is destroyed.

Mike Seymour
  • Linux doesn't set any lower limit on the size of an allocation. On the other hand, allocation at the system level can be relatively expensive, so any well written runtime will avoid doing so for small blocks. – James Kanze Nov 08 '11 at 12:12
  • @JamesKanze: I may be wrong, but I don't think you can allocate less than a page (typically 8k) at a time. In any case, you're correct that it would be very expensive to make many small allocations from the system, so you wouldn't want to even if you could. – Mike Seymour Nov 08 '11 at 12:15
  • On my system, vector does release the memory when exiting the scope – Nelstaar Nov 08 '11 at 13:02
  • @Nelstaar: Yes, if it was large enough that its memory came from a single OS allocation, then the memory will be released at that point. If you were to create millions of very small vectors, then you'll probably see the same pattern you see with the `list`. – Mike Seymour Nov 08 '11 at 13:07
  • I can also release some memory using the shrink-to-fit idiom: Elements( elements ).swap( elements ); – Nelstaar Nov 08 '11 at 13:08
  • @Nelstaar: Yes, that allocates a new block of memory, copies the data, then frees the old one; the old block will be released in just the same way as if the vector had been destroyed. – Mike Seymour Nov 08 '11 at 13:10
  • @MikeSeymour You should be able to *request* something less than a page. It's quite possible, and even probable, that when you request four bytes, you get a page. Which would introduce a whole new category of inefficiencies due to bad locality (only using 4 bytes per page). – James Kanze Nov 08 '11 at 13:36

What exactly is your graph showing? The destructor of std::list deallocates all of its memory, so that it can be reused elsewhere in the program, but deallocation won't necessarily return the memory to the system, where it could be used by other processes. Historically, at least, under Unix, once memory was allocated to a process, it remained with that process until the process terminated. Newer algorithms may be able to actually return memory to the OS, but even then, things like fragmentation may prevent them from doing so: if you allocate and then free one really large block, it may be returned, but if you allocate a lot of little blocks (which is what std::list does), the runtime will in fact allocate large blocks from the OS and parcel them out; such a large block cannot be returned until every small block inside it has been freed, and likely won't be returned even then.

James Kanze

It depends on how you're measuring the memory usage. If it's measuring the process memory in use, this is what you might actually expect.

It's quite common for a program to request memory and have that assigned from the controlling environment (such as an operating system) to the process but, when the memory is freed, it doesn't necessarily get taken away from the process. It may be returned to a free pool within the process.

This was the way allocation used to work in the olden days. A call to brk or sbrk would increase the size of the heap by giving more memory to the process. That memory would be added to the arena from which malloc calls were satisfied.

But free would return the memory to the arena, not necessarily back to the operating system.

I imagine something similar is happening in this case.

paxdiablo

Your memory profile is actually showing the process's address space consumption (the sum of mmap-ed pages, as given e.g. by /proc/self/statm or /proc/self/maps from the point of view of the process itself).

But when a C or C++ program releases memory using free or delete (memory previously allocated with malloc or new, which get it from the Linux kernel using mmap), it is not given back to the system with munmap, because that would be too slow or impractical (fragmentation issues); it is just kept as reusable for future malloc or new calls.

So deallocation did happen when free requested it, but the memory was kept for future reuse rather than given back to the system.

If you really want the memory to be given back, write your own allocator (on top of mmap and munmap), but usually it is not worth the effort.

Perhaps using Boehm's GC could help (it is very useful for avoiding the bother of free-ing or delete-ing), especially if you explicitly call GC_gcollect() (but I am not sure of that), but you really should not care that much.

And your question is not technically related to gcc (it would be the same with any other C++ compiler); it is related to malloc and new (i.e. to the standard C and C++ libraries) under Linux.

Basile Starynkevitch