31

Like the following code :

int size = myGetSize();
std::string* foo;
foo = new std::string[size];
//...
// using the table
//...
delete[] foo;

I heard that such use (not this code precisely, but dynamic allocation as a whole) can be unsafe in some cases, and should be used only with RAII. Why?

Silverspur
  • 719
  • 8
  • 29
  • 7
    Say an exception is thrown before the call to `delete[]`. Then you have *undefined behaviour*. Also, `foo` contains no information re. what it points to (is it a pointer to a `string`? To an array of `strings`? Must `delete` be called? Or should somebody else do thaT?. – juanchopanza Jun 10 '14 at 08:21
  • 21
    Why _undefined behaviour_? Isn't it 'just' a memory leak? – alain Jun 10 '14 at 08:23
  • 1
    just have a look behind the meaning and reasoning of RAII and you will understand why your code is considered unsafe – MatthiasB Jun 10 '14 at 08:23
  • 1
    @alain No, it is *undefined behaviour*. Why? Because the standard says so. – juanchopanza Jun 10 '14 at 08:24
  • 2
    @alain: the Standard says undefined behavior, so it is so. In practice memory leak is the least of the consequences, there is also the fact that destructors will not be executed which in turn can lead to further memory leaks, resources staying locked, ... whatever the destructor was supposed to do. – Matthieu M. Jun 10 '14 at 08:30
  • 10
    @juanchopanza: no, not UB. it's merely a memory leak. for example, a common (and sometimes necessary) practice for singletons is allocate them dynamically and never destroy. – Cheers and hth. - Alf Jun 10 '14 at 08:30
  • 5
    @MatthieuM.: chapter & verse please – Cheers and hth. - Alf Jun 10 '14 at 08:31
  • @Cheersandhth.-Alf Strictly speaking you're right. It is not *always* UB, only if the program depends on side effects of the constructor call. I think a "memory leak" qualifies as one of those situations though. **§ 3.8 p4**. – juanchopanza Jun 10 '14 at 08:38
  • 1
    @MatthieuM. It is in Object Lifetime, **§ 3.8 p4**. – juanchopanza Jun 10 '14 at 08:39
  • @juanchopanza: thanks, just found it myself; I started at `delete expression` and obviously it was not there (since it specifies what the expression does, not what its absence does...) – Matthieu M. Jun 10 '14 at 08:40
  • 2
    @juanchopanza that section refers to "A program may end the lifetime of any object by reusing the storage which the object occupies or by explicitly calling the destructor for an object of a class type with a non-trivial destructor" - neither of those situations occurs in this code. The other text in that paragraph refers to behaviour in those two scenarios – M.M Jun 10 '14 at 08:49
  • 1
    Specifically, the memory allocated by `new std::string[125];` is not re-used for another purpose; so this paragraph is not triggered (that memory is just lost and never accessed again) – M.M Jun 10 '14 at 08:51
  • 2
    @MattMcNabb: The reading is not easy. This section details not what can be done once the lifetime ended, but how a lifetime ends. For objects with a trivial destructor, reusing the storage ends their lifetime, whereas for objects with a non-trivial destructor the lifetime ends when the destructor is executed. If you read the next sentence, it contains *the program is not required to call the destructor explicitly before the storage which the object occupies is reused or **released*** (emphasis mine), so we are not talking only about re-using. – Matthieu M. Jun 10 '14 at 08:52
  • 1
    @MatthieuM The lifetime never ends for the new'd strings. The last sentence says that there is only UB relating to *side-effects produced by the destructor*. But it is not possible that there is any code that relies on these strings being destroyed. – M.M Jun 10 '14 at 08:58
  • 9
    @MatthieuM.: I'm surprised to see you categorizing *memory-leak* as undefined-behaviour. No, it is not. Though UB *might* cause memory leak (such as deleting base class pointer whose destructor is not marked `virtual`), but mere memory leak doesn't invoke UB. – Nawaz Jun 10 '14 at 08:58
  • 2
    @Nawaz: I am not concerned about the memory-leak itself, more about the absence of execution of the destructors. This can easily lock a resource or keep a file handle live for no reason... whatever it is that the destructor was doing. – Matthieu M. Jun 10 '14 at 09:03
  • @MattMcNabb: I am not talking about `std::string` in particular but about the general case of `new[]` and `delete[]` since the OP asked about *not this code precisely, but dynamic allocation as a whole*. – Matthieu M. Jun 10 '14 at 09:05
  • 5
    @MatthieuM.: That is not UB. If you don't release a resource, it doesn't invoke UB. It just *leaks* it. – Nawaz Jun 10 '14 at 09:06
  • @MatthieuM so you are saying that the OP's code actually does not have UB? Anyway I have started a new question [here](http://stackoverflow.com/questions/24137006/does-a-memory-leak-cause-undefined-behaviour) - we shouldn't be having this sort of discussion in comments :) – M.M Jun 10 '14 at 09:09
  • 1
    I don't get the point of the first line of the program. The integer `size` is obtained, but not used in the sequel. Was the `125` two lines below actually meant to be `size`? – Marc van Leeuwen Jun 10 '14 at 14:23
  • Yes indeed! I have corrected that mistake. – Silverspur Jun 10 '14 at 15:33

9 Answers9

49

I see three main problems with your code:

  1. Use of naked, owning pointers.

  2. Use of naked new.

  3. Use of dynamic arrays.

Each is undesirable for its own reasons. I will try to explain each one in turn.

(1) violates what I like to call subexpression-wise correctness, and (2) violates statement-wise correctness. The idea here is that no statement, and not even any subexpression, should by itself be an error. I take the term "error" loosely to mean "could be a bug".

The idea of writing good code is that if it goes wrong, it wasn't your fault. Your basic mindset should be that of a paranoid coward. Not writing code at all is one way to achieve this, but since that rarely meets requirements, the next best thing is to make sure that whatever you do, it Isn't Your Fault. The only way you can systematically prove that it's not your fault is if no single part of your code is the root cause of an error. Now let's look at the code again:

  • new std::string[25] is an error, because it creates a dynamically allocated object which is leaked. This code can only conditionally become a non-error if someone else, somewhere else, and in every case, remembers to clean up.

    This requires, first of all, that the value of this expression be stored somewhere. This is happening in your case, but in more complex expressions it may be hard to prove that it will ever happen in all cases (unspecified evaluation order, I'm looking at you).

  • foo = new std::string[125]; is an error because again foo leaks a resource, unless the stars align and someone remembers, in every case and at the right time, to clean up.

The correct way of writing this code so far would be:

std::unique_ptr<std::string[]> foo(std::make_unique<std::string[]>(25));

Note that every single subexpression in this statement is not the root cause of a program bug. It Is Not Your Fault.

Finally, as for (3), dynamic arrays are a misfeature in C++ and should basically never be used. There are several standard defects relating just to dynamic arrays (and not considered worth fixing). The simple argument is that you cannot use arrays without knowing their size. You might say that you could use a sentinel or tombstone value to mark the end of an array dynamically, but that makes the correctness of your program value-dependent, not type-dependent, and thus not statically checkable (the very definition of "unsafe"). You cannot assert statically that It Wasn't Your Fault.

So you end up having to maintain a separate storage for the array size anyway. And guess what, your implementation has to duplicate that knowledge anyway so it can call destructors when you say delete[], so that's wasted duplication. The correct way, instead, is not to use dynamic arrays, but instead separate memory allocation (and make it customizable via allocators why we're at it) from element-wise object construction. Wrapping all this (allocator, storage, element count) into a single, convenient class is the C++ way.

Thus the final version of your code is this:

std::vector<std::string> foo(25);
Community
  • 1
  • 1
Kerrek SB
  • 428,875
  • 83
  • 813
  • 1,025
  • 3
    Note: there was a proposed `std::dynarray` class (which was put on hold or rejected). Some people argue that `std::vector` stores an extra capacity member and has resizing capabilities which are not needed in a number of cases and a trimmed-down version (without resizing) should exist. – Matthieu M. Jun 10 '14 at 08:44
  • 1
    @MatthieuM.: If you're on the Itanium ABI, `vector` is *still* better than a dynamic array when you have destructors. I agree though that a nice, dynamic, fixed-size array is missing. `dynarray` wasn't quite the right thing (I think it's in an experimental TS now). Boost probably has something appropriate. – Kerrek SB Jun 10 '14 at 08:46
  • IMHO the biggest problem is 0. Makes code much hard to read and maintain. Why do something that takes dozens of lines of code scattered about the file (or multiple files), when you can do the same job (in fact: a better job) in one or two lines? – M.M Jun 10 '14 at 08:52
  • @MattMcNabb: Sure - the less code you have, the less can be your fault. I hope that that came out in my post :-) – Kerrek SB Jun 10 '14 at 09:01
  • 2
    Note that `std::make_unique` is not part of the C++ standard yet (as of C++11). – Cheers and hth. - Alf Jun 10 '14 at 09:01
  • 1
    Re "Finally, as for (3), dynamic arrays are a misfeature in C++ and should basically never be used", that is too absolute advice. Within the world of programming languages,some have to use C++ to create the stuff other use. And similarly within C++, some have to use dynamic arrays and placement new and ungrokable TMP programming etc. to create the stuff others use. When the allowed C++ is reduced to a safe subset like C#, why not just use C# instead? Or Java, whatever. But even those languages are unsafe to many programmers. So on... – Cheers and hth. - Alf Jun 10 '14 at 09:07
  • @Cheersandhth.-Alf: dynamic arrays are not wrong because they aren't "enough like Java". They're wrong because they force you to de-encapsulate two pieces of information that must always go together (the array and its size). (Of course you can always do something that's Wrong if you know that you have to and what you're doing, but that's not the point of giving advice. All advice may be ignored when it doesn't suit.) – Kerrek SB Jun 10 '14 at 09:11
  • 1
    I don't see from where you get your rule 2. What do you even mean by "naked `new`"? – James Kanze Jun 10 '14 at 11:37
  • 2
    @Alf Can you point out a valid use of array new? (I presume that's what he meant by "dynamic arrays".) I've been writing C++ for some 25 years now, including implementing pre-standard containers along the lines of string and vector, and I've never found one. – James Kanze Jun 10 '14 at 11:39
  • @JamesKanze: A "naked `new`" is any `new` that occurs outside a specially curated and vetted list of library functions. As I said, the expression `new T` is a programming error *unless* you can prove that the resulting ownership is dealt with, and the surface of this proof should be constrained to as small a code space as possible. – Kerrek SB Jun 10 '14 at 13:13
  • 1
    @KerrekSB In that case, most `new` will be naked. Typically, as a user, you'll use `new` in two cases: you're implementing some sort of dynamic structure yourself (because none of the library types correspond to what you need), or you need a specific lifetime for the object, which doesn't correspond to anything the compiler understands implicitly. And in that case, it will almost always be a "naked new"; you're typically not in library code, but in the heart of the application. – James Kanze Jun 10 '14 at 15:43
  • @JamesKanze if "you need a specific lifetime for the object, which doesn't correspond to anything the compiler understands implicitly", it would probably be safer to use a unique pointer or shared pointers (depending on the usage) rather than a raw one unless you have really strict size/performance requirements. – JAB Jun 10 '14 at 15:47
  • @JAB If you need a specific lifetime for an object, then `unique_ptr` or `shared_ptr` will not work. The whole point is that the lifetime doesn't correspond to scope (nor to a collection of scopes). – James Kanze Jun 10 '14 at 16:41
  • @JamesKanze But you need to keep track of the pointer somehow, and whatever is keeping track of the pointer will exist in some scope somewhere (or potentially multiple scopes, if you're using shared pointers). In fact, for stuff like that, `weak_ptr` can come in very handy if certain users of the pointer don't necessarily know when it will be deleted (after all, setting a raw pointer to `NULL` after only works when all access is done through that one pointer, if you had created other pointers pointing to the same location there's no way for those pointers to know the first pointer was nulled). – JAB Jun 10 '14 at 17:50
  • @JAB I'm not sure what you mean by "you need to keep track of the pointer". Obviously, the object will be reachable somehow, so that it can receive events, or whatever, from what ever requires it to be around. And when an event occurs which causes its deletion, just as obviously, it will (in its destructor) notify all interested parties. As for `weak_ptr`, if you need it, that's proof that you shouldn't be using `shared_ptr`. – James Kanze Jun 10 '14 at 18:42
9

The code you propose is not exception-safe, and the alternative:

std::vector<std::string> foo( 125 );
//  no delete necessary

is. And of course, the vector knows the size later, and can do bounds checking in debug mode; it can be passed (by reference or even by value) to a function, which will then be able to use it, without any additional arguments. Array new follows the C conventions for arrays, and arrays in C are seriously broken.

As far as I can see, there is never a case where an array new is appropriate.

James Kanze
  • 142,482
  • 15
  • 169
  • 310
8

I heard that such use (not this code precisely, but dynamic allocation as a whole) can be unsafe in some cases, and should be used only with RAII. Why?

Take this example (similar to yours):

int f()
{
    char *local_buffer = new char[125];
    get_network_data(local_buffer);
    int x = make_computation(local_buffer);
    delete [] local_buffer;
    return x;
}

This is trivial.

Even if you write the code above correctly, somebody may come one year later, and add a conditional, or ten or twenty, in your function:

int f()
{
    char *local_buffer = new char[125];
    get_network_data(local_buffer);
    int x = make_computation(local_buffer);
    if(x == 25)
    {
        delete[] local_buffer;   
        return 2;
    }
    if(x < 0)
    {
        delete[] local_buffer; // oops: duplicated code
        return -x;
    }
    if(x || 4)
    {
        return x/4; // oops: developer forgot to add the delete line
    }
    delete[] local_buffer; // triplicated code
    return x;
}

Now, making sure the code has no memory leaks is more complicated: you have multiple code paths and each of them has to repeat the delete statement (and I introduced a memory leak on purpose, to give you an example).

This is still a trivial case, with only one resource (local_buffer), and it (naively) assumes the code throws no exceptions whatsoever, between the allocation and deallocation. The problem leads to unmaintainable code, when your function allocates ~10 local resources, can throw, and has multiple return paths.

More than that, the progression above (simple, trivial case extended to more complex function with multiple exit paths, extended to multiple resources and so on) is a natural progression of code in the development of most projects. Not using RAII, creates a natural way for developers to update the code, in a way that will decrease quality, over the lifetime of the project (this is called cruft, and is a Very Bad Thing).

TLDR: Using raw pointers in C++ for memory management is a bad practice (altough for implementing an observer role, an implementation with raw pointers, is fine). Resource management with raw poiners violates SRP and DRY principles).

utnapistim
  • 24,817
  • 3
  • 41
  • 76
2

There are two major downsides of it -

  1. new does not guarantee that the memory you are allocating is initialized with 0s or null. They will have undefined values unless you initialize them.

  2. Secondly, the memory is dynamically allocated, which means it is hosted in heap not in stack. The difference between heap and stack is that, stacks are cleared when the variable runs out of scope but heaps are not cleared automatically and also C++ does not contain a built in Garbage Collector, which means if any how the delete call is missed you are ended up with a memory leak.

brainless coder
  • 5,932
  • 1
  • 14
  • 34
  • 1
    1. Is not a problem here, and `new` can also be called in a way that zero-initializes built-in types. 2. RAII takes case of this de-allocation "problem". – juanchopanza Jun 10 '14 at 08:30
  • @juanchopanza , the question says "not this code precisely, but dynamic allocation as a whole", so i answered in a whole, not only for this case. – brainless coder Jun 10 '14 at 08:31
  • 1
    Then there is no problem. RAII lets you do dynamic allocation safely. – juanchopanza Jun 10 '14 at 08:31
2

the raw pointer is difficult to handle correctly, e.g. wrt. copying of objects.

it's much simpler and safer to use a well-tested abstraction such as std::vector.

in short, don't needlessly reinvent the wheel – others have already created some superb wheels that you're not likely to match in quality or price

Cheers and hth. - Alf
  • 135,616
  • 15
  • 192
  • 304
1

If the allocated memory is not freed when it's no longer necessary it will result in a memory leak. It is not specified what will happen to the leaked memory, but contemporary operating systems collect it when the program terminates. Memory leaks can be very dangerous because the system may run out of memory.

user3722371
  • 145
  • 6
  • 6
    In general, it is not only memory that can be leaked. It is all kinds of resources (reference counts, file handles etc.) – juanchopanza Jun 10 '14 at 08:25
1

The delete in the end could be skipped. The code shown is not "wrong" in the strictest sense, but C++ offers automatic memory management for variables as soon as their scope is left; using a pointer is not necessary in your example.

Codor
  • 16,805
  • 9
  • 30
  • 51
0

Have the allocation within a try block and the catch block should deallocate all allocated memory thus far and also on normal exit outside the exception block, and catch block should not fall through the normal execution block to avoid double deletion

Dr. Debasish Jana
  • 6,653
  • 3
  • 24
  • 55
-1

See JPL Coding standards. Dynamic memory allocation leads to unpredictable execution. I have seen problems from dynamic memory allocations in perfectly coded systems - that over time, there is memory fragmentation just like a hard disk. Allocating blocks of memory from the heap will take longer and longer, until it becomes impossible to allocate the requested size. At which point in time, you start getting NULL pointers returned and the entire program crashes because few if anyone tests for out-of-memory conditions. It is important to note, that by the book, you may have enough memory available, however the fragmentation thereof is what prevents allocation. This is addressed in .NET CLI, with the use of "handles" instead of pointers, where the runtime can garbage collect, using a mark-and-sweep garbage collector, move the memory around. During the sweep it compacts the memory to prevent fragmentation and updates the handles. Whereas pointers (memory addresses) cannot be updated. This is a problem though, because garbage collection is no longer deterministic. Though, .NET has added mechanisms to make it more deterministic. However, if you follow JPL's advice (section 2.5), you do not need a fancy garbage collection. You dynamically allocate all you need at initialization, then reuse the allocated memory, never freeing it, then there is no fragmentation risk and you can still have deterministic garbage collection.

user9170
  • 790
  • 6
  • 14