4

Why would you ever want to have an array on the heap? My professor gave us two reasons:

  1. To pass the array to functions, instead of passing a copy
  2. So that the array outlives the scope

Can't both of these be solved instead by:

  1. Passing a pointer to an array on the stack
  2. Returning the value of the array instead of the array itself (i.e. use the copy constructor)

Could someone give me an example of where an array on the heap has to be used?

dfg
  • What do you mean by "the value of the array instead of the array itself"? You pass arrays by passing pointers to their first element. The pointer has to point _to_ something, and if that _something_ is gone when the stack unwinds, you haven't found a way for "the array [to outlive] the scope". – Joshua Taylor Nov 05 '13 at 19:35
  • @JensGustedt I listed my reasons above.... – dfg Nov 05 '13 at 19:35
  • @JoshuaTaylor Returning the array using a copy constructor. – dfg Nov 05 '13 at 19:35
  • 1
    Allocating large arrays on the stack is not a good idea. – Charlie Burns Nov 05 '13 at 19:35
  • Related: http://stackoverflow.com/q/5258724/420683 or for C++: http://stackoverflow.com/q/599308/420683 – dyp Nov 05 '13 at 19:36
  • Copy constructor => C++, not C – dyp Nov 05 '13 at 19:36
  • @DyP But even in C++, there's no _constructor_ for an array, right? dfg, you could use something like a `std::vector`, but a copy constructor still has to make a copy of something, and that something has to be somewhere in memory, and the new object has to be somewhere. Even if the something is on the stack, where's the new array? It's going to be heap allocated, right? – Joshua Taylor Nov 05 '13 at 19:38
  • @JoshuaTaylor There's `std::array` ;) – dyp Nov 05 '13 at 19:39
  • @DyP Some Googling tells me that `std::array` has been around since C++11. No wonder I didn't think of it. Pardon me though, I have to go get some kids off of my lawn. – Joshua Taylor Nov 05 '13 at 19:40
  • @JoshuaTaylor That there's `std::array` now doesn't mean that *arrays* have ctors now. The statement just has become ambiguous ;) ("raw" arrays vs. `std::array`s) – dyp Nov 05 '13 at 19:42
  • 3
    The stack is small, the heap is big. – Guido Nov 05 '13 at 19:43
  • If you load up an array with data, need to queue it off to another thread, and then continue to load more data, you could fill a stack array and copy that large array into a wide queue while holding a lock on the queue (heavy CPU use for the copy, plus the lock is held for a long time, increasing the chance of contention). Another way would be to malloc a heap array, fill it, queue off the pointer, and immediately malloc another one to reseat the local pointer (no bulk copy, and the lock is held only long enough to push one pointer). – Martin James Nov 05 '13 at 21:08

6 Answers

4

Arrays on the heap are used so that they can outlive the scope of the function that creates them. Passing a pointer to an array on the stack is only valid while the function that declared the array is still executing, so it works for handing the array down to callees but not for handing it back up to a caller. You also can't return an array from a function. You can return a pointer to its first element, but if the array was allocated on the stack, that pointer will refer to invalid memory after the function returns.

The 1st reason is wrong: arrays are never passed by copy. When you call a function, an array name always decays into a pointer to its first element, precisely to avoid copying the whole array. If you want to pass an array by copy, you have to embed it inside a struct and pass the struct instead.
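To illustrate the struct trick, here is a minimal sketch. The names `int_array4`, `WRAPPED_LEN`, `zero_first_via_pointer` and `zero_first_via_copy` are made up for this example and are not from the question.

```c
#define WRAPPED_LEN 4 /* hypothetical fixed length for this sketch */

/* Hypothetical wrapper: a struct is copied when passed by value,
   and the array inside it is copied along with it. */
struct int_array4 {
    int data[WRAPPED_LEN];
};

/* Receives a pointer to the first element: writes reach the caller's array. */
void zero_first_via_pointer(int *array) {
    array[0] = 0;
}

/* Receives a copy of the struct: writes only touch the local copy. */
int zero_first_via_copy(struct int_array4 wrapped) {
    wrapped.data[0] = 0;
    return wrapped.data[0];
}
```

Calling `zero_first_via_copy` leaves the caller's `data[0]` untouched, while `zero_first_via_pointer` overwrites the caller's first element.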

Dynamic array allocation is also useful if you don't know the size of your array in advance. C99's variable length arrays cover that case as well, but they are typically allocated on the stack, so you'd have the same lifetime and size problems.
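A small sketch of the "size only known at run time" case, assuming a made-up helper called `make_squares`. The returned array lives on the heap, so it stays valid after the function returns and the caller is responsible for calling `free`.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0*0, 1*1, ..., (count-1)^2,
   or NULL if count is 0 or the allocation fails. The caller must free() it. */
int *make_squares(size_t count) {
    if (count == 0)
        return NULL;

    int *squares = malloc(count * sizeof *squares);
    if (squares == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        squares[i] = (int)(i * i);

    return squares; /* still valid after this function returns */
}
```

A VLA declared inside `make_squares` could hold the same values, but it could not be handed back to the caller.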

Another good reason to use heap allocation is that very big arrays can easily exhaust stack memory. The heap is generally much larger.

Filipe Gonçalves
1

An array in C is represented as a pointer that references the location of the array data (it points to the first item in the array). In the case of stack-based arrays, the array pointer and data are in the same location. In the case of heap-allocated arrays, the array pointer is on the stack and points to the location on the heap where the array data begins.

For point (2), you cannot return the value of the array. What is returned instead is the address of the array data, whether that data lives on the heap or on the stack. Thus, allocating it on the heap ensures that the data is preserved when returning the array from a function.

A std::vector on the other hand works functionally like an array. With this, the array data is allocated on the heap, but the object that manages the array is on the stack. Thus, the lifetime of the array is controlled by the lifetime of the vector object.

The std::vector has the behaviour you describe:

  1. passing a vector by value to a function causes the data to be copied when passing it to the function;

  2. the vector data only lives for the lifetime of the function.

Returning the vector from a function can cause the array data to be copied. However, this can be optimised using things like return value optimisation and rvalue references (move semantics), which avoid the copy.

reece
  • 2
    It might be a good approach to mention the exceptions to this rule: first, the declaration of an array gives you an array, and the declaration of a pointer gives you a pointer - they are not interchangeable. What happens is that array names decay into pointer to first element in expressions, except when: a) it's the operand of `sizeof`; b) you take its address with the `&` operator; c) it is a literal string initializer. And, of course, these exceptions apply to array names. If you're in a function and received it as an argument, you have a pointer, not an array. – Filipe Gonçalves Nov 05 '13 at 19:45
  • "An array in C is represented as a pointer that references the location of the array data (it points to the first item in the array)" - this is not true. No storage is allocated for a separate pointer to the array. The only storage that's allocated is for the array elements themselves; the pointer value is *inferred* from the array expression. – John Bode Nov 05 '13 at 21:41
0
#include <assert.h>
#include <stdlib.h>

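/* Returns a heap-allocated array of two ints; the caller must free() it. */
int *f(int *array) {
    assert(array[0] == 1); // OK: the caller's array is still alive during this call

    int static_array[] = {1, 2, 3}; // automatic storage despite the name: lives in f's stack frame
    (void)static_array;             // silence the unused-variable warning
    // return static_array; // BAD: the array is destroyed when f returns

    int *dynamic_array = malloc(sizeof(int) * 2);
    if (dynamic_array == NULL)
        abort(); // allocation failed
    dynamic_array[0] = 1;
    dynamic_array[1] = 2;
    return dynamic_array; // OK: heap storage outlives this function
}

int main(void)
{
    int static_array[] = {1, 2, 3};
    int *returned_array = f(static_array);
    assert(returned_array[0] == 1);
    free(returned_array);
    return 0;
}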
0

If this code runs without crashing you may allocate all your arrays on the stack.

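#include <stddef.h>

int main(void) {
    /* 64 MiB of automatic storage, far above typical default stack limits */
    volatile char buf[1024 * 1024 * 64];
    /* Write through the volatile lvalue so the stores cannot be optimised away */
    for (size_t i = 0; i < sizeof buf; i++)
        buf[i] = 0;
    return 0;
}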
Joshua
  • 640K = 640*1024 (or 640*1000..) Or is it now "64 MB ought to be enough for everyone"? ;) – dyp Nov 05 '13 at 19:56
  • DyP: Most systems only allow for ~1MB of stack. – Joshua Nov 05 '13 at 20:28
  • Yes, by default. I just tried and tweaked the executable (increase max stack size) to actually make your program run ;) – dyp Nov 05 '13 at 20:58
  • I once used an environment that had no heap or heap-like allocation. While I could dynamically resize arrays, I could only do so from the function that declared them and not its children. – Joshua Nov 05 '13 at 21:10
0

Unless you are required to let the array outlive the scope of the function that declares and initialises it, the compiler can do some optimisations that will most likely end up being more efficient than what a programmer can guess. Unless you have time to benchmark and experiment AND your application is performance critical, leave the optimisation to the compiler.

Sebastien
0

Reasons you would want to allocate an array on the heap instead of the stack:

  1. The array is very large;
  2. The array's lifetime is outside the scope of any one function;
  3. The array size is not known at compile time, and VLAs are either not available or cannot be used in a particular situation (VLAs cannot be declared static or at file scope, for example);
  4. The array is meant to be resizable.
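
As a rough illustration of point 4, here is a sketch of a growable array built on `realloc`. The `int_vec` type and its functions are hypothetical names invented for this example, and the doubling growth policy is just one common choice.

```c
#include <stdlib.h>

/* Hypothetical growable int array, for illustration only. */
struct int_vec {
    int *data;  /* heap storage, NULL while empty */
    size_t len; /* number of elements in use */
    size_t cap; /* number of elements allocated */
};

/* Appends value, doubling the capacity when the storage is full.
   Returns 0 on success, -1 if realloc fails (the vector is left unchanged). */
int int_vec_push(struct int_vec *vec, int value) {
    if (vec->len == vec->cap) {
        size_t new_cap = vec->cap ? vec->cap * 2 : 4;
        int *grown = realloc(vec->data, new_cap * sizeof *grown);
        if (grown == NULL)
            return -1;
        vec->data = grown;
        vec->cap = new_cap;
    }
    vec->data[vec->len++] = value;
    return 0;
}

/* Releases the storage and resets the vector to empty. */
void int_vec_free(struct int_vec *vec) {
    free(vec->data);
    vec->data = NULL;
    vec->len = 0;
    vec->cap = 0;
}
```

Start from `struct int_vec v = {NULL, 0, 0};`, push as many values as needed, and call `int_vec_free(&v)` when done. A stack array or VLA cannot change size after it has been declared, so this pattern needs heap storage.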
John Bode