Freaky way of allocating two-dimensional array?

Question

In a project, somebody pushed this line:

double (*e)[n+1] = malloc((n+1) * sizeof(*e));

Which supposedly creates a two-dimensional array of (n+1)*(n+1) doubles.

Supposedly, I say, because so far, nobody I asked could tell me what this does, exactly, nor where it originated from or why it should work (which allegedly, it does, but I'm not yet buying it).

Perhaps I'm missing something obvious, but I'd appreciate it if somebody could explain the above line to me. Because personally, I'd feel much better if we used something we actually understand.

For the record, that *is* the one and only way of allocating an actual 2D array dynamically. — Quentin, Apr 22 '16 at 13:06

score 89 · Accepted Answer · answered Apr 22 '16 at 12:51

The variable e is a pointer to an array of n + 1 elements of type double.

Using the dereference operator on e gives you the base-type of e, which is "array of n + 1 elements of type double".

The malloc call simply takes the base-type of e (explained above), gets its size, multiplies it by n + 1, and passes that size to the malloc function, essentially allocating an array of n + 1 arrays of n + 1 elements of double.
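
As a rough check of the explanation above, here is a minimal sketch that uses the line from the question inside a small function. It assumes a compiler that supports C99 variable-length array types; the helper name `fill_and_sum` is made up for illustration and is not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an (n+1) x (n+1) array the way the
   question does, numbers the elements in storage order, and returns
   the sum of all elements (or -1.0 if malloc fails). */
double fill_and_sum(size_t n)
{
    double (*e)[n + 1] = malloc((n + 1) * sizeof(*e));
    if (e == NULL)
        return -1.0;

    /* *e has type double[n+1], so sizeof(*e) is the size of one row. */
    assert(sizeof(*e) == (n + 1) * sizeof(double));

    double sum = 0.0;
    for (size_t i = 0; i <= n; i++) {
        for (size_t j = 0; j <= n; j++) {
            e[i][j] = (double)(i * (n + 1) + j);   /* ordinary 2D indexing */
            sum += e[i][j];
        }
    }

    free(e);   /* one malloc, one free */
    return sum;
}
```

With `n` equal to 2 the array is 3 x 3 and holds the values 0 through 8, so `fill_and_sum(2)` returns 36.0.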

Some programmer dude


I don't understand. That malloc is not allocating enough space for [n+1]*[n+1] ? – Martin James Apr 22 '16 at 12:57
@MartinJames `sizeof(*e)` is equivalent to `sizeof(double [n + 1])`. Multiply that with `n + 1` and you get enough. – Some programmer dude Apr 22 '16 at 12:58
This even works if `n` is a function argument. At least with `-std=gnu99` or `-std=c99`. – jdarthenay Apr 22 '16 at 13:00
@MartinJames: What's wrong with it? It's not that eye-stabby, it guarantees that allocated rows are contiguous, and you can index it like any other 2D array. I use this idiom a lot in my own code. – John Bode Apr 22 '16 at 13:11
It may seem obvious, but this only works for *square* arrays (same dimensions). – Jens Apr 22 '16 at 17:09
@Jens: Only in the sense that if you put `n+1` for both dimensions, the result will be square. If you do `double (*e)[cols] = malloc(rows * sizeof(*e));`, the result will have whatever number of rows and columns you specified. – user2357112 supports Monica Apr 22 '16 at 21:47
@user2357112 Now that I would much rather see. Even if it means you have to add `int rows = n+1` and `int cols = n+1`. God save us from clever code. – candied_orange Apr 22 '16 at 23:19
I guess such an expression limits the "dynamism" of widening/stretching the array in just either rows or columns, but not both. (Well, for C89 as [this question](http://stackoverflow.com/questions/12462615/how-do-i-correctly-set-up-access-and-free-a-multidimensional-array-in-c?lq=1) states that C99 supports the left-hand `double (*e)[variable]` syntax.) – logo_writer Apr 24 '16 at 00:50
@JoachimPileborg `sizeof(*e)` is technically UB here - http://stackoverflow.com/q/32985424/2757035 - so I've got to prefer Lundin's answer as it shows an infinitely safer way of sizing the `malloc`. – underscore_d Apr 24 '16 at 15:24

Lundin · Answer 2 · 2016-04-22T12:57:25.493

57

This is the typical way you should allocate 2D arrays dynamically.

e is an array pointer to an array of type double [n+1].
sizeof(*e) therefore gives the size of the pointed-at type, which is the size of one double [n+1] array.
You allocate room for n+1 such arrays.
You set the array pointer e to point at the first array in this array of arrays.
This allows you to use e as e[i][j] to access individual items in the 2D array.

Personally I think this style is much easier to read:

double (*e)[n+1] = malloc( sizeof(double[n+1][n+1]) );
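
To make the comparison concrete, the sketch below allocates with the `sizeof(double[n+1][n+1])` spelling and checks that it requests the same number of bytes as the "rows times row size" and flat spellings. It assumes C99 variable-length array types; `alloc_whole_type` is a hypothetical name used only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates with the sizeof(double[n+1][n+1]) style,
   writes the last element, and reports the number of bytes requested
   (0 if malloc fails). */
size_t alloc_whole_type(size_t n)
{
    size_t bytes = sizeof(double[n + 1][n + 1]);

    /* Same byte count as the other two ways of writing the size. */
    assert(bytes == (n + 1) * sizeof(double[n + 1]));
    assert(bytes == (n + 1) * (n + 1) * sizeof(double));

    double (*e)[n + 1] = malloc(bytes);
    if (e == NULL)
        return 0;

    e[n][n] = 1.0;   /* the last element lies inside the allocation */
    free(e);
    return bytes;
}
```

Which spelling to prefer is a style question; the allocation itself is the same in each case.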

edited Apr 22 '16 at 12:57

answered Apr 22 '16 at 12:51

Lundin


Nice answer except I disagree with your suggested style, preferring the `ptr = malloc(sizeof *ptr * count)` style. – chux - Reinstate Monica Apr 22 '16 at 13:35
Nice answer, and I like your preferred style. A slight improvement might be to point out you need to do it this way because there might be padding between the rows that needs to be taken into account. (At least, I think that's the reason you need to do it this way.) (Let me know if I'm wrong.) – davidbak Apr 22 '16 at 17:27
@davidbak There will not be any padding anywhere unless you use structs. Or if you use something like `uint_fast8_t`. – Lundin Apr 22 '16 at 18:41
ok, then why not just `malloc((n+1)*(n+1)*sizeof(double))` then? I thought the issue was having a single general use expression, not the particular problem of a 2d array of doubles? – davidbak Apr 22 '16 at 18:48
@davidbak That's the same thing. The array syntax is merely self-documenting code: it says "allocate room for a 2D array" with the source code itself. – Lundin Apr 22 '16 at 18:52
OK. Though I myself think that `malloc((n+1)*(m+1)*sizeof(double))` is also sufficiently self-documenting, in any case where you're dealing with primitive types thus no padding. And doesn't the decl `double e[n+1][]` work for a ptr to 2D array? – davidbak Apr 22 '16 at 19:07
By the way, isn't `double (*e)[n+1]` a declaration of e to be a pointer to a 1D array of n+1 doubles? For a pointer to a 2D array you need `double (*e)[n+1][]`, I think. – davidbak Apr 22 '16 at 19:08
@davidbak Note: A minor disadvantage of [comment](http://stackoverflow.com/questions/36794202/freaky-way-of-allocating-2dim-array/36794373#comment61176498_36794360) `malloc(row*col*sizeof(double))` occurs when `row*col*sizeof()` overflows, but `sizeof()*row*col` does not. (e.g. row,col are `int`) – chux - Reinstate Monica Apr 22 '16 at 19:51
@chux - let me guess - you like to work with _big data!!_ You're right, of course. – davidbak Apr 22 '16 at 20:28
@davidbak: `sizeof *e * (n+1)` is easier to maintain; if you ever decide to change the base type (from `double` to `long double`, for example), then you only need to change the declaration of `e`; you don't need to modify the `sizeof` expression in the `malloc` call (which saves time and protects you from changing it in one place but not the other). `sizeof *e` will always give you the right size. – John Bode Apr 22 '16 at 23:48
@chux according to the thread linked in the comments on the OP - http://stackoverflow.com/a/32985668/2757035 - that dereference of an 'in-initialisation' pointer is UB. So I prefer Lundin's syntax, and +1 for that and the clearer explanation imho. – underscore_d Apr 24 '16 at 15:18
@underscore_d A weakness to this style is maintenance: the `double` and `n+1` must be kept consistent with the declaration of `e` which may be declared many lines earlier. As with many style issues, what is absolute best is in the eye of the beholder. IMO, the core problem is that OP and the original code author are not coding from a common group's style guide. – chux - Reinstate Monica Apr 26 '16 at 14:16
@chux Yeah, and due to `*e` promoting DRY, I agree it should be fixed in the Standard if possible. I was just bringing up its technical status and my resulting preference for the alternative. I try to avoid things that look nice but might _in theory_ break somewhere - unlikely as that might be. – underscore_d Apr 26 '16 at 15:25
@underscore_d The debate about which is best of `ptr = malloc(sizeof *ptr)` versus `ptr = malloc(constant)` is of such minor importance that it is not even worth mentioning on Stack Overflow. The only way you would screw up either version is if you are doing something mighty strange, in which case your malloc style preference is unlikely to help you anyhow. When programming, DRY will not help you if you don't even know what you are doing. – Lundin Apr 26 '16 at 15:30
@Lundin Agreed, hence my original opinion. I probably wouldn't have ever dreamed up the `*e` syntax since I have no aversion to the alternative, but hey. – underscore_d Apr 26 '16 at 15:34

score 41 · Answer 3 · edited Apr 26 '16 at 18:49

This idiom falls naturally out of 1D array allocation. Let's start with allocating a 1D array of some arbitrary type T:

T *p = malloc( sizeof *p * N );

Simple, right? The expression *p has type T, so sizeof *p gives the same result as sizeof (T), so we're allocating enough space for an N-element array of T. This is true for any type T.

Now, let's substitute T with an array type like R [10]. Then our allocation becomes

R (*p)[10] = malloc( sizeof *p * N);

The semantics here are exactly the same as the 1D allocation method; all that's changed is the type of p. Instead of T *, it's now R (*)[10]. The expression *p has type T which is type R [10], so sizeof *p is equivalent to sizeof (T) which is equivalent to sizeof (R [10]). So we're allocating enough space for an N by 10 element array of R.

We can take this even further if we want; suppose R is itself an array type int [5]. Substitute that for R and we get

int (*p)[10][5] = malloc( sizeof *p * N);

Same deal - sizeof *p is the same as sizeof (int [10][5]), and we wind up allocating a contiguous chunk of memory large enough to hold a N by 10 by 5 array of int.
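
A minimal sketch of the N by 10 by 5 case follows. The inner dimensions are compile-time constants here, so no variable-length array type is involved; the helper name `last_cell_3d` is an assumption made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: allocates an n x 10 x 5 array of int in one call,
   numbers the elements in storage order, and returns the last value
   written (-1 if n is 0 or malloc fails). */
int last_cell_3d(size_t n)
{
    if (n == 0)
        return -1;

    int (*p)[10][5] = malloc(sizeof *p * n);
    if (p == NULL)
        return -1;

    /* *p has type int[10][5], so sizeof *p covers one 10 x 5 slab. */
    assert(sizeof *p == sizeof(int[10][5]));

    int counter = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < 10; j++)
            for (size_t k = 0; k < 5; k++)
                p[i][j][k] = counter++;

    int last = p[n - 1][9][4];
    free(p);
    return last;
}
```

For `n` equal to 2 the block holds 2 * 10 * 5 = 100 ints, so `last_cell_3d(2)` returns 99.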

So that's the allocation side; what about the access side?

Remember that the [] subscript operation is defined in terms of pointer arithmetic: a[i] is defined as *(a + i)¹. Thus, the subscript operator [] implicitly dereferences a pointer. If p is a pointer to T, you can access the pointed-to value either by explicitly dereferencing with the unary * operator:

T x = *p;

or by using the [] subscript operator:

T x = p[0]; // identical to *p

Thus, if p points to the first element of an array, you can access any element of that array by using a subscript on the pointer p:

T arr[N];
T *p = arr; // expression arr "decays" from type T [N] to T *
...
T x = p[i]; // access the i'th element of arr through pointer p

Now, let's do our substitution operation again and replace T with the array type R [10]:

R arr[N][10];
R (*p)[10] = arr; // expression arr "decays" from type R [N][10] to R (*)[10]
...
R x = (*p)[i];

One immediately apparent difference; we're explicitly dereferencing p before applying the subscript operator. We don't want to subscript into p, we want to subscript into what p points to (in this case, the array arr[0]). Since unary * has lower precedence than the subscript [] operator, we have to use parentheses to explicitly group p with *. But remember from above that *p is the same as p[0], so we can substitute that with

R x = (p[0])[i];

or just

R x = p[0][i];

Thus, if p points to a 2D array, we can index into that array through p like so:

R x = p[i][j]; // access the j'th element of arr[i] through pointer p;
               // each arr[i] is a 10-element array of R

Taking this to the same conclusion as above and substituting R with int [5]:

int arr[N][10][5];
int (*p)[10][5] = arr; // expression arr "decays" from type int [N][10][5] to int (*)[10][5]
...
int x = p[i][j][k];

This works just the same if p points to a regular array, or if it points to memory allocated through malloc.
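
The equivalences in this walk-through can be checked with a short sketch, using `int` in place of R and both a regular array and malloc'd memory. The names `ROWS`, `COLS`, `same_element`, and `check_access` are hypothetical and exist only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

enum { ROWS = 3, COLS = 10 };

/* Hypothetical helper: returns 1 if the three spellings of "element j of
   row i" all name the same object. */
int same_element(int (*p)[COLS], size_t i, size_t j)
{
    return &p[i][j] == &(*(p + i))[j]
        && &p[i][j] == *(p + i) + j;
}

/* Hypothetical helper: runs the check on a regular array and on malloc'd
   memory. Returns 1 on success, 0 on mismatch or allocation failure. */
int check_access(void)
{
    int arr[ROWS][COLS] = {{0}};
    int (*p)[COLS] = arr;                       /* arr decays to int (*)[COLS] */
    int (*q)[COLS] = malloc(sizeof *q * ROWS);  /* same type, heap storage */
    if (q == NULL)
        return 0;

    assert(&(*p)[3] == &p[0][3]);   /* *p and p[0] are the same row */

    p[2][7] = 42;
    q[2][7] = 42;
    int ok = arr[2][7] == 42 && (*(q + 2))[7] == 42
          && same_element(p, 2, 7) && same_element(q, 2, 7);

    free(q);
    return ok;
}
```

The function `same_element` takes the same pointer type in both calls, which is the point: code that indexes through `p` does not need to know where the storage came from.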

This idiom has the following benefits:

It's simple - just one line of code, as opposed to the piecemeal allocation method

T **arr = malloc( sizeof *arr * N );
if ( arr )
{
  for ( size_t i = 0; i < N; i++ )
  {
    arr[i] = malloc( sizeof *arr[i] * M );
  }
}

All the rows of the allocated array are *contiguous*, which is not the case with the piecemeal allocation method above;
Deallocating the array is just as easy with a single call to free. Again, not true with the piecemeal allocation method, where you have to deallocate each arr[i] before you can deallocate arr.

Sometimes the piecemeal allocation method is preferable, such as when your heap is badly fragmented and you can't allocate your memory as a contiguous chunk, or you want to allocate a "jagged" array where each row can have a different length. But in general, this is the better way to go.
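
The contrast between the two methods can be sketched as follows, assuming C99 variable-length array types. Both function names are made up for illustration; the second one counts its `free` calls purely to show how the cleanup differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: single-block allocation. Returns 1 if every row
   starts exactly one row's worth of bytes after the previous one, 0 if
   not, and -1 on a zero dimension or allocation failure. */
int rows_are_contiguous(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return -1;

    double (*a)[cols] = malloc(sizeof(double[rows][cols]));
    if (a == NULL)
        return -1;

    assert(sizeof *a == cols * sizeof(double));

    int ok = 1;
    for (size_t i = 0; i + 1 < rows; i++) {
        if ((char *)a[i + 1] - (char *)a[i] != (ptrdiff_t)sizeof *a)
            ok = 0;
    }

    free(a);   /* a single free releases the whole array */
    return ok;
}

/* Hypothetical helper: piecemeal allocation for contrast. Returns the
   number of free calls needed (rows + 1), or -1 on failure. */
int piecemeal_roundtrip(size_t rows, size_t cols)
{
    if (rows == 0 || cols == 0)
        return -1;

    double **arr = malloc(sizeof *arr * rows);
    if (arr == NULL)
        return -1;

    size_t made = 0;
    for (; made < rows; made++) {
        arr[made] = malloc(sizeof *arr[made] * cols);
        if (arr[made] == NULL)
            break;
    }

    int frees = 0;
    for (size_t i = 0; i < made; i++) {   /* each row must be freed first */
        free(arr[i]);
        frees++;
    }
    free(arr);
    frees++;

    return made == rows ? frees : -1;
}
```

In the piecemeal version nothing guarantees where each row lands relative to the others, so no spacing check comparable to the one in `rows_are_contiguous` applies to it.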

¹ Remember that arrays are not pointers - instead, array expressions are converted to pointer expressions as necessary.

+1 I like the way you present the concept: allocating a series of elements is possible for any type, even if those elements are arrays themselves. — logo_writer, Apr 24 '16 at 00:27
Your explanation is really good, but note that the allocation of contiguous memory is not a benefit until you really need it. Contiguous memory is more expensive than non-contiguous memory. For simple 2D arrays there is no difference in memory layout for you (except for the number of the lines for allocation and deallocation), so prefer using non-contiguous memory. — Oleg Lokshyn, Apr 27 '16 at 08:31
@John Bode what is the best way (if it's possible) to return `int (*p)[10][5] = malloc( sizeof *p * N);` from a function? But I want to preserve ar[x][y] notation. — CoR, Dec 11 '20 at 21:33
@CoR: If I understand your question correctly, you'd just return `p`. The function prototype would be `int (*foo(int N))[10][5]` (`foo` is a function that takes an `int` parameter `N` and returns a pointer to a 10x5 array of `int`). — John Bode, Dec 11 '20 at 23:30
@John Bode I need to avoid that `int (*foo(int N))[10][5]` prototype. 10 and 5 will be provided later by the user. Is it possible with this notation to create a C function that "returns" an array, a pointer to a malloced array, or a pointer to pointer? — CoR, Dec 12 '20 at 13:52
