26

I believe I understand how normal variables and pointers are represented in memory if you are using C.

For example, it's easy to understand that a pointer Ptr will have an address, and its value will be a different address, which is the space in memory it's pointing to. The following code:

int main(){
    int x = 10;
    int *Ptr;
    Ptr = &x;
    return 0;
}

Would have the following representation in memory:

+---------------------+-------------+---------+
| Variable Name       | Address     | Value   | 
+---------------------+-------------+---------+
| x                   | 3342        | 10      |
+---------------------+-------------+---------+
| Ptr                 | 5466        | 3342    |
+---------------------+-------------+---------+

However I find it difficult to understand how arrays are represented in memory. For example the code:

#include <stdio.h>

int main(){
    int x[5];
    x[0]=12;
    x[1]=13;
    x[2]=14;

    printf("%p\n",(void*)x);
    printf("%p\n",(void*)&x);

    return 0;
}

outputs the same address twice (for the sake of simplicity 10568). Meaning that x==&x. Yet *x (or x[0] in array notation) is equal to 12, *(x+1) (or x[1] in array notation) is equal to 13 and so on. How can this be represented? One way could be this:

+---------------------+-------------+----------+----------------------+
| Variable Name       | Address     | Value    | Value IF array       |
+---------------------+-------------+----------+----------------------+
| x                   | 10568       | 10568    | 12                   |
+---------------------+-------------+----------+----------------------+
|                     | 10572       |          | 13                   | 
+---------------------+-------------+----------+----------------------+
|                     | 10576       |          | 14                   | 
+---------------------+-------------+----------+----------------------+
|                     | 10580       |          | trash                | 
+---------------------+-------------+----------+----------------------+
|                     | 10584       |          | trash                | 
+---------------------+-------------+----------+----------------------+

Is this close to what happens, or completely off?

Mankarse
Daniel Scocco
  • [Relevant.](http://blogs.oracle.com/ksplice/entry/the_ksplice_pointer_challenge) – emboss Oct 21 '11 at 01:23
  • Out of interest, in the first example why did you give `x` and `Ptr` addresses so far apart? There's nothing to forbid an implementation from doing that, but typically the automatic variables for a given function will all be located close together, on the "stack". – Steve Jessop Oct 21 '11 at 01:33
  • It was just to avoid confusion and highlight the point. I know they are usually much closer. – Daniel Scocco Oct 21 '11 at 01:36
  • I wonder if it's the same for std::vector. I heard that at some point there was some update so that it guaranteed vectors to be contiguous in memory but still I'm wondering what is the memory representation... – jokoon Oct 21 '11 at 07:43

8 Answers

34

An array is a block of contiguous objects with no spaces in between. This means that x in your second example is represented in memory as:

+---------------------+-------------+---------+
| Variable Name       | Address     | Value   | 
+---------------------+-------------+---------+
| x                   | 10568       | 12      |
|                     |             +---------+
|                     |             | 13      |
|                     |             +---------+
|                     |             | 14      |
|                     |             +---------+
|                     |             | ??      |
|                     |             +---------+
|                     |             | ??      |
+---------------------+-------------+---------+

That is, x is five ints big, and has a single address.

The weird part about arrays isn't in how they're stored - it's how they're evaluated in expressions. If you use an array name somewhere that it isn't the subject of the unary & or sizeof operators, it evaluates to the address of its first member.

That is, if you just write x, you will get a value 10568 with type int *.

If, on the other hand you write &x, then the special rule doesn't apply - so the & operator works like it normally does, which means that it fetches the address of the array. In the example, this will be a value 10568 with type int (*)[5].

The reason that x == &x is that the address of the first member of an array is necessarily equal to the address of the array itself, since an array starts with its first member.

caf
22

Your diagram is correct. The weirdness around &x has nothing to do with how arrays are represented in memory. It has to do with array->pointer decay. x by itself in value context decays into a pointer to its first element; i.e., it is equivalent to &x[0]. &x is a pointer to an array, and the fact that the two are numerically equal is just saying that the address of an array is numerically equal to the address of its first element.

Raymond Chen
2

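Yes, you've got it. C finds the indexed value x[y] by reading the element at byte address x + (y * sizeof(type)), where x is the starting address of the array and y * sizeof(type) is the byte offset from it. In C source this is written *(x + y), because pointer arithmetic applies the sizeof(type) scaling automatically. &x[0] is the same address as x.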

Multidimensional arrays are similarly done, so int x[y][z] is going to consume sizeof(int) * y * z memory.

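Because of this you can do some stupid C pointer tricks. It also means that once an array has decayed to a pointer, for example after being passed to a function, its size can no longer be recovered from the pointer; sizeof only gives the full size where the array itself is in scope.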

Schwern
0

A C array is just a block of memory that has sequential values of the same size. When you call malloc(), it is just granting you a block of memory. foo[5] is the same as *(foo + 5).

Example - foo.c:

#include <stdio.h>

int main(void)
{
    int foo[5];
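    /* %p expects a void *; the exact text it prints is implementation-defined */
    printf("&foo[0]: %p\n", (void *)&foo[0]);
    printf("foo: %p\n\n", (void *)foo);
    printf("&foo[3]: %p\n", (void *)&foo[3]);
    printf("foo: %p\n", (void *)(foo + 3));   /* this line prints foo + 3 */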
}

Output:

$ ./foo
&foo[0]: 5fbff5a4
foo: 5fbff5a4

&foo[3]: 5fbff5b0
foo: 5fbff5b0
ObscureRobot
  • Give or take, statically-allocated arrays [aren't identical to pointers](http://www.lysator.liu.se/c/c-faq/c-2.html), just equivalent in usage. (Since the OP is trying to figure out the details of C might as well be pedantic about everything.) – millimoose Oct 21 '11 at 01:28
  • Nope. If we have `int *foo`, `foo + 5` will point to the sixth integer after `foo`. With 8-bit bytes and 32 bit integers, `5 * sizeof(*foo)` will add 20 to `foo`, resulting in accessing the 21st integer after `foo` which may actually be out of bounds. For your statement to be correct, you need `*((int *)(((char *) foo) + 5*(sizeof(*foo)))`, give or take a few parentheses `;-)` – Sinan Ünür Oct 21 '11 at 01:29
0

An array in C is a sequential block of memory in which every member occupies a block of the same size. This is why pointer arithmetic works: you reach any member by applying an offset to the first member's address.

alex
0

The Arrays and Pointers section in the C FAQ has some helpful information.

Sinan Ünür
0

Daniel,

This is not difficult. You have the basic idea, and there is not much difference in the memory representation of arrays. If you declare an array, say

     int main(void){
         int arr[5]={0,1,2,3,4};

         return 0;
     }

You have defined and initialized the array, so the five elements are stored in five adjacent locations in memory. You can observe this by printing the address of each element. Unlike the other primitive data types in C, an array identifier (here, arr) used in most expressions is converted to a pointer to its first element. The idea may seem vague if you are a beginner, but you will feel comfortable with it as you go on.

      printf("%p\n",(void*)arr);

This line will show you the memory address of the first element, arr[0]. It is the same as taking the address of the first element explicitly:

      printf("%p\n",(void*)&arr[0]);

Now you can view the memory locations of all the elements. The following piece of code will do the job.

    int i;
    for(i=0;i<5;i++){
       printf("location of %d is %p\n",arr[i],(void*)&arr[i]);
    }

You will see each address increase in steps of four (if your integers are 32 bits long). So you can easily see how arrays are stored in memory.

You can also try the same thing using pointer notation.

    int i;
    for(i=0;i<5;i++){
       printf("location of %d is %p\n",*(arr+i),(void*)(arr+i));
    }

You will get the same set of answers in both cases, which shows that the two notations are equivalent.

Try the same experiment using different data types (char, float and struct types). You will see how the gap between adjacent elements varies with the size of a single element.

Tharindu Rusira
-1

int x[] produces the same result as int* x;

it's just a pointer

therefore notations x[i] and *(x + i) produce the same result.

  • int x[] is not a pointer at all. It's an array and only the fact that arrays can be treated as pointers to the first array element yields the same result as if it was a pointer in the first place already. – Christian Oct 21 '11 at 08:06
  • [Arrays are not pointers.](http://stackoverflow.com/questions/4810664/how-do-i-use-arrays-in-c) – R. Martinho Fernandes Oct 21 '11 at 08:38