57

When we subtract one pointer from another, the difference is not the number of bytes between them but the number of integers (if they point to integers) between them. Why is that?

eje211
  • 10
    it is an abstraction. This way you do not need to know the size of the thing that the pointer points to, you can rely on the fact that p-- or p++ will move to the last or next object. – Ed S. Jul 13 '10 at 20:06

8 Answers

76

The idea is that you're pointing to blocks of memory

+----+----+----+----+----+----+
| 06 | 07 | 08 | 09 | 10 | 11 | mem
+----+----+----+----+----+----+
| 18 | 24 | 17 | 53 | -7 | 14 | data
+----+----+----+----+----+----+

If you have int* p = &(array[5]) then *p will be 14. Going p=p-3 would make *p be 17.

So if you have int* p = &(array[5]) and int *q = &(array[3]), then p-q should be 2, because the pointers point to blocks of memory that are 2 blocks apart.

When dealing with raw memory (arrays, lists, maps, etc) draw lots of boxes! It really helps!

corsiKa
  • 15
    By the way, if I can stress one thing, `pointers are hard`. Yes, seasoned C++ veterans can whip them in shape with ease, but learning exactly when and how to use them effectively is not just a simple task. – corsiKa Jul 13 '10 at 15:11
  • 12
    @glowcoder: so can seasoned C coders - which is perhaps more apposite in a question tagged C. – Jonathan Leffler Jul 13 '10 at 15:15
  • @Neil: FWIW, it took me a fair amount of thinking to "get" pointers, and I felt like I was doing something not quite normal for at least months afterwards. This was after learning about four assembly languages, so it wasn't a lack of knowledge of the underlying mechanism. Of course, this was in Pascal, where pointers looked like afterthoughts and tended not to be mentioned much in the books. – David Thornley Jul 13 '10 at 15:21
  • @David I learned assembler before C, and the transition seemed very simple. I still have difficulties with them when I go back to Delphi though - mostly trying to remember the wretched syntax. –  Jul 13 '10 at 15:23
  • @Paul you're right, because I'm using array syntax, it automatically dereferences it for me... I'll fix that :-) @Jonathan You're absolutely right on that! @Neil You're probably a smart guy, who is fortunate enough to deal with other smart guys. That doesn't change the fact that pointers are hard. :-) – corsiKa Jul 13 '10 at 15:28
  • @glowcoder: True; in fact, even seasoned C veterans have troubles with pointers sometimes, though they'd never admit it. They'd also never admit how insanely often they spend hours tracking down pointer-related bugs (yes, even *with* Valgrind)... – BlueRaja - Danny Pflughoeft Jul 13 '10 at 15:32
  • 3
    @glowcoder: you don't need the braces, e.g. `int *p = &array[5];` is perfectly fine, as is `int *p = array + 5;`. – Paul R Jul 13 '10 at 15:45
  • @David Thornley: Then no wonder you had trouble getting them. Pointers in Pascal have horrible syntax which makes them a nightmare to use. When I first tried to learn pointers (this was in Pascal), I tried for **four months** and had to give up, as I simply couldn't get what this all was about. Then, about a year later I learned C. The pointers were simply there, and I grokked them with ease, without really thinking about it. They just *made sense* from the beginning. Sometimes I wonder, do all these people who complain about pointers being "hard" have a Pascal background? – slacker Jul 13 '10 at 15:59
  • @slacker no, they either have a Java background, or they have no background. Note the homework tag. :-) – corsiKa Jul 13 '10 at 16:02
  • This is a bad explanation. The "mem" numbers are not addresses, unless the elements are byte-sized (e.g., char type), which is the worst possible way to show what is happening. With "mem" you actually mean the `array` variable and its indexes. An actual explanation would need to show both memory addresses and `array` variable indexes and have elements of more than byte size. The actual explanation is that subtraction of pointers divides the address difference with pointer element size, to keep the result in high-level language context. In fact, this is not helping clarity, for assembly people – kavadias Oct 11 '16 at 14:09
  • In addition, the explanation claims that `p=&array[5]`, for some strange reason, makes `*p` equal to 14, when `array[5]` is not shown(!!) and 14 is only the value of `array[11]`, which is totally irrelevant, at least in C. In C, you cannot have an array that starts from a positive index and `array[-1]` does not equal `array[sizeof(array)/sizeof(typeof(array[0]))-1]`, but in fact will go somewhere outside (before) `array`, in your stack or global memory! That is exactly the reason it is bad for C to have high-level semantics for pointer subtraction. Don't know what background you guys have... – kavadias Oct 11 '16 at 14:29
  • @kavadias I'm not sure what you're looking for. The array is 0 indexed. The first element in the array is on the far left. It is index 0. The address of this first element is 6, and the value is 18. The sixth element in the array (aka `array[5]`) is the farthest right. It is address 11 and its value is 14. I'm afraid you have failed to understand what my diagram represents. – corsiKa Oct 11 '16 at 17:49
  • @corsiKa, even so, I find the explanation bad, as I have already mentioned, because pointer subtraction *involves* the pointed type size, which disappears from your explanation, because you picked the marginal case of a char-type array. Making array-element addresses increase by one is misleading for me, anyway. – kavadias Nov 04 '16 at 15:51
  • @kavadias I don't know what you're talking about. Look at this code I just set up. I use ints, but I only have to add 2 to make it go up 2 in the array, not 8. http://ideone.com/Gz1odz – corsiKa Nov 04 '16 at 16:44
  • @corsiKa Maybe it'll help you understand what I am saying, if you print pointer `p` *in your code*, before and after adding 2. You will find that `p` increased by 8 (p is int *), in fact. That is what your answer does not explain! Thus, if `q = (unsigned long)p` for `p += 2`, the C compiler actually does `q += 2*sizeof(typeof(*p)); p = (int *)q` (only unsigned long is the actual form of most values in the machine and no surrounding assignments are needed). The involvement of sizeof(typeof(*p)) cannot be understood from your reply. – kavadias Nov 07 '16 at 01:00
  • @kavadias Why would I explain to OP something he already knows? I mean, you did read his question, right? Look, I'm done with this. I stand by my answer. If you want to edit it to add more about sizing, be my guest. That's what SE is for. – corsiKa Nov 07 '16 at 15:51
  • @corsiKa, I am not going to edit your answer, because that would require changing the byte-element array to something more ordinary (e.g., ints), and that might be too intrusive. As for why you should have explained the real mechanism (instead of what appears as an explanation, because of the byte-element array), think of this: if I only change the addressing of your array, so that elements are 4-byte each, does your reply make sense? – kavadias Nov 08 '16 at 14:21
39

Because everything in pointer-land is about offsets. When you say:

int array[10];
array[7] = 42;

What you're actually saying in the second line is:

*( &array[0] + 7 ) = 42;

Literally translated as:

* = "what's at"
(
  & = "the address of"
  array[0] = "the first slot in array"
  plus 7
)
set that thing to 42

And if we can add 7 to make the offset point to the right place, we need to be able to have the opposite in place, otherwise we don't have symmetry in our math. If:

&array[0] + 7 == &array[7]

Then, for sanity and symmetry:

&array[7] - &array[0] == 7
eruciform
11

So that the answer is the same even on platforms where integers are different lengths.

ptomato
  • 2
    So how do I remember this stinky fact :( Drives me dead :( –  Jul 13 '10 at 15:06
  • But when you perform addition on pointers, like *p+2, if it's an int pointer it moves 4 positions forward. Why? –  Jul 13 '10 at 15:09
  • 2
    @fahad: if you want the byte count between two addresses, you can cast the two addresses to `char *` and then take the difference. Where do you get the idea that the alternative is better? Did you program in assembler before C? The C mechanism is sound - array indexing depends on it (because a subscripting operation adds N to the base address, but that is N units of the size of what is in the array). – Jonathan Leffler Jul 13 '10 at 15:10
  • @fahad: Your '*p+2' example: if p is a 'short **p', then '*p+2' would move forward 4 bytes on most machines (all where sizeof(short)==2). If you meant '*(p+2)' which is equivalent to 'p[2]', then if your declaration is 'short *p', the address would be 4 bytes forward. Note that 'a[i]' === '*(a+i)' (and, for the IOCCC fanatics, 'a[i]' === 'i[a]'). – Jonathan Leffler Jul 13 '10 at 15:13
  • 1
    @fahad - *p+2 adds 2 to the contents of the memory pointed to by p. With *(p+2) you access the contents of the memory at byte address p + 2*sizeof(*p). It depends on the type that p points to. – Praveen S Jul 13 '10 at 15:14
  • 1
    it makes life a lot easier for the programmer, consider how much more work it would be if it was the alternative: if something in the program would change or you want to make it run correctly on different systems... – Emile Vrijdags Jul 13 '10 at 15:19
7

Say you have an array of 10 integers:

int intArray[10] = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };

Then you take a pointer to intArray:

int *p = intArray;

Then you increment p:

p++;

What you would expect, because p starts at intArray[0], is for the incremented value of p to be intArray[1]. That's why pointer arithmetic works like that. See the code here.

Jeff Kelley
4

"When you subtract two pointers, as long as they point into the same array, the result is the number of elements separating them"

Check for more here.

Praveen S
  • nice quote :) i want its explanation ;P –  Jul 13 '10 at 15:13
  • 2
    @fahad: because pointers do not point to bytes necessarily, they point to objects of the type used in defining them. Pointer arithmetic is done in terms of numbers of those objects too. – JeremyP Jul 13 '10 at 15:17
  • 1
    @fahad - Probably you may want to tell us what is the size of basic datatypes according to you? int, char and float to start with. Based on that you can dig into pointer arithmetic which is different than normal arithmetic. (not fully but it has its rules ;)) – Praveen S Jul 13 '10 at 15:25
2

The way pointer subtraction behaves is consistent with the behaviour of pointer addition. It means that p1 + (p2 - p1) == p2 (where p1 and p2 are pointers into the same array).

Pointer addition (adding an integer to a pointer) behaves in a similar way: p1 + 1 gives you the address of the next item in the array, rather than the next byte in the array - which would be a fairly useless and unsafe thing to do.

The language could have been designed so that pointers are added and subtracted the same way as integers, but it would have meant writing pointer arithmetic differently, and having to take into account the size of the type pointed to:

  • p2 = p1 + n * sizeof(*p1) instead of p2 = p1 + n
  • n = (p2 - p1) / sizeof(*p1) instead of n = p2 - p1

So the result would be code that is longer, and harder to read, and easier to make mistakes in.

mwfearnley
1

When applying arithmetic operations on pointers of a specific type, you always want the resulting pointer to point to a "valid" (meaning the right step size) memory-address relative to the original starting-point. That is a very comfortable way of accessing data in memory independently from the underlying architecture.

If you want to use a different "step-size" you can always cast the pointer to the desired type:

int a = 5;
int* pointer_int = &a;
double* pointer_double = (double*)pointer_int; /* totally useless in that case, but it works */
das_weezul
  • 1
    "totally useless" correct! "but it works" maybe on your compiler, today... **but** the standard doesn't require anything except UB for dereferencing a (non-`[unsigned] char`) pointer on memory really allocated for an unrelated type. also if `double` has stricter alignment than `int`, then this could generate a trap (or again, any other UB). – underscore_d Apr 16 '16 at 22:14
0

@fahad Pointer arithmetic goes by the size of the datatype it points to. So when your pointer is of type int*, you should expect pointer arithmetic in units of the size of int (4 bytes). Likewise, for a char pointer all operations on the pointer will be in terms of 1 byte.

The Stig
  • 1
    `size of int(4 bytes)` The size of `int` is platform dependent, it doesn't have to be 4 bytes. Actually on the system I'm using an `int` is 2 bytes long. Check this article for instance [size of long int on different architectures](http://software.intel.com/en-us/articles/size-of-long-integer-type-on-different-architecture-and-os) – H_squared Oct 10 '13 at 12:26