
I've been digging into memory allocation and pointers in C. I was under the impression that if you do not allocate enough memory for a value and then try to put that value in that memory cell, the program would either crash or behave incorrectly. But what I get is a seemingly correct output where I'd expect something else.

#include <stdio.h>
#include <stdlib.h>

int main()
{
    // Here we intentionally allocate only 1 byte,
    // even though an `int` typically takes up 4 bytes
    int * address = malloc(1);

    address[0] = 16777215; // 0xFFFFFF: this value needs 3 bytes, so it cannot fit in 1 byte.
    address[1] = 1337; // just for demo, let's put a random other number in the next memory cell.

    printf("%i\n", address[0]); // Prints 16777215. How?! Didn't we overwrite a part of the number?

    return 0;
}

Why does this work? Does malloc actually allocate more than the number of bytes that we pass to it?

EDIT

Thanks for the comments! But I wish to note that being able to write to unassigned memory is not the part that surprises me and it's not part of the question. I know that writing out of bounds is possible and it is "undefined behavior".

For me, the unexpected part is that the line address[1] = 1337; does not in any way corrupt the int value at address[0].

It seems that the explanations for this diverge, too.

  • @Mini suggests that the reason for this is that malloc actually allocates more than what's passed, because of cross-platform differences.

  • @P__J__ in the comments says that address[1] points sizeof(int) bytes ahead of address[0], not to the next byte. But I don't think I understand what controls this behavior, because malloc doesn't seem to know what types we will put into the allocated blocks.

EDIT 2

So thanks to the comments, I believe I understand the program behavior now. The answer lies in pointer arithmetic. The compiler "knows" that address is of type int *, and therefore adding 1 to it (or accessing address[1]) gives the address of the block that lies sizeof(int) bytes ahead (4 bytes on my machine).

And if we really wanted, we could move just one byte ahead and really corrupt the value at address[0] by casting address to char *, as described in this answer.

Thanks to all and to @P__J__ and @Blastfurnace in particular!

timetowonder
  • Welcome to the wild and wacky world of c development, where it's possible to write into any part of memory even if that is not what you intended. – Robert Harvey Mar 08 '19 at 20:10
  • C Is like a hotel where you're assigned a room but none of the rooms are locked. You may notice the adjacent room isn't being used by anyone and put your stuff in there, too, but don't be surprised when someone shows up and cleans it out because it's not your room and shouldn't have anything in it. – tadman Mar 08 '19 at 20:17
  • Compile with GCC and `-fsanitize=address` and it crashes. Then compile with GCC, `-O3` and `-fsanitize=address` and it stops crashing. Since the *behaviour is undefined*, in the `-O3` case the compiler produces a program that works *as if* the correct amount of memory was allocated... – Antti Haapala Mar 08 '19 at 20:26
  • This is UB, but address[1] does not point to the next byte (char); it points to the next int. So writing to address[1] writes to the location sizeof(int) bytes past address, not one byte ahead, and therefore cannot overwrite the previous object. You invoke UB only by writing to unallocated memory; you have no chance of overwriting the previous object. You need to learn about pointers and pointer arithmetic. – 0___________ Mar 08 '19 at 20:26
  • @timetowonder "Why does this work?" --> Did you expect the code to reliably detect the out of array range usage? What "something else." did you expect? – chux - Reinstate Monica Mar 08 '19 at 20:38
  • Thanks for the comments! I tried to clear up what exactly confuses me in the edit. – timetowonder Mar 08 '19 at 21:10
  • `malloc` doesn't know about types but `address[1]` absolutely knows about types. `address` is an `int*` and dereferencing it gives an `int`. The address arithmetic, in this case, works in whole `int` units: `address[1]` is `sizeof(int)` bytes past `address[0]`. – Blastfurnace Mar 08 '19 at 21:19
  • @Blastfurnace Oh, okay, this makes sense, too. Thank you. – timetowonder Mar 08 '19 at 21:24

2 Answers


malloc often allocates more than you actually ask for (this depends on the system, environment, and OS), which is why it works in your scenario (sometimes). However, this is still undefined behavior: malloc may actually allocate only 1 byte, in which case you are writing to heap memory that may not be allocated.

Mini

C doesn't mandate any kind of bounds checking on array accesses, and it's possible to overflow storage and write into memory you don't technically own. As long as you don't clobber anything "important", your code will appear to work as intended.

However, the behavior on buffer overruns is undefined, so the results will not generally be predictable or repeatable.

John Bode