C, Malloc, Pointers and Context of Execution

Question

Edit : Read this first : https://stackoverflow.com/a/8800541/14795595

I have this code :

#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct{
  double x;
  double y;
} point;

point *inserer_point( unsigned *len, point chemin[], point p, unsigned pos ){
  assert( pos <= *len );
  printf("%u",*len);

  if( chemin == NULL )
    assert( *len == 0 && pos == 0 );

  chemin = realloc( chemin,  (*len + 1) * sizeof( point ) );
  assert( chemin );

  memmove( chemin + pos + 1, chemin + pos, sizeof(point)*( *len - pos ) );
  chemin[pos] = p;
  (*len)++;

  return chemin;
}

int main(){
  point *c=NULL;
  unsigned l = 0;

  c = inserer_point( &l, c, (point){.x = 4, .y = 6}, 0);
  c = inserer_point( &l, c, (point){.x = 5, .y = 7}, 0);
  c = inserer_point( &l, c, (point){.x = 6, .y = 8}, 2);
  c = inserer_point( &l, c, (point){.x = -7, .y = -9}, 1);
  c = inserer_point( &l, c, (point){.x = -4, .y = -6}, 4);
  c = inserer_point( &l, c, (point){.x = -44, .y = 9}, 4);
  c = inserer_point( &l, c, (point){.x = -444, .y = -69}, 2);
         
}

As you can see, l is declared in main without a malloc, a calloc or a realloc. Which means it is declared in stack. And we don't have control over it.

It should be read only and can only be modified in the context of execution (in the main function).

However, we send a pointer to l in the other function as *len.

And then we increment *len ((*len)++) at the bottom of the function.

As I said, it should not be possible since it is not on the heap and should be read only.

But this code works and len gets incremented.

Am I wrong about memory access ? What did I not get ? Thank you !

EDIT 2:

This is pretty similar but produces a segmentation fault. Why ?

#include <stdio.h>

void disp (int t[], int a, int b) {
    for (int i = a; i < b - 1; i++) {
        printf ("%d, ", t[i]);
    }
    printf("%d\n", t[b - 1]);
}

int *build (int a, int n) {
    int t[n];
    for (int i = 0; i < n; i++) {
        t[i] = a + i;
    }
    printf ("t : %p : ", (void *)t);
    disp (t, 0, 15);
    return t;
}


int main(void){
    printf ("\nbuild tab\n");
    int *t = build (0, 15);
    printf ("tab : %p\n", (void *)t);
    disp (t, 0, 15); // SEG_FAULT!
    return 0;

}

*"It should be read only and can only be modified in the context of execution (in the main function)."* - whatever gave you that idea? There is no restriction that would prevent you from passing the address of a variable to a different function — UnholySheep, May 26 '21 at 16:29
@UnholySheep I learned that variables not using malloc are stored in stack. And we can't manage stack except in the context of execution. Or is this only for pointers ? — Rayan Dev, May 26 '21 at 16:31
Automatic ("stack") variables like your `l` are not read-only: you can modify `l` in your main using `l =42;` but also by taking their pointer `unsigned int *p = &l;` and then `*p = 42;`. You can also pass the pointer to other functions, and use the pointer, until the automatic variable ends its life. — chi, May 26 '21 at 16:32
@chi Is it recommended to do so ? Do we know when the automatic variable ends its life ? Is it better to malloc before passing to other functions ? — Rayan Dev, May 26 '21 at 16:37
Life ends when the end-of-block `}` is met, for automatic variables, or when `free(p)` is called, for dynamically allocated objects. The latter is less predictable, in general. When possible, as in this case, using automatic variables is simpler. Don't use `malloc` just because you need a pointer, use it e.g. when you need the object to outlive the current function, or you need a rather large object which won't fit the stack, etc. — chi, May 26 '21 at 16:47
For example here, "l" needs to outlive the current function, ie main no ? So I should use malloc ? — Rayan Dev, May 26 '21 at 16:51
@RayanDev No. It'll remain in existence throughout the program. I'd suggest you to use a debugger and see the stack frames. — Shubham, May 26 '21 at 16:57
See this: [What gets allocated on the stack and the heap?](https://software.codidact.com/posts/277536) — Lundin, May 26 '21 at 17:10
"This code works" is debatable. `assert()` is not for handling (expected) runtime errors, only bugs. — Deduplicator, May 26 '21 at 17:14

Vlad from Moscow · Answer 1 · 2021-05-26T16:40:49.517

1

You passed the object l by reference to the function inserer_point.

c = inserer_point( &l, c, (point){.x = 4, .y = 6}, 0);
                   ^^

In C passing by reference means passing an object indirectly through a pointer to it.

So by dereferencing the pointer within the function, you have direct access to the pointed-to object and can change it.

Here is a simple demonstrative program.

#include <stdio.h>

void f( int *px )
{
    *px = 20;
}

int main(void) 
{
    int x = 10;
    
    printf( "Before calling f x is equal to %d\n", x );
    
    f( &x );
    
    printf( "After  calling f x is equal to %d\n", x );

    return 0;
}

The program output is

Before calling f x is equal to 10
After  calling f x is equal to 20

That is, it does not matter where an object is defined (allocated). You can change it through a pointer: dereferencing the pointer gives you access to the memory where the object resides.

edited May 26 '21 at 16:40

answered May 26 '21 at 16:35

Vlad from Moscow

I know about this. But the object is in stack, so it's different right ? If I malloc'd the variable before passing by reference, I would have no problem with the code. – Rayan Dev May 26 '21 at 16:39
@RayanDev See my updated post. – Vlad from Moscow May 26 '21 at 16:41
Using tables or pointers, I sometimes get the "Segmentation fault". Why is that ? When is that ? Thank you for the update though – Rayan Dev May 26 '21 at 16:43
@RayanDev Each concrete example of such behavior in your code would need to be examined. But this has nothing in common with your current question. – Vlad from Moscow May 26 '21 at 16:46
@RayanDev Note that if an object has a pointer type, for example int *p;, then to change it in a function you should pass it the same way, by reference through a pointer such as &p. The corresponding function parameter must have the type int **. – Vlad from Moscow May 26 '21 at 16:48

score 1 · Answer 2 · answered May 26 '21 at 16:41

I learned that variables not using malloc are stored in stack. And we can't manage stack except in the context of execution.

It's always difficult to communicate basic concepts when one side makes up words like "context of execution" when things have proper names (closest would be "scope" in this case).

I believe the gap in knowledge here is that the scope of l is the scope it belongs to (i.e. the closest pair of braces, in this case the function main), as well as the scope of every function called from within this scope.

And this isn't an arbitrary rule, it makes sense when you consider that the stack gets expanded as you call functions, and only reduced when you exit functions. Your l is valid until the stack frame that it belongs to is no longer valid, i.e. until you exit main. It gets a little more complicated when you have nested scopes within your function scope, but in this case you do not.

answered May 26 '21 at 16:41

Blindy

Closest answer thank you ! However, I sometimes get a segmentation fault with functions called by main, but referencing a pointer to a table for example. Why is that ? Since the scope is still main. Why do I sometimes have to use malloc and sometimes not ? – Rayan Dev May 26 '21 at 16:48
I've read this https://stackoverflow.com/a/8800541/14795595 that's why I'm having trouble. – Rayan Dev May 26 '21 at 16:49
That answer is telling you the same thing: `malloc` extends the lifetime of data beyond the scope in which it is created, as opposed to stack-allocated data, which is gone as soon as its scope is done. As to your question, I couldn't tell you without seeing the code you're referring to. – Blindy May 26 '21 at 16:54
@RayanDev That'll depend on the case: how you defined, declared, and accessed the variable. In most cases, you'll get a segfault when you try to access a memory location that isn't allocated to your program by the OS. That is, you don't have permission for that memory area, yet you are trying to read it. – Shubham May 26 '21 at 16:54
I think you're failing to distinguish between *scope* and *lifetime*. The *scope* of `l` is the region of program text in which its name is visible. That extends from its definition to the innermost enclosing `}`. The *lifetime* of the object `l` is the time during program execution in which it exists. That begins when execution reaches the opening `{` of the `main` function and ends when execution reaches the closing `}`. The body of `inserer_point` is outside the scope of the name `l`, but the object `l` exists while `inserer_point` is executing. – Keith Thompson May 26 '21 at 17:10

Keith Thompson · Accepted Answer · 2021-05-26T18:31:32.587

The key concepts here are scope and lifetime.

Here's a simpler example:

#include <stdio.h>

void func(int *param) {
    *param = 20;
}

int main(void) {
    int n = 10;
    printf("Before, n = %d\n", n);
    func(&n);
    printf("After, n = %d\n", n);
}

We have an object n of type int defined locally in main. Its storage class is automatic, which typically means it's allocated on the stack.

The scope of the identifier n is the region of program text in which the name n is visible. It extends from the definition of n to the closing } of the main function.

The lifetime of the object named n is the period of time during program execution in which the object exists. It begins when execution enters the main function and ends when main completes.

(The lifetime of an object created by malloc extends from the successful malloc call until the object is deallocated, for example by passing its address to free, or until the program terminates. Such an object has no scope because it has no name; it can only be referred to indirectly.)

Inside the body of func, the name n is out of scope, so if I wrote n = 42; inside func I'd get a compile-time error. The name is not visible. However, while func is executing, the object named n exists, and can be referred to indirectly (though not by its name).

The object n is not read-only. If you wanted it to be, you could define it with the const keyword. You'd also have to define param as const int *param, because it's illegal to pass a pointer to a const object to a function that takes a pointer to a non-const object.

There is no reason to expect the above program (or yours, as far as I can tell) to suffer a segmentation fault, since no objects are accessed outside their lifetimes.

Passing a pointer to an object to a function so the function can modify that object is perfectly valid, and is very common.

It should be read only and can only be modified in the context of execution (in the main function).

That's just incorrect. It's not read-only, and it can be modified at any time during its lifetime. In this case, it's modified via a pointer.

UPDATE: I see you've added code that does produce a segmentation fault. Here's an abbreviated summary of the relevant part:

int *build (int a, int n) {
    int t[n];
    /* ... */
    return t;
}

t is a VLA (variable length array), defined locally in the build function. It has automatic storage duration, meaning that its lifetime ends when build returns. The return t; statement doesn't return the array object; it returns a pointer to its first element. That pointer becomes a dangling pointer as soon as build returns, before the caller (main) can use it. In main you have:

int *t = build (0, 15);

t points to an object that no longer exists.

Your original code did not do anything like that. Your inserer_point function returns a pointer, but it points to memory allocated by realloc, whose lifetime lasts until it is deallocated, so the object still exists when main receives the pointer to it. (And main doesn't do anything with the pointer other than assigning it to an object which is never used.)

C does not support passing arrays as parameters or returning them from functions, but a lot of the syntax makes it look like it does. Read section 6 of the comp.lang.c FAQ.

Thank you for your answer. I updated my code with something that returns a segmentation fault and that seems pretty similar to the first code for me . What's the difference ? — Rayan Dev, May 26 '21 at 17:37

score 1 · Answer 4 · answered May 26 '21 at 17:49

You seem to be confused regarding the difference between the scope and lifetime of an object.

The scope of an object designates where an object can be accessed by its declared name. For a local variable, that starts at the point it is declared until the block containing it ends, and only within that block.

The lifetime of an object designates how long the memory set aside for it is valid. For a local variable, that starts at the beginning of the block where it is declared and ends when that block ends, and includes any functions that may be called within that block.

In your first example, l is a local variable in the main function, so its lifetime starts when main starts and ends when main returns, and is still valid when other functions are called within main. That's why you can pass &l to a function and dereference the pointer safely.

In your second example, t is an array local to the build function. Its lifetime starts when the build function is entered and ends when build returns. You then return t from the function. This actually returns a pointer to the first member of the array. So now your main function has a pointer to the first element of t, but since build has returned, the lifetime of t has ended. That renders the returned pointer indeterminate, and attempting to dereference it triggers undefined behavior, which in your case causes a crash.

score 0 · Answer 5 · answered May 26 '21 at 17:06

As you can see, l is declared in main without a malloc, a calloc or a realloc. Which means it is declared in stack. And we don't have control over it.

That l is declared inside main means that it has automatic storage duration and that the scope of the identifier l ends at the end of main. Whether such a variable lives on the stack, or whether there even is a stack, is a detail of your C implementation. It is true, however, that you don't have control over where it is allocated.

It should be read only

No. I don't see what gives you that idea.

and can only be modified in the context of execution (in the main function).

"can be modified" is inconsistent with "read only", but of course I have already denied your assertion about the object being read only.

Also no: nothing about the declaration of l implies that the object it identifies can be modified only by code in main. The limitation here is that the object can be accessed via its identifier only within the scope of the identifier, which is limited to main. But access via its identifier, if it even has one, is not the only way to access an object.

However, we send a pointer to l in the other function as *len.

You obtain a pointer via the address-of operator: &l. Another way to access an object is via a pointer to it. C does not distinguish between objects with different storage durations in this regard (as long as objects are accessed only during their lifetimes), nor does the scope of an identifier come into it other than for obtaining a suitable pointer in the first place.

Having passed that pointer value to your function, it being received as the value of parameter len, in that function the expression *len designates the same object that l designates in main.

And then we increment *len ((*len)++) at the bottom of the function.

Yes. No problem with that.

As I said, it should not be possible since it is not on the heap and should be read only.

No. Supposing that we stipulate a stack / heap memory arrangement, which indeed is very common, you can obtain a pointer to an object on the stack. That does not move it to the heap, nor make a copy of it on the heap. It just obtains the address of that object, wherever in memory it may be. You would probably be better off forgetting about (this kind of) stack and heap, since again, they are not C language concepts at all.

Moreover, even if you passed a pointer to an object on the heap, there is no reason to think that such an object would be read only.

But this code works and len gets incremented.

Yes.

Am I wrong about memory access ? What did I not get ?

Yes, apparently you are pretty wrong. Stack and heap storage are not C concepts. Pointers can point to any object in the program, stack / heap considerations notwithstanding. Taking the address of an object does not copy or move the object. Nothing about an object being on the heap has anything to do with whether it is read only. Neither does identifier scope.

Thank you. Something I still don't understand is when would I get a segmentation fault if I can access anything stack/heap ? — Rayan Dev, May 26 '21 at 17:14
@RayanDev, you can access (i) any *object*, (ii) *during its lifetime*, (iii) via a valid pointer to it. When you attempt to dereference a pointer under other circumstances -- especially one that was never set to point to an object or where the lifetime of the object to which it once pointed has ended -- undefined behavior occurs, and a segfault is a common manifestation in those cases. Other common mistakes that frequently manifest segfaults include attempting to modify the contents of a string literal or the value of an object declared with the `const` qualifier. — John Bollinger, May 26 '21 at 17:43

score -2 · Answer 6 · answered May 26 '21 at 16:35

C doesn't enforce any memory restrictions. Some compilers may generate warnings if you define a pointer as a const, and then try to modify it but that's about it. You are free to modify the heap/stack/anything, and the language is happy to allow it (although you may get a segmentation fault).

The whole point of languages like Rust is that they provide a C-like environment that is memory safe. If you want memory safety, don't use C.

answered May 26 '21 at 16:35

Clarus

That's the problem. Why am I not getting a segmentation fault ? The code is running properly even though I don't malloc before passing by reference... – Rayan Dev May 26 '21 at 16:40
The C Standard allows compilers to impose severe memory restrictions in cases where doing so would be useful. Because the Standard makes no attempt to judge what is "useful", it also allows compilers to impose such restrictions whenever they see fit, and behave in any manner they see fit--no matter how silly and useless--if such restrictions are violated. – supercat May 26 '21 at 16:41
@supercat Compilers can generate warnings, the programmer can always bypass those warnings through creative coding. – Clarus May 26 '21 at 16:42
@ryan You only get a segmentation fault when you make an illegal access. If you are accessing memory that you as the user have access to, the application will be perfectly happy to do your bidding – Clarus May 26 '21 at 16:43
@Clarus That's my question though. Why do I have access to "l" outside the function when I didn't do any malloc ? – Rayan Dev May 26 '21 at 16:45
@RayanDev You can access (read/modify) your `auto` variables anywhere after they're defined and before they die. That's why you have access to the memory location of `l` in `inserer_point(...)`: `l` is alive until the `return 0;` statement of `main()`, and you can access it anywhere in between. – Shubham May 26 '21 at 16:48
Downvoted for evangelising. – Cheatah May 26 '21 at 16:50
@ryan Stop thinking of it like C has memory restrictions, it doesn't. Compilers only generate warnings, which you can bypass. The OS is what provides restrictions. If you have access to memory from the OS you are good to go. – Clarus May 26 '21 at 16:51
@Cheetah: How is this evangelising? – Clarus May 26 '21 at 16:51
@Clarus: If a compiler determines that a function must receive a certain input in order to have defined behavior, it may generate code that blindly assumes it does. Given `int arr[2][2]; int test(unsigned short n) { for (int i=0; i – supercat May 26 '21 at 17:08
@supercat; Are we speaking a different language or something? – Clarus May 26 '21 at 17:24
@Clarus: Paste the above code into godbolt, select gcc for x64, specify -O2 on the command line, and see what it generates. You'll see two instructions that add 1 to a word in memory, and an instruction that xor's the EAX (which is used to hold the return value) with itself (setting it to zero). The Standard allows compilers to restrict an lvalue expression of the form arr[0][i] to accessing the first row of the array, and gcc makes no attempt to reliably process code that would use such an expression to access more of the array. – supercat May 26 '21 at 17:34
@Clarus: BTW, I think that what the clang and gcc optimizers seek to process should be recognized as a different language from the one described in K&R2, but unfortunately, when people use the term C it's often assumed that they're talking about the former language. – supercat May 26 '21 at 17:44
