6

I've read the following post:

Is returning a heap-allocated pointer from function OK?

It shows that returning a pointer to a heap-allocated variable is alright. However, isn't the pointer itself technically a "stack-allocated variable", which would then get deallocated when the function returns?

For example:

int* test(){
  int arr[5];
  int *ptr = arr;

  return ptr; //deallocated ptr?
}

int *test2(){
  int arr[5];

  return arr;
}

In test

Also, is it right to say arr is a pointer that points to some newly created int array, i.e. that it points at &arr[0]? If arr is not a pointer, why is it valid to return it where the function's return type is int*?

Since both ptr and arr are supposedly stack allocated, why does the code only work in test() and not test2()? Does test() have undefined behavior?

  • 4
    They're both undefined behavior. And as it is with UB, it might sometimes "work" as expected, and not work at other times. – Blaze Apr 09 '19 at 12:14
  • 3
    "Is it right to say arr is a pointer" No! An array is *not* a pointer. But it can *decay* to a pointer to its first element. So if you use `arr` in a context where a pointer is expected, the compiler will translate it into `&arr[0]`. – Some programmer dude Apr 09 '19 at 12:14
  • 1
    The fact that the compiler does not complain about it does not mean it is correct. – Paul Ogilvie Apr 09 '19 at 12:16
  • The memory of arr[5] is available again for the compiler after you returned from that function. So there's no use of having a pointer to something that will be overwritten soon -> don't do this. – Axel Podehl Apr 09 '19 at 12:16
  • "allowed in C": well, this is low level, so (almost) everything is allowed. But this produces a warning; use an advanced IDE like Eclipse or CLion and enable all compiler warnings with the `-Wall` flag (if compiling with gcc). – Igor Galczak Apr 09 '19 at 12:16
  • By the way, a good compiler with the right flags set, would be able to warn about both the examples you show. – Some programmer dude Apr 09 '19 at 12:18
  • You allocated the data on the stack. Heap memory is allocated only by functions like `malloc`, `calloc`, `realloc`. But the address of the allocated memory is stored on the stack. – Igor Galczak Apr 09 '19 at 12:18
  • It is not allowed. The question is invalid. – n. 'pronouns' m. Apr 09 '19 at 12:19
  • If arr is not a pointer, why does it work like a pointer when doing array access? e.g. `arr[5]` behaves more like some `ptr[5]` rather than `&arr[0][5]`, which is syntactically incorrect. Is a pointer to a variable automatically created every time a variable is instantiated? – oldselflearner1959 Apr 09 '19 at 12:30
  • @PaulOgilvie Modern compilers will complain about both functions unless you silence them (by setting an insufficient warning level). – Peter - Reinstate Monica Apr 09 '19 at 12:49
  • @oldselflearner1959 `&arr[0][5]` is, arguably, *syntactically* correct; it's just that it is evaluated as `&((arr[0])[5])`, i.e. it tries to index `arr[0]`, which is neither an array nor a pointer, so you don't get a grammatical but a *type* error. You *can* very well say `(&arr[0])[5]` if you want to get heat in a code review ;-). – Peter - Reinstate Monica Apr 09 '19 at 12:53
  • @PeterA.Schneider...and I wish that the warning level is set to max by default so many posts here are not necessary because the compiler caught the errors already... – Paul Ogilvie Apr 09 '19 at 13:11
  • An array *decays* to a pointer in certain contexts, but an array *isn't* a pointer. – n. 'pronouns' m. Apr 10 '19 at 05:00

5 Answers

6

They will both be undefined behaviour, if the returned value is accessed. So, none of them are "OK".

You're trying to return a pointer to a block-scoped variable which is of auto storage duration. So, once the scope ends, the lifetime of the variable comes to an end.

Quoting C11, chapter §6.2.4/P2, regarding the lifetime (emphasis mine)

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime. If an object is referred to outside of its lifetime, the behavior is undefined [...]

Then, from P5,

An object whose identifier is declared with no linkage and without the storage-class specifier static has automatic storage duration, [...]

and

For such an object that does not have a variable length array type, its lifetime extends from entry into the block with which it is associated until execution of that block ends in any way. [...]

So, in your case, the variable arr has automatic storage duration and its lifetime is limited to the function body. Once the address is returned to the caller, any attempt to access the memory at that address is UB.

Oh, and there's no "stack" or "heap" in the C standard. All we have is the lifetime of a variable.
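
To make the lifetime rules above concrete, here is a minimal sketch of three ways to hand an array back to a caller without relying on an object whose lifetime has ended. The function names (`make_static`, `make_allocated`, `fill`) are made up for this illustration and are not from the question.

```c
#include <stddef.h>
#include <stdlib.h>

/* Static storage duration: the array exists for the whole program run,
   so the returned address stays valid (but every caller shares it). */
int *make_static(void) {
    static int arr[5] = {1, 2, 3, 4, 5};
    return arr;                 /* arr decays to &arr[0] */
}

/* Allocated storage duration: the block exists until free() is called.
   Returns NULL if the allocation fails; the caller owns the result. */
int *make_allocated(void) {
    int *p = malloc(5 * sizeof *p);
    if (p != NULL) {
        for (int i = 0; i < 5; i++) {
            p[i] = i + 1;
        }
    }
    return p;
}

/* Caller-owned storage: the function only fills the buffer it is given,
   so the lifetime question is settled by the caller. */
void fill(int *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = (int)i + 1;
    }
}
```

Which one fits depends on the situation: the static version is not reentrant, the allocated version needs a matching `free()`, and the caller-owned version needs the caller to know the size up front.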

Sourav Ghosh
  • The C standard is about the only place where there is no stack or heap ;-). – Peter - Reinstate Monica Apr 09 '19 at 12:16
  • 1
    So we should rename this site to "thingyOverflow.com" :-) – Paul Ogilvie Apr 09 '19 at 12:17
  • 1
    @PeterA.Schneider and, so? – Sourav Ghosh Apr 09 '19 at 12:19
  • @PaulOgilvie I don't believe I got a say in that matter. :) – Sourav Ghosh Apr 09 '19 at 12:19
  • Can you explain more on why there's no stack or heap? There are quite a few texts I saw saying there are regions in C memory (e.g. https://www.geeksforgeeks.org/memory-layout-of-c-program/), so I'm a bit confused by that. If there's only lifetime, does it mean all variables are technically on some universal memory with some TTL? – oldselflearner1959 Apr 09 '19 at 12:27
  • @oldselflearner1959 Because the C standard does not mention implementation details. Also, I added some more clarification in my answer to help you understand better. – Sourav Ghosh Apr 09 '19 at 12:32
  • Also, what is a good design pattern to follow if we want to return a new pointer in a function? because currently it seems to me any function that returns a pointer variable would technically return a stack deallocated pointer if it was created within the function. Should we always create the pointer before using a function? – oldselflearner1959 Apr 09 '19 at 12:33
  • @oldselflearner1959 No, you can certainly return a pointer to a variable which has `static` storage duration. For example, a `static char arr[5]` can be returned with `return arr;` and can be used in the caller. Also, the memory allocator functions give you memory whose lifetime lasts until it is deallocated with `free()`. – Sourav Ghosh Apr 09 '19 at 12:35
  • This is very interesting. I just tried out static and the warning goes away. In this case, what would work best: A) returning a pointer to static char (ptr probably gets deallocated again) B) returning static char `arr`? In B, why does returning arr work when arr is supposedly not a pointer and the function requires us to return an `int*`? – oldselflearner1959 Apr 09 '19 at 12:44
  • @oldselflearner1959 No, technically they are in a register or on the stack or the heap or in ROM and/or in some place within the executable, or wherever. The C standard does not mandate any of these because these are implementation details; it gives implementers the freedom to use what they see fit. But all mainstream implementations since Algol60 use a stack, technically, because it is such an easy way to implement reentrant functions and other nested blocks. Mainstream CPUs even have hardware support for it (a stack pointer), and the C ABI defines how it's handled. – Peter - Reinstate Monica Apr 09 '19 at 12:45
  • @oldselflearner1959 `when arr is supposedly not a pointer`..surprise..it is!! In many of the cases, array type boils down to the pointer to the first element type. – Sourav Ghosh Apr 09 '19 at 12:46
  • @SouravGhosh That makes sense, but why do so many people keep saying arr is not a pointer? See someprogrammerdude's comments above saying it's not a pointer. What does 'decaying' to a pointer really mean? – oldselflearner1959 Apr 09 '19 at 12:51
  • @oldselflearner1959 and I say the same, `arr` **is not** a pointer. At times, it decays to a pointer, but that does not mean it is a pointer. – Sourav Ghosh Apr 09 '19 at 12:54
  • @oldselflearner1959 You can check more here: [What is array decaying?](https://stackoverflow.com/q/1461432/2173917) – Sourav Ghosh Apr 09 '19 at 12:55
  • @oldselflearner1959 *what is a good design pattern* You mean [Maslow's patterns](https://en.wikipedia.org/wiki/Law_of_the_instrument)? There's no such thing as a "good design pattern". Learn how to solve actual problems, not how to force-fit them into preconceived patterns. – Andrew Henle Apr 09 '19 at 13:30
  • Technically, we do not know what will happen if “the returned value is accessed”—we do not know that the resulting behavior will not be defined by the C standard. When the pointed-to object ceases to exist, the value of a pointer to it becomes *indeterminate*. This means it is either a trap representation or an unspecified value. For the latter, we do not know it points to the previous memory anymore. It could happen to point to some valid object! A fine distinction, as the programmer cannot rely on anything, but it is not what the C standard defines as undefined behavior. – Eric Postpischil Apr 09 '19 at 13:33
2

Both test() and test2() are equivalent. They return an implementation-defined pointer that you must not dereference, or else UB ensues.

If you don't dereference the returned pointer, calling test() or test2() does not result in undefined behavior, but such a function is probably not very useful.

PSkocik
  • 2
    Dereferencing the returned pointer is not the only hazard; merely using it could be a problem. Per the C standard, when an object ceases to exist, the value of a pointer to it becomes *indeterminate*. This means it could be a trap representation or an unspecified value. – Eric Postpischil Apr 09 '19 at 13:36
1

Upon entering a function a new stack frame is added to the stack. The stack frame is where all autos (non static variables declared in the function) are stored. When we leave the function the return value is placed in a register (generally R0) in the CPU and the stack pointer is then decreased to remove the stack frame. We then return control to the point where we called the function and we get the return value from the register.

So in this case you have int arr[5]. As the program enters the function, a new stack frame is added to the stack. In this stack frame there is memory for 5 integers in an array, and the variable arr is now effectively equivalent to a pointer to the first element of the array. When you return the variable arr, you are returning a pointer to the data in the stack frame. When the function exits and you return to the previous function, the stack pointer is decreased to remove the stack frame of the function you just exited.

The pointer is still pointing to that place in memory where we previously had an array allocated. So when the stack grows again, the memory arr is pointing to will be overwritten. Changing the data the returned value points to could result in some very "exciting" stuff happening, as we don't know what the memory is now used for.

Array vs pointer example:

char arr[5];
char * ptr = arr;

In this case the compiler knows the size of arr and does not know the size of ptr so we can do sizeof(arr) and the compiler will do the calculation at compile time. When it comes to run time, they are equivalent values in memory.
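
As a rough illustration of the difference described above, the sketch below compares what `sizeof` reports for the array and for a pointer initialized from it. The helper names (`array_size`, `pointer_size`, `step_of_array_pointer`) are invented for this example.

```c
#include <stddef.h>

/* sizeof applied to the array gives the size of the whole object:
   5 elements of 1 byte each. */
size_t array_size(void) {
    char arr[5];
    return sizeof arr;
}

/* sizeof applied to the pointer gives the size of a pointer object;
   the length of the array it came from is not part of its type. */
size_t pointer_size(void) {
    char arr[5];
    char *ptr = arr;            /* arr decays to &arr[0] here */
    return sizeof ptr;
}

/* &arr has type char (*)[5], so adding 1 to it steps over the whole
   array, whereas ptr + 1 would step over a single char. */
size_t step_of_array_pointer(void) {
    char arr[5];
    return (size_t)((char *)(&arr + 1) - (char *)&arr);
}
```

The array keeps its own type and size; only the pointer value produced by the decay matches `ptr`.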

liamcomp
  • Thanks for the reply. However, isn't the pointer technically a stack allocated variable that will disappear after the function return? So how does the arr still point to the same region of memory after returning? – oldselflearner1959 Apr 09 '19 at 12:55
  • A pointer is really just a number, the number is used to specify a location in memory. So when you return the pointer, the value of the pointer (a number referring to a memory location) is copied and given to the function that called it, same as returning any other stack allocated variable. The returned value is still pointing to the same memory location as before even though we don't know what the memory is now being used for. – liamcomp Apr 09 '19 at 13:07
  • The value of arr has not changed, but the memory it points to is now unallocated. You could try printing arr as an integer inside the function and printing the value returned by the function; it should be the same in both cases: a large number (64-bit if you have a 64-bit computer). `printf("%lu", (unsigned long) arr)` – liamcomp Apr 09 '19 at 13:11
  • C semantics should not be explained in terms of a hardware stack or specific registers. No, results are not generally returned in `R0`; how results are returned varies by ABI, and some processors either do not have a register designated `R0` or `R0` is a special register not suitable for such use. Furthermore, per the C standard, it is not true that a pointer is “still pointing to that place in memory.” Once an object ceases to exist, the value of a pointer to it becomes *indeterminate* (C 2018 6.2.4 2). – Eric Postpischil Apr 09 '19 at 13:44
  • Sorry, my background is in embedded systems, specifically Cortex-M series processors. I'll take your comments into account in future responses and pay attention to what is hardware specific and what is general to the C standard. – liamcomp Apr 09 '19 at 13:51
0

Both cases are technically the same.

In both cases, a pointer to arr is returned. While the value of the returned pointer indeed points to the memory that used to contain arr, arr is already freed from the memory.

Therefore, sometimes when you access the pointer you will still find the contents of arr there, because they just happen not to have been overwritten yet. Other times, you might access it after this memory has been overwritten, and get undefined data or even a segmentation fault.

Bartolinio
  • "*a pointer to arr is returned*" well, not exactly. A pointer to `arr`'s 1st element is returned, which type-wise is not the same as a "*pointer to `arr`*". – alk Apr 09 '19 at 12:29
  • Re: “While the value of the returned pointer indeed points to the memory that used to contain arr, arr is already freed from the memory.”: Per C 2018 6.2.4 2, “The value of a pointer becomes indeterminate when the object it points to (or just past) reaches the end of its lifetime.” – Eric Postpischil Apr 09 '19 at 13:45
0

You still seem residually confused by the pointer being an automatic variable as well, so that you are afraid that returning it would be invalid even if it pointed to some valid memory (say, a static array).

It is important to remember that in C all parameter and return value passing is done by value. If you "return a pointer" as in return p; it is exactly the same mechanism as if you "returned an integer", as in return i;: the value of the variable is copied somewhere and obtained by the caller. In the case of i that value may be 42; in the case of p the value may be 3735928559 (or in other words, 0xdeadbeef). That value denotes the place in memory where e.g. your array was residing before it ceased to exist because the function returned. The address does not change when you copy it any more than 42 changes, and is completely independent of the lifetime of the variable p which once contained it; it was, after all, copied out of it just in time. [1]


[1] This is beyond the scope of the question, but technically, or at least conceptually, a temporary object is created for the return value. The lifetime and semantics of temporaries are more systematically categorized in modern C++.
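
A small sketch of the by-value point, assuming the pointed-to array has static storage duration so that it is still alive in the caller. The names `table`, `get_table` and `get_answer` are hypothetical and only serve the illustration.

```c
/* Static storage duration: table outlives every call to get_table(). */
static int table[3] = {10, 20, 30};

/* p is an automatic variable. Its lifetime ends when the function
   returns, but its value (an address) has been copied out by then. */
int *get_table(void) {
    int *p = table;
    return p;
}

/* Exactly the same mechanism with an int: the value 42 is copied out,
   and the variable i that held it is gone afterwards. */
int get_answer(void) {
    int i = 42;
    return i;
}
```

Whether the returned address is usable depends only on the lifetime of what it points to, not on the lifetime of `p`.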

Peter - Reinstate Monica
  • No, technically, no temporary object is created for the return value. C does not have such temporary objects. Values are their own things, independent of objects, and there is no notion in the C standard for treating the thing being returned as a temporary object. Also, once the pointed-to object ceases to exist, the value of the pointer becomes indeterminate; it does not necessarily point to the memory it once did. – Eric Postpischil Apr 09 '19 at 13:39
  • @EricPostpischil Hmmm... interesting. I'd still maintain that conceptually a temporary object is returned; it's just that they are all trivially copyable in C, so that is what the compiler does. But in the sense that you cannot take an address of it (even though the object may have one, if it is in RAM) it is not an object, true. – Peter - Reinstate Monica Apr 09 '19 at 14:19