1

How does the compiler manage memory when you pass a string literal as a function argument instead of a pointer to an array of chars?

Example:

static const char myString[LENGTH] = "A string";
myFunction(myString);

and:

myFunction("A string");

Does passing a static const array (which will most likely be stored in ROM) via a pointer yield significant benefits regarding RAM usage?

When passing the string literal, is it copied entirely into a local variable of size sizeof(myString), or does the compiler "know" to pass it by reference, since arrays are always passed by reference in C?

Shafik Yaghmour
Asics

6 Answers

4

The standard does not dictate how a string literal is stored, or even whether identical string literals used in different parts of the program share storage. It says only that they have static storage duration and that modifying them is undefined behavior. Otherwise, a string literal is just an array of characters and behaves accordingly.

This is covered in the draft C99 standard section 6.4.5 String literals:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char,

and:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

When a string literal is used to initialize myString, its contents are copied into the memory allocated for myString. This is covered in section 6.7.8 Initialization, which says:

An array of character type may be initialized by a character string literal, optionally enclosed in braces. Successive characters of the character string literal (including the terminating null character if there is room or if the array is of unknown size) initialize the elements of the array.

Shafik Yaghmour
4

When passing the string literal, is it copied entirely into a local variable of size sizeof(myString), or does the compiler "know" to pass it by reference, since arrays are always passed by reference in C?

A string literal is stored as an array such that it's available over the lifetime of the program, and is subject to the same conversion rule as any other array expression; that is, except when it is the operand of the sizeof or unary & operators or is a string literal being used to initialize an array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element in the array. Thus, in the calls

myFunction( mystring );

and

myFunction( "A string" );

both of the arguments are expressions of array type, neither is the operand of the sizeof or unary & operators, so in both cases the expression decays to a pointer to the first element. As far as the function call is concerned, there's absolutely no difference between the two.

So let's look at a real-world example (SLES 10, x86_64, gcc 4.1.2):

#include <stdio.h>

void myFunction( const char *str )
{
  printf( "str = %p\n", (void *) str );
  printf( "str = %s\n", str );
}

int main( void )
{
  static const char mystring[] = "A string";
  myFunction( mystring );
  myFunction( "A string" );

  return 0;
}

myFunction prints out the address and contents of both the string literal and the mystring variable. Here are the results:

[fbgo448@n9dvap997]~/prototypes/literal: gcc -o literal -std=c99 -pedantic -Wall -Werror literal.c
[fbgo448@n9dvap997]~/prototypes/literal: ./literal
str = 0x400654
str = A string
str = 0x40065d
str = A string

Both the string literal and the mystring array are being stored in the .rodata (read-only) section of the executable:

[fbgo448@n9dvap997]~/prototypes/literal: objdump -s literal
...
Contents of section .rodata:
 40063c 01000200 73747220 3d202570 0a007374  ....str = %p..st
 40064c 72203d20 25730a00 41207374 72696e67  r = %s..A string
 40065c 00412073 7472696e 6700               .A string.
...

The static keyword in the declaration of mystring tells the compiler that the memory for mystring should be set aside at program start and held until the program terminates. The const keyword says that memory should not be modifiable by the code. In this case, sticking it in the .rodata section makes perfect sense.

This means that no additional memory is allocated for mystring at runtime; it's already allocated as part of the image. In this particular case, for this particular platform, there's absolutely no difference between using one or the other.

If I don't declare mystring as static, as in

int main( void )
{
  const char mystring[] = "A string";
  ...

then we get:

str = 0x7fff2fe49110
str = A string
str = 0x400674
str = A string

meaning that only the string literal is being stored in .rodata:

Contents of section .rodata:
 40065c 01000200 73747220 3d202570 0a007374  ....str = %p..st
 40066c 72203d20 25730a00 41207374 72696e67  r = %s..A string
 40067c 00                                   .

Since it's declared local to main and not declared static, mystring is allocated with auto storage duration; in this case, that means memory will be allocated from the stack at runtime, and will be held for the duration of mystring's enclosing scope (i.e., the main function). As part of the declaration, the contents of the string literal will be copied to the array. Since it's allocated from the stack, the array is modifiable in principle, but the const keyword tells the compiler to reject any code that attempts to modify it.

John Bode
2

Most likely there will be no difference in memory usage.

In both cases, the string will be stored in some static location, and the compiler will simply pass a pointer to it one way or another. For you, this means memory usage is exactly the same.

If several identical strings are referenced, the second case (string literals) might be more efficient: the compiler can store the string once and use a single address. The first case has to allocate a separate memory location for each array, even though the contents are identical.

mvp
2

There is no difference in how a string literal is stored in memory, regardless of whether you use the const or static qualifiers, or whether you pass it directly as a function argument rather than through an intermediate variable/pointer: it is always stored in the code segment. The optimizer will also replace references to your intermediate pointer, MY_STRING, with the address of the literal itself.

Examples are shown below:

Example:                       Allocation Type:   Read/Write:  Storage Location:
============================================================================
const char* str = "Stack";     Static             Read-only    Code segment
char* str = "Stack";           Static             Read-only    Code segment
char* str = malloc(...);       Dynamic            Read-write   Heap
char str[] = "Stack";          Automatic          Read-write   Stack
char strGlobal[10] = "Global"; Static             Read-write   Data Segment (R/W)

References

  1. Difference between declared string and allocated string, Accessed 2014-07-31, <https://stackoverflow.com/questions/16021454/difference-between-declared-string-and-allocated-string>
Cloud
0

In C, arrays are indeed passed by reference. String literals are no exception.

MSalters
  • 2
    Nope. The address of the array is passed by **value**. http://stackoverflow.com/questions/4774456/pass-an-array-to-a-function-by-value If you modify the argument passed to the function as a formal function parameter, you don't actually modify the argument, just a **copy** of the value. So, in short, arrays are not passed by reference. A reference to the array is passed by value. – Cloud Jul 31 '14 at 21:38
  • @Dogbert: Tomato, tomato. Passing the address by value **is** passing the array by reference. You **can** change the array values, which proves it's by reference. – MSalters Jul 31 '14 at 23:35
  • You are still wrong. http://stackoverflow.com/questions/10240161/reason-to-pass-a-pointer-by-reference-in-c Passing an address by value is not equivalent to passing an address by reference, as the former just pushes a copy of the argument on the stack, while the latter allows direct manipulation of the argument. This is lazy vernacular used in reference to C since it doesn't actually allow passing by reference like in C++. The use of "reference" is not analogous between the two languages. – Cloud Aug 01 '14 at 15:04
  • @Dogbert: Read again. I wrote "pass array by reference". Not "pass address by reference". Passing an address by reference would be `int**`. – MSalters Aug 01 '14 at 15:48
  • But the bigger issue you're missing is that you can't pass arrays by reference in C (http://stackoverflow.com/questions/2188991/what-is-useful-about-a-reference-to-array-parameter). Just a pointer to the beginning address of the array. To pass an array by reference prevents pointer decay (ie: `sizeof` still works), and such functionality does not exist in C. By passing an array as an argument, it decays to passing an address by value. The entire concept of "passing by reference" has no place in C, it is just lazy linguistics. http://stackoverflow.com/questions/1461432/what-is-array-decaying – Cloud Aug 01 '14 at 16:22
  • @Dogbert: Array passing in C was known as _pass by reference_ before C++ even existed. Array decay is just _how_ it's done. – MSalters Aug 01 '14 at 20:56
  • I haven't heard that usage of the term in 30 years, nor is there a single canonical reference to it in K&R's original books or writings. – Cloud Aug 01 '14 at 21:06
0

A static string will be stored once in RAM. A string literal will be stored once in RAM. When passed to a function, either the static location will be loaded and passed as the argument or the literal location will be passed. There is no difference as far as storage goes.

However, keeping your string literals together in one place and referencing them will be far more maintainable than spreading them throughout your code.

eholley