How are values copied when passed as arguments?

Question

So when passing arguments by value, the values are copied into the function’s parameters, but how does it work? Are the parameters just declared as regular variables and assigned the values passed as arguments? Like this:

int getx(int z, int x)
{
    int a = z + x;
    return a;
}

int main()
{
    int q = 2;
    int w = 55;
    int xx = getx(w, 2);
    return 0;
}

If so, why is this called copying the value? Isn’t the parameter variable just assigned the value of the variable x? What is copied then?

Um, the value? Eg. z in the function is a copy of w outside, that means eg. changing z won't change w too. — deviantfan, May 11 '17 at 23:31
`int` is easy. A new `int` is made in automatic storage and assigned the value of the passed argument. Things get trickier with objects. Familiarize yourself with the [Rules of Three, Five, and Zero](http://en.cppreference.com/w/cpp/language/rule_of_three) and [assignment operator overloading](http://stackoverflow.com/questions/3279543/what-is-the-copy-and-swap-idiom). The copy-and-swap link is used because it covers the topic well in addition to providing an elegant solution. — user4581301, May 11 '17 at 23:33
I like user4581301 comment about the Rules of Three, Five & Zero, however there is another way to look at the semantics of the compiler for function argument or parameter list and that is by drawing out a "stack frame table" to represent current scope visibility as I have demonstrated in my answer below. — Francis Cugler, May 12 '17 at 04:53

MBurnham · Answer 1 · 2017-05-11T23:57:16.033


Short and fun answer: You should think of variables as boxes that hold a toy. If a function takes a parameter by value and you call the function, you are just telling the function what toy you have in your box. It goes and gets its own toy that is exactly like the one you have (but not your toy), and plays with it. So you don't care what it does to the toy inside the function, because it isn't your actual toy. When you pass by reference, you are actually giving the function the toy out of your own box, and it plays with your toy instead of getting its own.

A longer, more in-depth answer: When you call int xx = getx(w, 2); in main, getx works with chunks of data that hold the same bits as the values you passed, but they are not the same chunks of data. This means that z is just a copy of what was in w at the moment you called the function.

Assume you wrote getx like this (where z is passed in by value):

int getx(int z, int x) {
    int a = z + x;
    z = z + 1;
    return a;
}

In this case, when main calls getx(w, 2), a copy of w goes in as 55, the function increments only that copy, and w is still 55 afterwards.

In contrast, if you were to have this:

int getx(int& z, int x) {
    int a = z + x;
    z = z + 1;
    return a;
}

and then called it the same way you do in your main:

int main()
{
    int q = 2;
    int w = 55;
    int xx = getx(w, 2);
    return 0;
}

In this case, z is passed in by reference (notice the use of int& instead of int). This means you are not using a copy of w's data; getx works on the real w. So after main calls getx(w, 2), w (not a copy) goes in as 55 and comes out as 56.

P.S. Passing by reference is, in my opinion, usually bad practice. You can't tell just by reading getx(w, 2) that w is going to come out differently than it went in.

edited May 11 '17 at 23:57

answered May 11 '17 at 23:48


Passing by reference is not bad practice; there are many cases where you need it. First, when a function must return more than one value and those values are unrelated, so you would not group them in a common struct or class. Second, efficiency, in both memory and speed. Third, when you are passing strings, containers, structs and classes, it is usually better to pass by reference than to copy. Imagine having a vector of 1,000,000 class objects where each instance takes up 256 bytes. Pass it by value and you duplicate 256 MB for a single function call... – Francis Cugler May 12 '17 at 00:05
... continued: now have that same function call that takes a vector of that class or struct and run it through a loop 100,000 times. Now you will see the overhead and memory consumption that leads to bottlenecks and application hangs. Passing by reference eliminates 90% of all that unneeded bulk overhead. – Francis Cugler May 12 '17 at 00:06

However, in many cases it would be advisable to pass by `const reference` as opposed to `reference`, unless you need the function to modify the variable that is passed in besides producing its return value. – Francis Cugler May 12 '17 at 01:56
@FrancisCugler You can return multiple values using a `std::tuple` ; using a reference parameter for a "return value" is a hack that's rarely necessary these days – M.M May 12 '17 at 05:39
@M.M. true enough; but what if you are still limited to using only "C" instead of "C++" ? One should still know how to use them. – Francis Cugler May 12 '17 at 05:52
Those are different languages , you could extend the argument "what will someone do if they can't use language A and have to use language B" to almost any two languages – M.M May 12 '17 at 06:38
@FrancisCugler In my experience, if you need a function to return multiple values, then either A. those values are a package deal, in which case they probably belong in a class/struct (which would make them a single value), or B. your function is breaking the principle of single responsibility and is doing too much. Having a function that takes a param by reference immediately couples the code that is calling the function to the function that is being called. – MBurnham May 17 '17 at 16:13
Please note: I am disregarding cases in which performance is the only priority. I am only talking about writing good, easily maintainable code, not the fastest code possible at all costs. – MBurnham May 17 '17 at 17:17

Francis Cugler · Answer 2 · 2017-05-12T05:02:18.187

To understand how functions treat their parameters, whether passed by value or by reference, it helps to draw a table that represents the stack frame as each line of code executes. Every new code block or scope gets its own stack frame table.

Check out this short program & diagram:

main.cpp

#include <iostream>

int someCalculationByValue( int a, int b ) {
   a *= 2 + b;
   return a;         
}

int someCalculationByReference( int& a, int& b ) {
   a *= 2 + b;
   return a;
}

int main() {

    int x = 3;
    int y = 4;

    std::cout << "Starting with x = " << x << " and y = " << y << std::endl;
    int ansValue = someCalculationByValue( x, y );
    std::cout << "After Calculation" << std::endl;
    std::cout << "x = " << x << " y = " << y << std::endl;
    std::cout << "ansValue = " << ansValue << std::endl << std::endl;

    std::cout << "Starting with x = " << x << " and y = " << y << std::endl;
    int ansReference = someCalculationByReference( x, y );
    std::cout << "After Calculation" << std::endl;
    std::cout << "x = " << x << " y = " << y << std::endl;
    std::cout << "ansReference = " << ansReference << std::endl << std::endl;

    return 0;
}

Table - I will skip over the std::cout calls only showing the local variables and the user defined functions for simplicity.

// Stack Frame Starting With First line of execution in the main function

// Stack Frame Start - Visible Scope Resolution of Main
{x} - Type int : 4bytes on 32bit - Scope Visibility - main()
+--+--+--+--+    
|  |  |  |  |    // On 32bit system 4bytes of memory - each block is 1 byte
+--+--+--+--+

{y} - Type int : 4bytes on 32bit - Scope Visibility - main()
+--+--+--+--+
|  |  |  |  |
+--+--+--+--+

{ansValue} - Type int : 4bytes on 32bit - Scope Visibility - main()
+--+--+--+--+
|  |  |  |  |
+--+--+--+--+

{=} assignment or evaluation

    {someCalculationByValue} - Function Call - New Scope - New Stack Frame

    {return value} - Type int 4bytes on 32bit - Returns back to calling function
    In this case it returns back to main and flows into assignment which then
    gets stored into {ansValue}
    +--+--+--+--+
    |  |  |  |  |  // Normally Has no name look up but is expected to be
    +--+--+--+--+  // returned by this function when it goes out of scope

    {a} - Type int - 4bytes on 32bit system - Parameter List - Visibility
    +--+--+--+--+    is local to this function only - it copies the value
    |  |  |  |  |    that is already stored in the variable that was passed
    +--+--+--+--+    to it or by direct value

    {b} - Type int - 4bytes on 32bit system - Parameter List - Visibility
    +--+--+--+--+    is local to this function only - it copies the value
    |  |  |  |  |    that is already stored in the variable that was passed
    +--+--+--+--+    to it or by direct value

    {a} an L-Value followed by compound assignment 
    {*=} a compound assignment followed by arithmetic operation or expression
         R-Value {a} = {a} * (2 + {b})
    {return} Return statement - return back to caller in this case main() which
             flows into the previous assignment in main that stores this return 
             value in {ansValue}

// Scope Resolution is now back in main()
{ansReference} - Type int : 4bytes on 32bit - Scope Visibility - main()
+--+--+--+--+
|  |  |  |  |
+--+--+--+--+

{=} assignment or evaluation

    {someCalculationByReference} - Function Call - New Scope - New Stack Frame
    {return value} - Type int 4bytes on 32bit - Returns back to calling function
    In this case it returns back to main and flows into assignment which then
    gets stored into {ansReference}
    +--+--+--+--+
    |  |  |  |  |  // Normally Has no name look up but is expected to be
    +--+--+--+--+  // returned by this function when it goes out of scope

    // No Local Variables Declared - Uses the actual variables that are passed 
    // in by the caller as this does substitution from its declarative variables
    {a} - the actual variable passed in  followed by compound assignment 
    {*=} followed by arithmetic operation or expression {a} = {a} * (2 + {b})
         However since this is by reference the direct use of main's variables 
         are used so this then becomes: {x} = {x} * (2 + {y})
    {return} - back to caller in this case main() which flows into the previous
               assignment in main that stores this return value in {ansReference}

// Scope Resolution is now back in main()

Now let's walk through each of these function calls to see what happens under the hood.

someCalculationByValue()

x & y are passed by value from main's local scope variables
x has value of 3
y has value of 4

// Since passing by value
a is assigned a value of what x has which is 3
b is assigned a value of what y has which is 4

The arithmetic compound assignment and expression with substitution
{a(3)} *= 2 + {b(4)}
{a(3)}  = {a(3)} * (2 + {b(4)})
{a}     = (18)
return {a(18)} -> 18 is returned back and saved into main's {ansValue}

In main, after the calculation, we print x & y to the console: x still has a value of 3 and y a value of 4. Nothing has changed with main's x & y values.

someCalculationByReference()

x & y are passed by reference from main's local scope variables
x has value of 3
y has value of 4

// Since passing by reference
a is replaced with x that has a value of 3
b is replaced with y that has a value of 4

The arithmetic compound assignment and expression with substitution
Since by reference this function has no local variables of (a & b) it 
uses direct substitution of the variables that are passed in by its caller:
in this case; the main function.
{x(3)} *= 2 + {y(4)}
{x(3)}  = {x(3)} * (2 + {y(4)})
{x}     = (18)
return {x(18)} -> 18 is returned back and saved into main's {ansReference}

This time, when we print main's local variables x & y, x is no longer 3; it is now 18, the same as the return value, because the function modified it through the reference. The same would happen to y, since it is also a reference, but the second function never modifies it, so its value stays 4. And there you have it: the difference between passing by value (copied) and passing by reference (direct substitution).

score 0 · Answer 3 · answered May 12 '17 at 07:28

It is a bit difficult to work out what your sample code is supposed to show. Part of that is because the underlying semantics of parameter passing do not map well onto code that can be shown. These are behind-the-scenes details that can't be expressed in the language any more explicitly than they normally are.

I'm also not really sure from your brief explanation exactly what your mental model is of argument passing, so I'm not sure where to start in clarifying it. So let's just start from the beginning.

When a function accepts a parameter "by value", the caller of that function makes a new copy of the passed object, and passes it to the function. The function then consumes that copy, doing whatever it wants to it. When that function ends, that copy is effectively thrown away. This leaves the caller with its original object.

When a function accepts a parameter "by reference", the caller passes its own object, the original rather than a copy, to the function. The function then works on that object, doing whatever it wants to it. When the function ends, any changes that it has made persist, since it's the same object, and those changes are visible at the call site. In other words, there is no copy made.

Pass-by-value is actually the way everything works in C. When you do:

void Function(int foo);

the parameter foo is passed by value. Similarly, when you do:

void Function(int * foo);

the parameter foo is still passed by value; it's just that the parameter being passed by value is actually a pointer, so this simulates passing "by reference", because you're passing a reference to the original value in memory via a pointer indirection.

In C++, you actually have true pass-by-reference semantics because the language has first-class reference types. So when you do:

void Function(int & foo);

the parameter foo is actually passed by reference: Function gets a reference to the original object that the caller has. Behind the scenes, a compiler will typically implement references via pointers, so there really isn't anything new going on. You just get a guarantee from the language that a well-formed program never creates a "null" reference, which saves you from a whole category of bugs.

I believe that one's understanding of these details can be enhanced by looking at how this is actually implemented under the hood by a compiler. The implementation details vary across implementations and architectures, but in general, there are two basic ways that parameters can be passed to functions: either on the stack, or in the processor's internal registers.

If a parameter is being passed on the stack, then the caller "pushes" the value onto the stack. The function is then called, and it reads/uses the data from the stack. After the function is complete, the parameter is "popped" off the stack. In a pseudo-assembly language:

PROCEDURE getx                    // int getx(int one, int two)
    LOAD    reg1, [stack_slot1]   // load parameter from stack slot #1 into reg1
    LOAD    reg2, [stack_slot2]   // load parameter from stack slot #2 into reg2
    ADD     reg1, reg2            // add parameters (reg1 += reg2)
    RETURN  reg1                  // return result, in reg1
END


PROCEDURE main                    // int main()
    PUSH   2                      // push the value 2 onto stack
    PUSH   55                     // push the value 55 (w) onto stack
    CALL   getx                   // call function 'getx'

    // The function has returned its result in reg1, so we can use it
    // if we want, or ignore it.

    POP    stack_slot1            // pop parameters from stack to clean up stack
    POP    stack_slot2
    RETURN 0
END

Here, we have "pushed" constant values onto the stack. However, we could just as easily have pushed a copy of a value in a register.

Notice that the "push" makes a copy of the value, so passing via stack is always going to be pass-by-value, but as we have said, a copy of a pointer can be passed in order to give pass-by-reference semantics. Any changes made to the object through the pointer will be reflected in the caller.

If a parameter is being passed in a register, then the caller must ensure that the value is loaded into the appropriate register. The function is then called, and it reads/uses the data from that register. After the function is complete, any changes it made to the value in the register are still visible. For example:

PROCEDURE getx                    // int getx(int one, int two)
    ADD     reg1, reg2            // add parameters (reg1 += reg2)
    RETURN                        // result is left in reg1
END


PROCEDURE main                    // int main()
    MOVE   reg1, 2                // put '2' in reg1
    MOVE   reg2, 55               // put '55' in reg2
    CALL   getx                   // call function 'getx'

    // The function has modified one or both registers, so we can use
    // those values here, or ignore them.

    RETURN 0
END

If main is doing something else with the values before or after the function call, then it can do that in the exact same registers that getx uses for its parameters. This would basically be pass-by-reference semantics. Or, it can get pass-by-value semantics by copying the values into new registers first, calling getx, and then copying the result(s) back out.
