474

I'm trying to understand rvalue references and move semantics of C++11.

What is the difference between these examples, and which of them is going to do no vector copy?

First example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> &&rval_ref = return_vector();

Second example

std::vector<int>&& return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();

Third example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();
Tarantula
  • 58
    Please do not return local variables by reference, ever. An rvalue reference is still a reference. – fredoverflow Feb 13 '11 at 20:36
  • 71
    That was obviously intentional in order to understand the semantic differences between examples lol – Tarantula Feb 15 '11 at 00:22
  • @FredOverflow Old question, but it took me a second to understand your comment. I think the question with #2 was whether `std::move()` created a persistent "copy." – 3Dave Dec 22 '13 at 23:17
  • 6
    @DavidLively `std::move(expression)` doesn't create anything, it simply casts the expression to an xvalue. No objects are copied or moved in the process of evaluating `std::move(expression)`. – fredoverflow Dec 23 '13 at 08:22

6 Answers

607

First example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> &&rval_ref = return_vector();

The first example returns a temporary which is caught by rval_ref. That temporary will have its life extended beyond the rval_ref definition and you can use it as if you had caught it by value. This is very similar to the following:

const std::vector<int>& rval_ref = return_vector();

except that in my rewrite you obviously can't use rval_ref in a non-const manner.
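The lifetime extension described above can be sketched as follows. This is only an illustration: `make_vector` and `extended_lifetime_demo` are hypothetical names, not part of the question's code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's return_vector().
std::vector<int> make_vector() { return {1, 2, 3, 4, 5}; }

std::size_t extended_lifetime_demo() {
    // The temporary returned by make_vector() lives as long as rref does.
    std::vector<int>&& rref = make_vector();
    rref.push_back(6);  // allowed: a named rvalue reference is a non-const lvalue

    // Same lifetime extension, but read-only access.
    const std::vector<int>& cref = make_vector();
    // cref.push_back(6);  // would not compile: cref is const

    assert(rref.size() == 6 && cref.size() == 5);
    return rref.size() + cref.size();
}
```

Both references stay valid until the end of the enclosing scope, which is why neither line dangles.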

Second example

std::vector<int>&& return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();

In the second example you have created a run-time error: returning a reference to a local variable leads to undefined behavior. rval_ref now refers to tmp, which was destroyed when the function returned. With any luck, this code would immediately crash.

Third example

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return std::move(tmp);
}

std::vector<int> &&rval_ref = return_vector();

Your third example is roughly equivalent to your first. The std::move on tmp is unnecessary and can actually be a performance pessimization as it will inhibit return value optimization.

The best way to code what you're doing is:

Best practice

std::vector<int> return_vector(void)
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> rval_ref = return_vector();

I.e. just as you would in C++03. tmp is implicitly treated as an rvalue in the return statement. It will either be returned via return value optimization (no copy, no move) or, if the compiler decides it cannot perform RVO, via vector's move constructor. Only if RVO is not performed and the returned type has no move constructor would the copy constructor be used for the return.
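One way to check these claims on your own compiler is to count constructor calls. The sketch below assumes a hypothetical `Tracer` type and helper names (`plain_return`, `moved_return`, `moves_for`) that are not part of the original answer.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper type: counts copy and move constructions.
struct Tracer {
    static int copies;
    static int moves;
    Tracer() = default;
    Tracer(const Tracer&) { ++copies; }
    Tracer(Tracer&&) noexcept { ++moves; }
};
int Tracer::copies = 0;
int Tracer::moves = 0;

// Idiomatic form: the compiler may elide everything (NRVO);
// if it does not, the move constructor is used, never the copy constructor.
Tracer plain_return() {
    Tracer tmp;
    return tmp;
}

// std::move makes the operand an xvalue, so NRVO is not permitted here
// and at least one move construction always happens.
Tracer moved_return() {
    Tracer tmp;
    return std::move(tmp);
}

// Resets the counters, calls f once, checks that nothing was copied,
// and reports how many move constructions were observed.
template <typename F>
int moves_for(F f) {
    Tracer::copies = 0;
    Tracer::moves = 0;
    Tracer result = f();
    (void)result;
    assert(Tracer::copies == 0);
    return Tracer::moves;
}
```

Calling `moves_for(plain_return)` typically reports zero moves on an optimizing compiler, although that is not guaranteed, while `moves_for(moved_return)` always reports at least one. Some compilers also warn about the second form as a pessimizing move.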

Howard Hinnant
  • So, from what I gather the best thing to do is for objects to have a move constructor. I probably should just google this, but I'm being lazy at the moment; are there any common guidelines for compilers on RVO? – Keith Feb 25 '13 at 16:21
  • 71
    Compilers will RVO when you return a local object by value, and the type of the local and the return of the function are the same, and neither is cv-qualified (don't return const types). Stay away from returning with the condition (:?) statement as it can inhibit RVO. Don't wrap the local in some other function that returns a reference to the local. Just `return my_local;`. Multiple return statements are ok and will not inhibit RVO. – Howard Hinnant Feb 25 '13 at 20:18
  • 29
    There is a caveat: when returning a _member_ of a local object, the move must be explicit. – boycy Feb 26 '13 at 08:58
  • 1
    hi, can you elaborate on this: " rval_ref now holds a reference to the destructed tmp inside the function. " Do you mean temporary created in the return line, or func local variable named tmp. – NoSenseEtAl Feb 27 '13 at 10:36
  • 5
    @NoSenseEtAl: There is no temporary created on the return line. `move` doesn't create a temporary. It casts an lvalue to an xvalue, making no copies, creating nothing, destroying nothing. That example is the exact same situation as if you returned by lvalue-reference and removed the `move` from the return line: Either way you've got a dangling reference to a local variable inside the function and which has been destructed. – Howard Hinnant Feb 27 '13 at 16:11
  • 3
    Just a nit: Since you *named* the variable (`tmp`) in the "Best practice" section, it is the NRVO that kicks in, not the RVO. These are two different optimizations. Other than that, great answer! – Daniel Frey Feb 28 '13 at 10:19
  • @HowardHinnant, why result type cannot have `const` for RVO? – greenoldman Mar 15 '14 at 20:17
  • 3
    @greenoldman: I was mistaken. RVO can work with const return types. It is just a bad idea to do so. If the RVO fails, move semantics will not kick in. – Howard Hinnant Mar 16 '14 at 13:46
  • 15
    "Multiple return statements are ok and will not inhibit RVO": Only if they return *the same* variable. – Deduplicator Jul 29 '14 at 22:17
  • 5
    @Deduplicator: You are correct. I was not speaking as accurately as I intended. I meant that multiple return statements do not forbid the compiler from RVO (even though it does make it impossible to implement), and therefore the return expression is still considered an rvalue. – Howard Hinnant Jul 29 '14 at 22:21
  • 1
    In all of this we are talking about the RETURN operation possibly being an implicit move. In the case of an implicit move, how is the subsequent assignment affected? The return value of the function is *by-value*, not an rvalue reference, so how will std::vector know to use a move for the construction of the local variable at the call site? – void.pointer Aug 15 '14 at 17:12
  • 2
    @RobertDailey: The expression `return_vector()` is an rvalue, since the function is returning an object by value. When that expression is used to construct an object at the call site, overload resolution will choose a move constructor if it exists. If the object is already constructed, then overload resolution will instead choose an assignment operator. Since the rhs is an rvalue, it will choose the move assignment operator if it exists. – Howard Hinnant Aug 15 '14 at 18:26
  • I don't understand the "if the compiler decides"... I DECIDE, I'M the programmer. Why should it do something I didn't tell it to do? – gedamial Mar 18 '16 at 19:43
  • 1
    @gedamial: The C++ standard says that the compiler writers get to make some of the decisions. One of those is RVO. Under a very specific set of circumstances, the compiler is allowed but not required to perform RVO. And the compiler, not you, gets to decide whether or not it performs RVO. There *has* been some talk about requiring RVO, but at this time, that has not been standardized. – Howard Hinnant Mar 18 '16 at 20:38
  • Actually I'm having troubles with NRVO/RVO, **please** see http://stackoverflow.com/questions/35506708/move-constructor-vs-copy-elision-which-one-gets-called and/or http://www.cplusplus.com/forum/general/187009/ – gedamial Mar 18 '16 at 21:07
  • @gedamial: Ok, I looked at the SO question. It looks like you answered it yourself, and I don't have anything to add to your answer. – Howard Hinnant Mar 18 '16 at 21:49
  • @HowardHinnant I don't think my answer is correct: I heard that copy elision **CAN** occur even when there are more than 1 return statements. I'd like to know when a copy elision is forbidden – gedamial Mar 18 '16 at 21:51
  • 2
    @gedamial: Ok, I've given it a shot. – Howard Hinnant Mar 18 '16 at 22:08
  • @HowardHinnant Strange question but why do standard library components return by rvalue reference and not by value? Is it to be more efficient in the case where the value is not required to be moved from and is just discarded after the fetch? I ask because this case causes undefined behavior https://wandbox.org/permlink/kUqjfOWWRP6N57eS – Curious Sep 24 '17 at 16:29
  • @Curious: I can't think of a good reason to return a reference to an rvalue. I made this same mistake in 2005 when proposing rvalue-overloads for `string+string` but fortunately corrected it prior to C++11 being finalized. Here is a somewhat clunky way to work around it: https://wandbox.org/permlink/HQPGEOMAUXwUCMb4 – Howard Hinnant Sep 24 '17 at 17:25
  • 1
    Am I correct that using `return_vector(std::vector& x)` + `vector::resize(0)` could save a few mallocs across multiple `return_vector` calls, and be even more time efficient than NRVO, at the cost of larger memory usage overall? More precise code at: https://stackoverflow.com/questions/10476665/avoiding-copy-with-the-return-statement/53520381#53520381 – Ciro Santilli新疆棉花TRUMP BAN BAD Nov 28 '18 at 13:33
  • 2
    @CiroSantilli新疆改造中心六四事件法轮功 Yes you are correct. And you are also correct that counting allocations/deallocations is a good technique for estimating performance. – Howard Hinnant Nov 28 '18 at 13:50
  • @HowardHinnant In this article - https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/RVO_V_S_std_move?lang=en is the author actually recommending (2) ? You are saying it will crash. He is saying it works just fine. Am I incorrect ? – gansub Sep 02 '19 at 04:50
  • The author says after this code: "(Note: We should not use this way in the real development, because it is a reference to a local object. Here just show how to make RVO happened.)." I agree that the author's wording is easy to misinterpret when skimming. – Howard Hinnant Sep 02 '19 at 15:30
44

None of them will copy, but the second will refer to a destroyed vector. Named rvalue references almost never exist in regular code. You write it just how you would have written a copy in C++03.

std::vector<int> return_vector()
{
    std::vector<int> tmp {1,2,3,4,5};
    return tmp;
}

std::vector<int> rval_ref = return_vector();

Except now, the vector is moved. The user of a class doesn't deal with its rvalue references in the vast majority of cases.

Puppy
  • Are you really sure that the third example is going to do vector copy ? – Tarantula Feb 13 '11 at 20:41
  • @Tarantula: It's going to bust your vector. Whether or not it did or didn't copy it before breaking doesn't really matter. – Puppy Feb 13 '11 at 20:50
  • 4
    I don't see any reason for the busting you propose. It is perfectly fine to bind a local rvalue reference variable to an rvalue. In that case, the temporary object's lifetime is extended to the lifetime of the rvalue reference variable. – fredoverflow Feb 13 '11 at 22:09
  • 1
    Just a point of clarification, since I'm learning this. In this new example, the vector `tmp` is not *moved* into `rval_ref`, but written directly into `rval_ref` using RVO (i.e. copy elision). There is a distinction between `std::move` and copy elision. A `std::move` may still involve some data to be copied; in the case of a vector, a new vector is actually constructed in the copy constructor and data is allocated, but the bulk of the data array is only copied by copying the pointer (essentially). The copy elision avoids 100% of all copies. – Mark Lakata Dec 11 '14 at 01:26
  • @MarkLakata This is NRVO, not RVO. NRVO is optional, even in C++17. If it is not applied, both return value and `rval_ref` variables are constructed using move constructor of `std::vector`. There is no copy constructor involved both with / without `std::move`. `tmp` is treated as an _rvalue_ in `return` statement in this case. – Daniel Langr Feb 20 '18 at 13:13
  • @DanielLangr is correct. In this case, because the return value is named `tmp`, then NRVO might apply (or may not, since it is optional). If `return_vector` was simply `{return std::vector{1,2,3,4,5};` it would be RVO. My point was that with a decent compiler that can do RVO and NRVO, `rval_ref` is not copy constructed or move constructed - it is directly constructed as `std::vector{1,2,3,4,5}`. – Mark Lakata Feb 22 '18 at 03:29
16

The simple answer is that you should write code for rvalue references as you would for regular references, and you should treat them the same mentally 99% of the time. This includes all the old rules about returning references (i.e. never return a reference to a local variable).

Unless you are writing a template container class that needs to take advantage of std::forward and be able to write a generic function that takes either lvalue or rvalue references, this is more or less true.

One of the big advantages of the move constructor and move assignment is that if you define them, the compiler can use them in cases where RVO (return value optimization) and NRVO (named return value optimization) fail to be invoked. This is pretty huge for returning expensive objects like containers and strings by value efficiently from methods.

Now where things get interesting with rvalue references, is that you can also use them as arguments to normal functions. This allows you to write containers that have overloads for both const reference (const foo& other) and rvalue reference (foo&& other). Even if the argument is too unwieldy to pass with a mere constructor call it can still be done:

std::vector<MyCheapType> vec;
for(int x=0; x<10; ++x)
{
    // automatically uses rvalue reference constructor if available
    // because MyCheapType(0.f) is an unnamed temporary
    vec.push_back(MyCheapType(0.f));
}


std::vector<MyExpensiveType> vec;
for(int x=0; x<10; ++x)
{
    MyExpensiveType temp(1.0, 3.0);
    temp.initSomeOtherFields(malloc(5000));

    // old way, passed via const reference, expensive copy
    vec.push_back(temp);

    // new way, passed via rvalue reference, cheap move
    // just don't use temp again,  not difficult in a loop like this though . . .
    vec.push_back(std::move(temp));
}

The STL containers have been updated to have move overloads for nearly everything (hash keys and values, vector insertion, etc.), and that is where you will see them the most.
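The overload pair mentioned earlier (const reference plus rvalue reference) can be sketched like this. `NameList`, its counters, and `fill` are hypothetical names made up for illustration. The counters exist only to show which overload was selected.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical wrapper with the same overload pair that std::vector::push_back has.
class NameList {
public:
    // Chosen for lvalues: the string is copied into the vector.
    void add(const std::string& name) { ++copied; names_.push_back(name); }
    // Chosen for rvalues: the string's buffer is moved into the vector.
    void add(std::string&& name) { ++moved; names_.push_back(std::move(name)); }

    std::size_t size() const { return names_.size(); }

    int copied = 0;
    int moved = 0;

private:
    std::vector<std::string> names_;
};

NameList fill() {
    NameList list;
    std::string kept = "kept";
    std::string given_away = "given away";
    list.add(kept);                   // lvalue: const& overload
    list.add(std::move(given_away));  // xvalue: && overload
    list.add(std::string("temp"));    // prvalue: && overload
    assert(kept == "kept");           // the copied-from string is untouched
    return list;
}
```

Note that inside the `&&` overload the parameter `name` is itself an lvalue, so it has to be passed on with `std::move` to reach the moving `push_back`.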

You can also use them in normal functions, and if you only provide an rvalue reference argument you can force the caller to create the object and let the function do the move. This is more of an example than a really good use, but in my rendering library, I have assigned a string to all the loaded resources, so that it is easier to see what each object represents in the debugger. The interface is something like this:

TextureHandle CreateTexture(int width, int height, ETextureFormat fmt, string&& friendlyName)
{
    std::unique_ptr<TextureObject> tex = D3DCreateTexture(width, height, fmt);
    tex->friendlyName = std::move(friendlyName);
    return tex;
}

It is a form of 'leaky abstraction' but allows me to take advantage of the fact that I had to create the string already most of the time, and avoid making yet another copy of it. This isn't exactly high-performance code but is a good example of the possibilities as people get the hang of this feature. This code actually requires that the argument either be a temporary created for the call, or have std::move invoked on it:

// move from temporary
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, string("Checkerboard"));

or

// explicit move (not going to use the variable 'str' after the create call)
string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, std::move(str));

or

// explicitly make a copy and pass the temporary of the copy down
// since we need to use str again for some reason
string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, string(str));

but this won't compile!

string str("Checkerboard");
TextureHandle htex = CreateTexture(128, 128, A8R8G8B8, str);
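The binding rule behind that compile error can be checked with standard type traits. In this sketch, `take_name` is a hypothetical stand-in for a function with a `string&&` parameter (it is not the rendering library's API), and the two constants are illustration-only names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical function that only accepts rvalues, like friendlyName above.
std::size_t take_name(std::string&& name) {
    std::string owned = std::move(name);  // steal the caller's buffer
    return owned.size();
}

// An lvalue std::string cannot bind to std::string&& ...
constexpr bool lvalue_binds =
    std::is_convertible<std::string&, std::string&&>::value;
// ... but an rvalue std::string can.
constexpr bool rvalue_binds =
    std::is_convertible<std::string, std::string&&>::value;

static_assert(!lvalue_binds, "passing a named string without std::move is rejected");
static_assert(rvalue_binds, "temporaries and std::move(str) are accepted");

std::size_t take_name_demo() {
    std::string str("Checkerboard");
    std::size_t from_copy = take_name(std::string(str));  // explicit copy, str still usable
    std::size_t from_move = take_name(std::move(str));    // str should not be relied on afterwards
    assert(from_copy == from_move);
    return from_move;
}
```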
Zoner
4

Not an answer per se, but a guideline. Most of the time there is not much sense in declaring a local T&& variable (as you did with std::vector<int>&& rval_ref). You will still have to std::move() it to pass it to foo(T&&)-style methods. There is also the problem, already mentioned, that when you try to return such an rval_ref from a function you get the standard reference-to-destroyed-temporary fiasco.

Most of the time I would go with following pattern:

// Declarations
A a(B&&, C&&);
B b();
C c();

auto ret = a(b(), c());

You don't hold any refs to the returned temporary objects, so you avoid the (inexperienced) programmer's error of using a moved-from object, as in the following:

auto bRet = b();
auto cRet = c();
auto aRet = a(std::move(bRet), std::move(cRet));

// Either these just fail (assert/exception), or you won't get 
// your expected results due to their clean state.
bRet.foo();
cRet.bar();

Obviously there are (although rather rare) cases where a function truly returns a T&& which is a reference to a non-temporary object that you can move into your object.

Regarding RVO: these mechanisms generally work and the compiler can nicely avoid copying, but in cases where the return path is not obvious (exceptions, if conditionals determining the named object you will return, and probably a couple of others) rrefs are your saviors (even if potentially more expensive).

Red XIII
2

None of those will do any extra copying. Even if RVO isn't used, the new standard says that move construction is preferred to copy when doing returns I believe.

I do believe that your second example causes undefined behavior though because you're returning a reference to a local variable.

Edward Strange
1

As already mentioned in comments to the first answer, the return std::move(...); construct can make a difference in cases other than returning of local variables. Here's a runnable example that documents what happens when you return a member object with and without std::move():

#include <iostream>
#include <utility>

struct A {
  A() = default;
  A(const A&) { std::cout << "A copied\n"; }
  A(A&&) { std::cout << "A moved\n"; }
};

class B {
  A a;
 public:
  operator A() const & { std::cout << "B C-value: "; return a; }
  operator A() & { std::cout << "B L-value: "; return a; }
  operator A() && { std::cout << "B R-value: "; return a; }
};

class C {
  A a;
 public:
  operator A() const & { std::cout << "C C-value: "; return std::move(a); }
  operator A() & { std::cout << "C L-value: "; return std::move(a); }
  operator A() && { std::cout << "C R-value: "; return std::move(a); }
};

int main() {
  // Non-constant L-values
  B b;
  C c;
  A{b};    // B L-value: A copied
  A{c};    // C L-value: A moved

  // R-values
  A{B{}};  // B R-value: A copied
  A{C{}};  // C R-value: A moved

  // Constant L-values
  const B bc;
  const C cc;
  A{bc};   // B C-value: A copied
  A{cc};   // C C-value: A copied

  return 0;
}

Presumably, return std::move(some_member); only makes sense if you actually want to move the particular class member, e.g. in a case where class C represents short-lived adapter objects with the sole purpose of creating instances of struct A.

Notice how struct A always gets copied out of class B, even when the class B object is an R-value. This is because the compiler has no way to tell that class B's instance of struct A won't be used any more. In class C, the compiler does have this information from std::move(), which is why struct A gets moved, unless the instance of class C is constant.
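A common way to apply this is to ref-qualify accessors so that only rvalue objects give up their member. The following is a sketch under assumed names (`Payload`, `Holder`, `get`, `take`, `holder_demo`), none of which appear in the example above.

```cpp
#include <cassert>
#include <utility>

// Hypothetical member type that counts copy and move constructions.
struct Payload {
    static int copies;
    static int moves;
    Payload() = default;
    Payload(const Payload&) { ++copies; }
    Payload(Payload&&) noexcept { ++moves; }
};
int Payload::copies = 0;
int Payload::moves = 0;

class Holder {
    Payload p_;

public:
    // Called on lvalues: the holder keeps its member, so the result is a copy.
    Payload get() const & { return p_; }
    // Callable only on rvalues: the holder is about to die, so move the member out.
    Payload take() && { return std::move(p_); }
};

void holder_demo() {
    Holder h;
    Payload a = h.get();          // copy: h still owns its Payload
    assert(Payload::copies == 1);

    Payload b = Holder{}.take();  // explicit move out of a temporary Holder
    assert(Payload::copies == 1 && Payload::moves >= 1);

    (void)a;
    (void)b;
}
```

Because `take()` is `&&`-qualified, `h.take()` on a named object does not compile unless the caller writes `std::move(h).take()`, which keeps the move visible at the call site.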

Andrej Podzimek