Move Semantics and Pass-by-Rvalue-Reference in Overloaded Arithmetic

Question

I am coding a small numeric analysis library in C++. I have been trying to implement it using the latest C++11 features, including move semantics. I understand the discussion and top answer at the following post: C++11 rvalues and move semantics confusion (return statement), but there is one scenario that I am still trying to wrap my head around.

I have a class, call it T, which is fully equipped with overloaded operators. I also have both copy and move constructors.

T (const T &) { /*initialization via copy*/; }
T (T &&) { /*initialization via move*/; }

My client code heavily uses operators, so I am trying to ensure that complex arithmetic expressions get maximum benefit from move semantics. Consider the following:

T a, b, c, d, e;
T f = a + b * c - d / e;

Without move semantics, my operators are making a new local variable using the copy constructor each time, so there are a total of 4 copies. I was hoping that with move semantics I could reduce this to 2 copies plus some moves. In the parenthesized version:

T f = a + (b * c) - (d / e);

each of (b * c) and (d / e) must create the temporary in the usual way with a copy, but then it would be great if I could leverage one of those temporaries to accumulate the remaining results with only moves.

Using g++ compiler, I have been able to do this, but I suspect my technique may not be safe and I want to fully understand why.

Here is an example implementation for the addition operator:

T operator+ (T const& x) const
{
    T result(*this);
    // logic to perform addition here using result as the target
    return std::move(result);
}
T operator+ (T&& x) const
{
    // logic to perform addition here using x as the target
    return std::move(x);
}

Without the calls to std::move, only the const & version of each operator is ever invoked. With std::move as above, subsequent arithmetic (after the innermost expressions) is performed using the && version of each operator.

I know that RVO can be inhibited, but on very computationally-expensive, real-world problems it seems that the gain slightly outweighs the lack of RVO. That is, over millions of computations I do get a very tiny speedup when I include std::move. Though in all honesty it is fast enough without. I really just want to fully comprehend the semantics here.

Is there a kind C++ Guru who is willing to take the time to explain, in a simple way, whether and why my use of std::move is a bad thing here? Many thanks in advance.

The second `move` is fine. Only the first one is unnecessary. — Kerrek SB, Oct 31 '12 at 19:25
There is a window of opportunity that is not exploited in the options above: when the left hand side is already a temporary. That can be taken advantage of by overloading the member function for rvalue-references. Also note that in general you should prefer free functions for operator overloading, in which case the overloads would be on the first or second arguments being rvalues. That requires 4 combinations (whether left/right hand side are rvalue/lvalue) — David Rodríguez - dribeas, Oct 31 '12 at 19:45
What kind of numeric type is `T` exactly? Does it manage something on the heap via pointers? Because if it only has a couple of `int` members or something, move semantics will not gain you anything. Just asking :) — fredoverflow, Nov 01 '12 at 05:55
@FredOverflow: :o Yes, each T manages a structure on the heap. :) — Tientuinë, Nov 02 '12 at 15:28
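
The left-hand-temporary case raised in the comments above can be sketched with ref-qualified member functions. This is only an illustration under assumptions: `Num` is a hypothetical stand-in for `T` that owns a `std::vector<double>` and counts copies and moves, and the compiler must support ref-qualifiers on member functions (older g++ releases may not).

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: owns heap storage and counts copies/moves.
struct Num {
    std::vector<double> data;
    static int copies;
    static int moves;

    explicit Num(double v = 0.0) : data(4, v) {}
    Num(const Num& o) : data(o.data) { ++copies; }
    Num(Num&& o) noexcept : data(std::move(o.data)) { ++moves; }

    Num& operator+=(const Num& rhs) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] += rhs.data[i];
        return *this;
    }

    // Left operand is an lvalue: copy it, then accumulate into the copy.
    Num operator+(const Num& rhs) const & {
        Num result(*this);
        result += rhs;
        return result;  // no std::move here, so NRVO remains possible
    }

    // Left operand is an rvalue: reuse its storage instead of copying.
    Num operator+(const Num& rhs) && {
        *this += rhs;
        return std::move(*this);  // *this is an lvalue expression; the cast enables the move
    }
};

int Num::copies = 0;
int Num::moves = 0;
```

With these two overloads, an expression such as `a + b + c` should copy once (for `a + b`) and then reuse that temporary for `+ c`.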

score 8 · Accepted Answer · answered Oct 31 '12 at 19:55

You should prefer overloading the operators as free functions to obtain full type symmetry (same conversions can be applied on the left and right hand side). That makes it a bit more obvious what you are missing from the question. Restating your operator as free functions you are offering:

T operator+( T const &, T const & );
T operator+( T const &, T&& );

But you are failing to provide a version that handles the left hand side being a temporary:

T operator+( T&&, T const& );

And to avoid ambiguities in the code when both arguments are rvalues you need to provide yet another overload:

T operator+( T&&, T&& );

The common advice would be to implement += as a member method that modifies the current object, and then write operator+ as a forwarder that modifies the appropriate object in the interface.
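
A minimal sketch of that advice, assuming a hypothetical `Num` class in place of `T` (it owns a `std::vector<double>` and counts copies and moves so the effect can be observed). Addition is assumed to be commutative, which is what lets the right-hand temporary be reused.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: owns heap storage and counts copies/moves.
struct Num {
    std::vector<double> data;
    static int copies;
    static int moves;

    explicit Num(double v = 0.0) : data(4, v) {}
    Num(const Num& o) : data(o.data) { ++copies; }
    Num(Num&& o) noexcept : data(std::move(o.data)) { ++moves; }

    // The real work lives in the member operator+=.
    Num& operator+=(const Num& rhs) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] += rhs.data[i];
        return *this;
    }
};

int Num::copies = 0;
int Num::moves = 0;

// lvalue + lvalue: one unavoidable copy.
Num operator+(const Num& l, const Num& r) { Num result(l); result += r; return result; }

// rvalue + lvalue: reuse the left temporary.
Num operator+(Num&& l, const Num& r) { l += r; return std::move(l); }

// lvalue + rvalue: reuse the right temporary (assumes a + b == b + a).
Num operator+(const Num& l, Num&& r) { r += l; return std::move(r); }

// rvalue + rvalue: either operand could be reused; this sketch picks the left.
Num operator+(Num&& l, Num&& r) { l += r; return std::move(l); }
```

For `Num f = a + (b + c) + (d + e);` only the two innermost additions should copy; the remaining additions accumulate into those temporaries.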

I have not really thought about this much, but there might be an alternative that takes T by value (no r/lvalue reference), though I fear that it will not reduce the number of overloads you need to provide to make operator+ efficient in all circumstances.
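
One way to picture the by-value alternative, as a sketch only: `Num` is again a hypothetical stand-in for `T`, and only the left operand is taken by value. It shows both the benefit and the limitation mentioned above.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: owns heap storage and counts copies/moves.
struct Num {
    std::vector<double> data;
    static int copies;
    static int moves;

    explicit Num(double v = 0.0) : data(4, v) {}
    Num(const Num& o) : data(o.data) { ++copies; }
    Num(Num&& o) noexcept : data(std::move(o.data)) { ++moves; }

    Num& operator+=(const Num& rhs) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] += rhs.data[i];
        return *this;
    }
};

int Num::copies = 0;
int Num::moves = 0;

// Left operand by value: copied from an lvalue, moved (or elided) from an rvalue.
// Returning a by-value parameter is treated as a move, so no std::move is written.
Num operator+(Num l, const Num& r) {
    l += r;
    return l;
}
```

A single overload handles `a + b + c` with one copy, but `a + (b + c)` still copies `a` instead of reusing the right-hand temporary, so it does not cover every case the four-overload set does.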

David Rodríguez - dribeas


This doesn't address whether use of `std::move` is appropriate or not. – ildjarn Oct 31 '12 at 20:01
@Xeo: Not sure that will make a difference if the type is *movable*. You are not making copies, but moving contents (which should be *fast*). Yes, it will inhibit copy elisions to the arguments... but if you are going through the pain of the multiple overloads I can only assume that the *move* is cheap and much better than copies. – David Rodríguez - dribeas Oct 31 '12 at 20:07
Nevermind my last comment, I forgot for a sec that you can't overload `T const&` vs `T`. However, in theory, it's the same as with `operator=`. You can either make two overloads, with `(T const&)` and `T&&` or a single one with `(T)`. The same applies here, except for two arguments. In your `(T const&, T const&)`, you're making a copy inside the operator of one of parameters - make it in the params instead. Anyways seems you do need 4 overloads in the end. :( – Xeo Oct 31 '12 at 20:19
@Xeo: Yes, I don't quite remember the details but I tried something like this a few months ago and ended up needing 4 overloads. I don't remember the exact details but I spent a couple of hours trying to find the *Right Thing* to do. In the end I didn't have a clear use case where moving is much better than copying, so I left that as an experiment and moved on (applying the same old C++03 approach in general, even if it might not be optimal in some cases). – David Rodríguez - dribeas Oct 31 '12 at 20:24
@Xeo: I was just thinking about your last comment, and in particular about the `(T const&,T const&)` case. Having the copy in the function call or inside the function makes absolutely no difference. The case where making the copy in the interface is more efficient is if the argument is itself a temporary (and the copy can thus be elided). In this case there are three other overloads that handle an rvalue in either or both arguments, so the `(T const&,T const&)` overload will only be called with lvalues, where the copy cannot be elided, and having it done by the caller or the function won't matter. – David Rodríguez - dribeas Oct 31 '12 at 21:15
@DavidRodríguez-dribeas: Thank you. I can't believe I missed the case of `T&&` on the left-hand side! Lesson learned. I actually don't rely on type conversion; I have explicit friend overloads for various operand types - only for `T const&` and `T&&` parameters were they member functions. My class is a wrapper for an existing C library - type conversion introduces too much overhead. In my problem domain I want to squeeze every possible cycle out of this thing. I do still wonder about the `std::move` behavior in my case though. Will report back after I move all member operators out of the class. – Tientuinë Nov 01 '12 at 02:04

Andrew Durward · Answer 2 · 2012-11-03T03:23:12.333

5

To build on what others have said:

The call to std::move in T::operator+( T const & ) is unnecessary and could prevent RVO.
It would be preferable to provide a non-member operator+ that delegates to T::operator+=( T const & ).

I'd also like to add that perfect forwarding can be used to reduce the number of non-member operator+ overloads required:

template< typename L, typename R >
typename std::enable_if<
  std::is_convertible< L, T >::value &&
  std::is_convertible< R, T >::value,
  T >::type operator+( L && l, R && r )
{
  T result( std::forward< L >( l ) );
  result += r;
  return result;
}

For some operators this "universal" version would be sufficient, but since addition is typically commutative we'd probably like to detect when the right-hand operand is an rvalue and modify it rather than moving/copying the left-hand operand. That requires one version for right-hand operands that are lvalues:

template< typename L, typename R >
typename std::enable_if<
  std::is_convertible< L, T >::value &&
  std::is_convertible< R, T >::value &&
  std::is_lvalue_reference< R&& >::value,
  T >::type operator+( L && l, R && r )
{
  T result( std::forward< L >( l ) );
  result += r;
  return result;
}

And another for right-hand operands that are rvalues:

template< typename L, typename R >
typename std::enable_if<
  std::is_convertible< L, T >::value &&
  std::is_convertible< R, T >::value &&
  std::is_rvalue_reference< R&& >::value,
  T >::type operator+( L && l, R && r )
{
  T result( std::move( r ) );
  result += l;
  return result;
}
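
To check how the two constrained templates behave, here is a hedged, self-contained variant with a hypothetical `Num` class substituted for `T`; the copy and move counters are additions for illustration and are not part of the answer's code.

```cpp
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: owns heap storage and counts copies/moves.
struct Num {
    std::vector<double> data;
    static int copies;
    static int moves;

    explicit Num(double v = 0.0) : data(4, v) {}
    Num(const Num& o) : data(o.data) { ++copies; }
    Num(Num&& o) noexcept : data(std::move(o.data)) { ++moves; }

    Num& operator+=(const Num& rhs) {
        for (std::size_t i = 0; i < data.size(); ++i) data[i] += rhs.data[i];
        return *this;
    }
};

int Num::copies = 0;
int Num::moves = 0;

// Right-hand operand is an lvalue: copy or move the left operand, then add.
template <typename L, typename R>
typename std::enable_if<
    std::is_convertible<L, Num>::value &&
    std::is_convertible<R, Num>::value &&
    std::is_lvalue_reference<R&&>::value,
    Num>::type
operator+(L&& l, R&& r) {
    Num result(std::forward<L>(l));
    result += r;
    return result;
}

// Right-hand operand is an rvalue: steal it and add the left operand to it.
template <typename L, typename R>
typename std::enable_if<
    std::is_convertible<L, Num>::value &&
    std::is_convertible<R, Num>::value &&
    std::is_rvalue_reference<R&&>::value,
    Num>::type
operator+(L&& l, R&& r) {
    Num result(std::move(r));
    result += l;
    return result;
}
```

Under these assumptions both `a + b + c` and `a + (b + c)` should perform exactly one copy, with the rest handled by moves or elision.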

Finally, you may also be interested in a technique proposed by Boris Kolpackov and Sumant Tambe as well as Scott Meyers' response to the idea.

edited Nov 03 '12 at 03:23

answered Nov 02 '12 at 21:59

Andrew Durward


Wonderful articles, thank you! At some point I will attempt to reduce the ridiculous number of overloads I have, maybe via perfect forwarding, but since everything is now working well I'm not in any rush. – Tientuinë Nov 06 '12 at 03:52
Durwald: I have one question about your two `operator+` overloads. Using g++-4.7, if I try to make `operator+` a friend of class `T` (to access private data members) then the compiler complains about access to private members. Apparently, it does not see the implementations as specializations of the friend declaration `template <typename L, typename R> friend T operator+ (L&&, R&&)`. If I leave `T` out of the friend declaration, then it still sees them as different and complains about ambiguous overload. Where am I going wrong? (Maybe I should ask this as a new question.) – Tientuinë Nov 06 '12 at 05:54
@Tientuinë You must use the same signature in order to declare the non-member operators as friends of `T`, namely `template< typename L, typename R > friend typename std::enable_if< ... >::type operator+( L &&, R && );` – Andrew Durward Nov 06 '12 at 18:09
At first, I did just that. The problem is that there is ambiguity between the two overloads, and the compiler complains. That's why I tried adding a third template parameter and making the overloads into specializations, but that didn't work either. P.S. The name-autocomplete failed me earlier. Sorry about the misspelling. – Tientuinë Nov 06 '12 at 22:55
@Tientuinë It seems to work for me with [LWS](http://liveworkspace.org/code/ceb4afbf9ceadef0234f7743b9b4454f) (gcc 4.7.2). I can only suggest starting a new question if you're still having trouble. – Andrew Durward Nov 07 '12 at 01:24

KnowItAllWannabe · Answer 3 · 2012-10-31T23:05:10.083

3

I agree with David Rodríguez that it'd be a better design to use non-member operator+ functions, but I'll set that aside and focus on your question.

I'm surprised that you see a performance degradation when writing

T operator+(const T&)
{
  T result(*this);
  return result;
}

instead of

T operator+(const T&)
{
  T result(*this);
  return std::move(result);
}

because in the former case, the compiler should be able to use RVO to construct result in the memory for the function's return value. In the latter case, the compiler would need to move result into the function's return value, hence incur the extra cost of the move.

In general, the rules for this kind of thing are, assuming you have a function returning an object (i.e., not a reference):

If you're returning a local object or a by-value parameter, don't apply std::move to it. That permits the compiler to perform RVO, which is cheaper than a copy or a move.
If you're returning a parameter of type rvalue reference, apply std::move to it. That turns the parameter into an rvalue, hence permitting the compiler to move from it. If you just return the parameter, the compiler must perform a copy into the return value.
If you're returning a parameter that's a universal reference (i.e., a "&&" parameter of deduced type that could be an rvalue reference or an lvalue reference), apply std::forward to it. Without it, the compiler must perform a copy into the return value. With it, the compiler can perform a move if the reference is bound to an rvalue.
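
The three rules can be sketched as follows. `Num` is a hypothetical counting class standing in for `T`, and the function names are made up for this illustration.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for T: owns heap storage and counts copies/moves.
struct Num {
    std::vector<double> data;
    static int copies;
    static int moves;

    explicit Num(double v = 0.0) : data(4, v) {}
    Num(const Num& o) : data(o.data) { ++copies; }
    Num(Num&& o) noexcept : data(std::move(o.data)) { ++moves; }
};

int Num::copies = 0;
int Num::moves = 0;

// Rule 1: local object, plain return. The compiler may elide the
// copy/move entirely; if it does not, it still moves rather than copies.
Num make_local() {
    Num n(1.0);
    return n;
}

// Rule 2: rvalue-reference parameter, so cast it back to an rvalue
// with std::move so the return value is move-constructed.
Num pass_rvalue(Num&& n) {
    return std::move(n);
}

// Rule 3: universal (forwarding) reference, so use std::forward:
// an lvalue argument is copied, an rvalue argument is moved.
template <typename U>
Num pass_forward(U&& u) {
    return std::forward<U>(u);
}
```

Calling `pass_forward(a)` with an lvalue `a` should cost one copy, while `pass_forward(std::move(a))` and `pass_rvalue(Num(2.0))` should cost none.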

edited Oct 31 '12 at 23:05

answered Oct 31 '12 at 22:54

KnowItAllWannabe


Thanks for your help. As I replied to David R above, I actually do have friend overloads for many operand types, but for the two I mentioned I made them members. I guess I should just make all of them friends, because I missed the left-hand `T&&` case. As for the performance difference, it is slightly less than 1%, so very small but measurable. I just don't understand why the compiler is doing neither RVO nor move semantics where it clearly should be able to do at least one of those. If I explicitly include move, then that seems to do the trick, but I worry it's not safe. – Tientuinë Nov 01 '12 at 02:11
@Tientuinë: Whether you implement them as member functions or free functions does not really matter. You can overload on the lvalue-rvalue-reference-*ness* of `*this`. As for RVO/NRVO, it depends on the compiler and flags (some compilers need some optimizations enabled), and the code. Note that in the case where the parameter is `T&&` the compiler will not implicitly move from an argument to the returned value. – David Rodríguez - dribeas Nov 01 '12 at 02:39
@KnowItAllWannabe: I am trying to grok the difference between `std::move` and `std::forward` in my case. If I have `operator+(T&& x, T&& y)` is it unsafe to assume that I can return `x` using `std::move`? I actually modify `x` before returning it, so if I can't `move` it then it is probably not safe to modify it either, both of which would be a big let down for my implementation. – Tientuinë Nov 01 '12 at 02:42
@Tientuinë: `std::move(x)` is `static_cast<T&&>(x)`, that is, it casts `x` to an rvalue-reference. `std::forward<T>(x)` on the other hand will generate an *rvalue-reference* or an *lvalue-reference* depending on the argument (i.e. it will yield an lvalue-reference if the argument is an lvalue, or an rvalue-reference otherwise). They serve different purposes: `move` guarantees an *rvalue-reference*, `forward` makes a choice. As for returning, if inside the function the argument is a reference, you can `move` and it will guarantee move construction of the returned value. – David Rodríguez - dribeas Nov 01 '12 at 02:49
@DavidRodríguez-dribeas: Okay. I think I follow that. But how about the rules for overload selection? When I have a statement like `c = a + b;`, all variables having type `T`, then am I guaranteed that the `operator+(T const&, T const&)` is called instead of `operator+(T&&, T&&)`? Since the latter version is going to move one of the arguments, I don't want it called in this case, but I do want it called when the arguments are temporaries. – Tientuinë Nov 01 '12 at 03:09
@Tientuinë: `a` and `b` are both *lvalue*s so the *rvalue-reference* cannot bind to them. The whole point of *rvalue-references* is being able to detect when the argument is an *rvalue* or an *lvalue*. An *rvalue* could potentially be bound by either a `const` *lvalue* reference or an *rvalue-reference* but the language determines that the latter takes precedence so that overload resolution will pick the *best* option. – David Rodríguez - dribeas Nov 01 '12 at 03:34
@Tientuinë: If the performance difference is less than 1%, I'd consider that an unreliable difference and view the performance as essentially identical. It sounds like your compiler(s), for whatever reason, are just not performing the RVO. You might play around with the optimization and debug settings to see if it makes any difference, but I doubt that a <1% difference is statistically significant. – KnowItAllWannabe Nov 01 '12 at 06:25
@KnowItAllWannabe: As I mentioned in my OP, performance was already satisfactory; I mainly wanted to fully understand the concepts. Thanks to you and David, I think I'm (nearly) there. As for the time difference, it's not the size that determines statistical significance. I have run many iterations, so my sample size is quite large enough to be sure that the difference is reliable. Actually, after several new improvements, including taking better advantage of rvalue-references, I have pushed the speedup to ~4%, which is probably all I'm going to squeeze out of this. Thanks for your time. – Tientuinë Nov 02 '12 at 05:21
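
The overload-selection rules discussed in this thread can be checked with a small sketch. `Num` is an empty hypothetical stand-in for `T`, and `which`, `named`, and `named_moved` are made-up helper names that only report which overload was selected.

```cpp
#include <string>
#include <utility>

struct Num {};  // hypothetical stand-in for T

// Each overload reports the parameter types it was selected for.
std::string which(const Num&, const Num&) { return "const&, const&"; }
std::string which(Num&&, const Num&)      { return "&&, const&"; }
std::string which(const Num&, Num&&)      { return "const&, &&"; }
std::string which(Num&&, Num&&)           { return "&&, &&"; }

// Inside a function, a named rvalue-reference parameter is an lvalue
// expression, so it selects the const& overload...
std::string named(Num&& x, const Num& y) { return which(x, y); }

// ...until it is cast back to an rvalue with std::move.
std::string named_moved(Num&& x, const Num& y) { return which(std::move(x), y); }
```

With lvalues `a` and `b`, `which(a, b)` can only bind to the `const&, const&` overload, while `which(Num(), Num())` prefers `&&, &&`. This is why `c = a + b;` never reaches an overload that modifies its arguments.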
