13

I was investigating the performance of moving std::string. For the longest time, I've regarded string moves as almost free, thinking the compiler will inline everything and it will only involve a few cheap assignments.

In fact, my mental model for moving is literally

string& operator=(string&& rhs) noexcept
{
    swap(*this, rhs);
    return *this;
}

friend void swap(string& x, string& y) noexcept
{
    // for disposition only
    unsigned char buf[sizeof(string)];
    memcpy(buf, &x, sizeof(string));
    memcpy(&x, &y, sizeof(string));
    memcpy(&y, buf, sizeof(string));
}

To the best of my understanding, this is a legal implementation if the memcpy is changed to assigning individual fields.

It is to my great surprise to find gcc's implementation of moving involves creating a new string and might possibly throw due to the allocations despite being noexcept.

Is this even conforming? Equally important, should I not think moving is almost free?


Bewilderingly, std::vector<char> compiles down to what I'd expect.

clang's implementation is much different, although there is a suspicious std::string::reserve

Passer By
  • 16,942
  • 5
  • 38
  • 81
  • 1
    With SSO (short string optimization) moving could copy as you can't move an array. – NathanOliver May 29 '18 at 14:28
  • @NathanOliver I am aware, but that gets covered by the `memcpy`. At the very least, I'd expect a buffer of size `sizeof(string)` for temporarily copying the contents of the array to and from, not the creation of a temporary string – Passer By May 29 '18 at 14:29
  • @PasserBy NathanOliver's comment is applicable to the *"should I not think moving is almost free?"* portion of your question. Moves can be expensive for some types and class types are allowed to have throwing move constructors and move assignment operators.. – François Andrieux May 29 '18 at 14:53
  • @PasserBy See what happens for a test case where you actually use `foo`. I believe the whole thing will collapse down at that point to a few memory operations, like you expect. I can't explain why the produced assembly appears to so much stuff. – François Andrieux May 29 '18 at 15:01
  • @FrançoisAndrieux `foo` is the best way I can think of to isolate and use the move. The parameters are only references and won't have any constructor/destructor. The function is global with side effects and must be compiled to something. – Passer By May 29 '18 at 15:09
  • @FrançoisAndrieux Some types can have throwing moves, but `std::string` is explicitly `noexcept` as verified by the `static_assert`. Also explains the call to `std::terminate` in clang's output. – Passer By May 29 '18 at 15:12
  • Unsure whether it is related, but after a move construction or move assignment, the rvalue is required to be *left in a valid state with an unspecified value* [string.cons]. In particular, I assume that `data()` should be a non null pointer. – Serge Ballesta May 29 '18 at 15:20
  • @SergeBallesta: You cannot assume the contents of `data`. That's what "unspecified value" means. – Nicol Bolas May 29 '18 at 15:21
  • @NicolBolas: The *valid state* let me think that it cannot be below a default constructed string, that is *data() [is] a non-null pointer that is copyable and can have 0 added to it* – Serge Ballesta May 29 '18 at 15:24
  • 2
    Add ` -stdlib=libc++` to the clang window. – Howard Hinnant May 29 '18 at 15:34
  • @HowardHinnant I realized my mistake just as you commented ;) – Passer By May 29 '18 at 15:37
  • 2
    You still found a bug in libc++! `basic_string::__clear_and_shrink()` should be marked `noexcept`. With that addition the move assignment operator cleans up pretty nicely. – Howard Hinnant May 29 '18 at 15:44
  • @HowardHinnant Does `basic_string::reserve` call that with an argument of `0`? I didn't see a `basic_string::__clear_and_shrink` there – Passer By May 29 '18 at 15:48
  • 1
    `__clear_and_shrink` is evidently a recent edit. I was looking at the tip-of-trunk libc++: https://github.com/llvm-mirror/libcxx/blob/master/include/string#L2120-L2134 – Howard Hinnant May 29 '18 at 15:51

2 Answers2

1

I've only analyzed GCC's version. Here's what's going on: the code handles different kind of allocators. If the allocator has the trait of _S_propagate_on_move_assign or _S_always_equal, then the move is almost free, as you expect. This is the if in move operator=:

if (!__str._M_is_local()
    && (_Alloc_traits::_S_propagate_on_move_assign()
      || _Alloc_traits::_S_always_equal()))
          // cheap move
else assign(__str);

If the condition is true (_M_is_local() means small string, description here), then the move is cheap.

If it is false, then it calls normal assign (not the moving one). This is the case when either:

  • the string is small, so the assign will do a simple memcpy (cheap)
  • or the allocator doesn't have the trait always-equal nor propagate-on-move-assign, so the assign will allocate (not cheap)

What does this mean?

It means, that if you use the default allocator (or any allocator with traits mentioned earlier), then the move is still almost free.

On the other hand, the generated code is unnecessarily huge, and can be improved I think. It should have a separate code for handling usual allocators, or have a better assign code (the problem is that assign doesn't check for _M_is_local(), but it does a capacity check, so the compiler cannot decide whether an allocation is needed or not, so it puts the allocation codepath into the executable unnecessarily - you can check out the exact details in the source code).

geza
  • 26,117
  • 6
  • 47
  • 111
  • `_Alloc_traits::_S_propagate_on_move_assign` and `_Alloc_traits::_S_always_equal` are `constexpr` functions. Should not the branches on these be eliminated at compile time and not be present in the assembly? – Maxim Egorushkin May 30 '18 at 11:45
  • @MaximEgorushkin: they are not present in the assembly. It branches on the locality attribute first (the code I included in my answer), so the `assign` code has to be compiled in. Then, the `assign` code doesn't check the locality attribute, but it checks the capacity (this is understandable from the point of `assign`, because the capacity matters there, and not locality). And the compiler is not smart enough (understandably) to correlate these two things together, so it puts the allocation codepath into the exe. – geza May 30 '18 at 11:49
  • You might want to mention `_M_is_local` is for small string optimizations. – Passer By May 30 '18 at 13:28
0

Not exactly an answer, but this is the new implementation of C++11 std::string without the reference counter and with small string optimisation what causes voluminous assembly. Particularly, the small string optimisation causes 4 branches to handle 4 different combinations of lengths of the source and the target of the move assignment.

When -D_GLIBCXX_USE_CXX11_ABI=0 option is added to use the pre C++-11 std::string with a reference counter and no small string optimisation the assembly code looks much better.


should I not think moving is almost free?

In Nothing is Better than Copy or Move by Roger Orr talk, slides page 47 it says:

Comparison of copy and move

  • Many people incorrectly think of move as effectively 'free'
  • The performance difference between copy and move varies widely
  • For a primitive type, such as int, copying or moving are effectively identical
  • Move is faster than copy when only part of the object needs to be transferred to transfer the whole value
Saeed Masoomi
  • 1,267
  • 13
  • 26
Maxim Egorushkin
  • 119,842
  • 14
  • 147
  • 239