5

Consider the following code:

#include <utility>
#include <string>

int bar() {
    std::pair<int, std::string> p { 
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

(simplified thanks to @Jarod42 :-) ...)

I expect the function to be implemented as simply:

bar():   
        mov eax, 123
        ret

but instead, the implementation calls operator new(), constructs an std::string with my literal, then calls operator delete(). At least - that's what gcc 9 and clang 9 do (GodBolt). Here's the clang output:

bar():                                # @bar()
        push    rbx
        sub     rsp, 48
        mov     dword ptr [rsp + 8], 123
        lea     rax, [rsp + 32]
        mov     qword ptr [rsp + 16], rax
        mov     edi, 51
        call    operator new(unsigned long)
        mov     qword ptr [rsp + 16], rax
        mov     qword ptr [rsp + 32], 50
        movups  xmm0, xmmword ptr [rip + .L.str]
        movups  xmmword ptr [rax], xmm0
        movups  xmm0, xmmword ptr [rip + .L.str+16]
        movups  xmmword ptr [rax + 16], xmm0
        movups  xmm0, xmmword ptr [rip + .L.str+32]
        movups  xmmword ptr [rax + 32], xmm0
        mov     word ptr [rax + 48], 8549
        mov     qword ptr [rsp + 24], 50
        mov     byte ptr [rax + 50], 0
        mov     ebx, dword ptr [rsp + 8]
        mov     rdi, rax
        call    operator delete(void*)
        mov     eax, ebx
        add     rsp, 48
        pop     rbx
        ret
.L.str:
        .asciz  "Hey... no small-string optimization for me please!"

My question is: Clearly, the compiler has full knowledge of everything going on inside bar(). Why is it not "eliding"/optimizing the string away? More specifically:

  1. At the basic level there's the code between then new() and delete(), which AFAICT the compiler knows results in nothing useful.
  2. Secondarily, the new() and delete() calls themselves. After all, small-string-optimization is allowed by the standard AFAIK, so even though clang/gcc hasn't chosen to use that - it could have; meaning that it's not actually required to call new() or delete() there.

I'm particularly interested in what part of this is directly due to the language standard, and what part is compiler non-optimality.

einpoklum
  • 86,754
  • 39
  • 223
  • 453
  • 1
    Isn't `new`/`delete` observable behavior? (and `new` can `throw`) – Jarod42 Mar 23 '20 at 15:06
  • @Jarod42: 1. I'm not sure it is, in C++20, if you can have constexpr vectors. 2. Even if it is - if you're allowed, in theory, to use a small-string-optimization with upto 1,000,000 characters, that means you're allowed different observable behaviors for `foo()` and therefore for `bar()`. – einpoklum Mar 23 '20 at 15:13
  • 1
    According to https://en.cppreference.com/w/cpp/language/new#Allocation, allocation can be elided only since C++14. – Jarod42 Mar 23 '20 at 15:21
  • [inconsistent-behavior-of-compiler-optimization-of-unused-string](https://stackoverflow.com/a/56425395/2684539) show issue with compilers for `delete new int;` – Jarod42 Mar 23 '20 at 15:24
  • I would say that SSO should be consistent, else it is mainly an ODR violation. – Jarod42 Mar 23 '20 at 16:02
  • @Jarod42: "I would say" - what does the standard say about that, actually? If there isn't a strict consistence requirement, it won't be an ODR violation, because SSO could depend on how many times you've ever instantiated a string, or the phase of the moon, or whatever. – einpoklum Mar 23 '20 at 16:05
  • @Jarod42 "_allocation can be elided only since C++14_" Only if std functions are replaced. Which the compiler doesn't know during separate compilation, of course. For whole program optimization, they could always be elided. – curiousguy Mar 23 '20 at 17:26
  • @curiousguy: Where do you see that restriction? *"`delete[] new int[10];` can be optimized out"*. but *"allocation"* is indeed too "general", as only *new-expressions* can be elided. – Jarod42 Mar 23 '20 at 18:04
  • @Jarod42 Maybe my phrasing was problematic: I only replied to "only since C++14" = "no optimization allowed by the std. But it's only relevant when there is *potentially* user code, that is a replaced (de)allocation function. Which the compiler can know w/ whole program optimization. – curiousguy Mar 23 '20 at 18:45

3 Answers3

3

Nothing in your code represents "elision" as that term is commonly used in a C++ context. The compiler is not permitted to remove anything from that code on the grounds of "elision".

The only grounds a compiler has to remove the creation of that string is on the basis of the "as if" rule. That is, is the behavior of the string creation/destruction visible to the user and therefore not able to be removed?

Since it uses std::allocator and the standard character traits, the basic_string construction and destruction itself is not being overridden by the user. So there is some basis for the idea that the string's creation is not a visible side-effect of the function call and thus could be removed under the "as if" rule.

However, because std::allocator::allocate is specified to call ::operator new, and operator new is globally replaceable, it is reasonable to argue that this is a visible side effect of the construction of such a string. And therefore, the compiler cannot remove it under the "as if" rule.

If the compiler knows that you have not replaced operator new, then it can in theory optimize the string away.

That doesn't mean that any particular compiler will do so.

Nicol Bolas
  • 378,677
  • 53
  • 635
  • 829
  • So a `new` expression can be elided (ie, skipped, with no call to operator `::new`) or combined starting in C++14; but (a) is `std::allocator` specified in a way compatible with that elision, (I hope so!) (b) does the standard actually require that `std::string` use that allocator to get memory at all? (with SBO, that wording would be tricky...) – Yakk - Adam Nevraumont Mar 23 '20 at 15:34
  • 1. I was using the term "elision" more broadly". Do you suggest I remove the term from the question? 2. So, the bottom line of your answer is that a compiler can legitimately compile `bar()` to my desired two-liner? – einpoklum Mar 23 '20 at 15:34
  • @einpoklum: More details added. – Nicol Bolas Mar 23 '20 at 15:37
  • " And therefore, the compiler cannot remove it under the "as if" rule." – einpoklum Mar 23 '20 at 15:41
  • @einpoklum: But it is not required to *not* call it as well. – Nicol Bolas Mar 23 '20 at 15:43
  • 1
    But what I'm saying is that " And therefore, the compiler cannot remove it under the "as if" rule." is not true, because the compiler is not required to do any allocation in the first place, regardless of whether `operator new()` has been replaced. – einpoklum Mar 23 '20 at 15:48
  • @Yakk-AdamNevraumont: from cppreference *"`delete[] new int[10];` can be optimized out, but `operator delete(operator new(10));` cannot."* and `std::allocator::allocate` has to call `::operator new`... so `std::allocator` is incompatible with that elision. – Jarod42 Mar 23 '20 at 16:12
  • 1
    @Jarod42 Because small str optimization is allowed, it's obvious that doing allocations isn't part of the specification of `std::string`. – curiousguy Mar 23 '20 at 19:05
  • Yes, looks like [\[expr.new/12\]](//eel.is/c++draft/expr.new#12) "*An implementation is allowed to omit a call to a replaceable global allocation function ([new.delete.single], [new.delete.array]). When it does so, the storage is instead provided by the implementation or provided by extending the allocation of another new-expression.*" is a bit too tightly scoped to be applicable here. A shame really. – Deduplicator Mar 23 '20 at 20:15
  • @Deduplicator: Doesn't the "as if" rule also apply to "providing storage"? – einpoklum Mar 23 '20 at 20:41
  • "Doesn't the "as if" rule also apply to "providing storage"?" Yes, that is not observable. However, it is not mandated that replaced "operator new" only provide storage and not do anything observable. So you need language which allows it not to be called. – Jeff Garrett Mar 23 '20 at 21:37
  • @JeffGarrett: But I wasn't talking about a replaced operator new, but rather on an implementation providing storage without calling any operator new. – einpoklum Mar 23 '20 at 22:05
  • 1
    @Jarod42 I read that too, but I wouldn't trust cppreference for that level of detail. I'd want to check std allocator's specification. And that was just argument 1; argument 2 (that std string isn't saying it allocates) is strong enough to carry it. – Yakk - Adam Nevraumont Mar 23 '20 at 23:16
  • 2
    You shouldn't try to "read between the lines" of cppreference - we aren't written for language-lawyer-level fine parsing. But if we explicitly say that the compiler isn't allowed to optimize something out, there's a very good chance that it isn't. And if we explicitly say that [the implementation is allowed to optimize something out](https://en.cppreference.com/w/cpp/memory/allocator/allocate#Notes), there's a very good chance that it is. @Yakk-AdamNevraumont – T.C. Mar 24 '20 at 14:59
  • @T.C. Ah nice, I was figuring there was something like that in the standard. Because it would be dumb if not (not because I found it). – Yakk - Adam Nevraumont Mar 24 '20 at 18:59
2

Following the discussion in various answers and comments here, I have now filed the following bugs against GCC and LLVM regarding this issue:

  1. GCC bug 94293: [missed optimization] new+delete of unused local string not removed

    Minimal testcase (GodBolt):

    void foo() {
        int *p = new int[1];
        *p = 42;
        delete[] p;
    }
    
  2. GCC bug 94294: [missed optimization] Useless statements populating local string not removed

    Minimal testcase (GodBolt):

    void foo() {
        std::string s { "This is not a small string" };
    }
    
  3. LLVM bug 45287: [missed optimization] failure to drop unused libstdc++ std::string.

    Minimal testcase (GodBolt):

    void foo() {
        std::string s { "This is not a small string" };
    }
    

Thanks goes to: @JeffGarret, @NicolBolas, @Jarod42, Marc Glisse .

einpoklum
  • 86,754
  • 39
  • 223
  • 453
1

The question is can the program

int bar() {
    std::pair<int, std::string> p { 
        123, "Hey... no small-string optimization for me please!" };
    return p.first;
}

be validly be optimized to

int bar() {
    return 123;
}

tldr, yes, I think.

And clang does with libc++: godbolt

About std::string, the standard says string.require/3

Every object of type basic_­string uses an object of type Allocator to allocate and free storage for the contained charT objects as needed.

"as needed". std::string is allowed to decide when to use the allocator (which is I believe the justification for SSO being valid). Its member functions do not mandate an allocation. Therefore the allocation may be elided.

Jeff Garrett
  • 3,517
  • 1
  • 10
  • 10
  • First, I wasn't aware GodBolt's clang++ defaulted to libstdc++, so thanks for that. But, why is it that it's willing to optimize the string away with libc++, but not with libstdc++? I mean, the implementations can't be _that_ different, can they? – einpoklum Mar 23 '20 at 20:39
  • 1
    When researching this, I found implications in the bug trackers that (1) gcc is unable to optimize away allocations generally and (2) clang is thwarted by the simplest of try/catch. libstdc++ does have a little try/catch in between the new and delete, as far as I can tell. – Jeff Garrett Mar 23 '20 at 21:27
  • I say implications, because I was unable to find multiple sources or corroborate them. For example, in this gcc bug (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=71258), it says "GCC cannot eliminate calls to 'new' currently." ... But it's old (2016). But it's also still open. – Jeff Garrett Mar 23 '20 at 21:30
  • This bug for LLVM shows even the most trivial of try/catch's, handled in the same function, with complete knowledge available to clang, cannot be optimized away (and one could reason that this would interfere with collapsing logic before and after such a construct): https://bugs.llvm.org/show_bug.cgi?id=35052 – Jeff Garrett Mar 23 '20 at 21:33
  • Apparently, it's not the exceptions that are the cause of the problem. See the links in my "answer". – einpoklum Mar 23 '20 at 22:56