76

Can someone explain it in a language that mere mortals understand?

curiousguy
Yakov Galka
  • 4
    http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2008/n2643.html – DumbCoder Jun 20 '11 at 12:45
  • 1
    @DumbCoder: thank you, this is definitely better than N2390 itself, unfortunately it redirects to a lot of other papers that are "necessary to understanding this proposal"... Seems like my question is overly broad :) – Yakov Galka Jun 20 '11 at 12:52
  • 1
    In normal language, it is an optional optimization hint (which is currently either unimplemented or ignored by every compiler) that may in theory allow a compiler to generate slightly better multithreaded code when rarely modified, frequently read data is shared. Good job that the wording is so contorted that nobody will ever use it anyway :-) – Damon Jun 20 '11 at 13:04
  • I also point you to this question in which a nice book by Anthony Williams is mentioned: http://stackoverflow.com/questions/4938258/where-can-i-find-good-solid-documentation-for-the-c0x-synchronization-primitiv – Omnifarious Jun 20 '11 at 13:32
  • 1
    @Damon It isn't that the wording is contorted, it's that the semantic is utterly silly: `d?a:b` breaks dependency, but `d->static_fun()` does not... that makes no sense. And it doesn't allow "slightly better multithreaded code", avoiding a fence for a frequent operation is significantly better on some processors. "_when rarely modified, frequently read data is shared_" Consume is applicable to frequently modified data too, as long as there is a pointer to it, and the record is read only once published, which is the norm anyway. – curiousguy Dec 10 '19 at 02:59

2 Answers

57

[[carries_dependency]] is used to allow dependencies to be carried across function calls. This potentially allows the compiler to generate better code when used with std::memory_order_consume for transferring values between threads on platforms with weakly-ordered architectures such as IBM's POWER architecture.

In particular, if a value read with memory_order_consume is passed in to a function, then without [[carries_dependency]] the compiler may have to issue a memory fence instruction to guarantee that the appropriate memory ordering semantics are upheld. If the parameter is annotated with [[carries_dependency]] then the compiler can assume that the function body will correctly carry the dependency, and this fence may no longer be necessary.

Similarly, if a function returns a value loaded with memory_order_consume, or derived from such a value, then without [[carries_dependency]] the compiler may be required to insert a fence instruction to guarantee that the appropriate memory ordering semantics are upheld. With the [[carries_dependency]] annotation, this fence may no longer be necessary, as the caller is now responsible for maintaining the dependency tree.

e.g.

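#include <atomic>
#include <iostream>

// Plain parameter: the caller cannot assume the callee keeps the dependency.
void print(int* val)
{
    std::cout << *val << std::endl;
}

// Annotated parameter: the dependency is carried into the function body.
// The attribute appertains to the parameter, so it goes before the type
// (or after the parameter name), not between the type and the name.
void print2([[carries_dependency]] int* val)
{
    std::cout << *val << std::endl;
}

std::atomic<int*> p;

void reader()
{
    int* local = p.load(std::memory_order_consume);
    if (local)
        std::cout << *local << std::endl; // 1

    if (local)
        print(local); // 2

    if (local)
        print2(local); // 3
}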

In line (1), the dependency is explicit, so the compiler knows that local is dereferenced, and that it must ensure that the dependency chain is preserved in order to avoid a fence on POWER.

In line (2), the definition of print is opaque (assuming it isn't inlined), so the compiler must issue a fence in order to ensure that reading *val in print returns the correct value.

In line (3), although print2 is also opaque, the compiler can assume that the dependency from the parameter to the dereferenced value is preserved in the instruction stream, so no fence is necessary on POWER. Obviously, the definition of print2 must actually preserve this dependency, so the attribute will also impact the generated code for print2.
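The return-value case and an explicit break in the chain can be sketched in the same way. This is only an illustration under assumptions: Node, head, load_head, read_value, read_hint and demo are hypothetical names, not part of any library. As noted in the comments, current compilers generally treat memory_order_consume as acquire, and some warn that the attribute has no effect, so expect no measurable difference in practice.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical record published through an atomic pointer.
struct Node {
    int value;
    int size_hint;
};

std::atomic<Node*> head{nullptr};

// Attribute on the function: the returned pointer still carries the
// dependency from the consume load, so the caller must keep the chain.
[[carries_dependency]] Node* load_head()
{
    return head.load(std::memory_order_consume);
}

// Attribute on the parameter: the dependency is carried into the body.
int read_value([[carries_dependency]] Node* n)
{
    return n->value;
}

// Unannotated helper that deliberately ends the chain: the result of
// std::kill_dependency no longer carries a dependency.
int read_hint(Node* n)
{
    return std::kill_dependency(n->size_hint);
}

// Single-threaded driver, just to show the calls fit together.
int demo()
{
    static Node node{42, 7};
    head.store(&node, std::memory_order_release);

    Node* n = load_head();
    assert(n != nullptr);
    return read_value(n) + read_hint(n);
}
```

Calling demo() from main exercises all three functions. The design point is that the annotations describe a contract between caller and callee; they do not change what the program computes.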

Anthony Williams
  • 18
    This is a great answer. But... how would you go about coding the function to preserve the dependency? What would an improperly coded function look like and what would the consequences be? – Omnifarious Jun 20 '11 at 13:20
  • 2
    BTW, I got a pre-release copy of your book as a PDF. It is a fantastic book. I really wish you had carried on your 'person in a cubicle receiving phone calls' metaphor all the way through though. That was a great tool for understanding what was going on. – Omnifarious Jun 20 '11 at 13:31
  • 3
    From the POV of the source, all you need to do is use the `[[carries_dependency]]` attribute, and not call `std::kill_dependency` unless you mean it. The compiler will then ensure that it doesn't break the dependency chain in the generated code. – Anthony Williams Jun 20 '11 at 16:31
  • 9
    @AnthonyWilliams: I'm with Omnifarious here: it sounds like you just have to plaster all the function declarations with `[[carries_dependency]]` and the compiler will magically generate faster code. I'd be interested in an example function where you *cannot* use `[[carries_dependency]]` or where you'd have to use `std::kill_dependency`. – Marc Mutz - mmutz Jul 02 '12 at 17:40
  • 2
    @MarcMutz-mmutz "_the compiler will magically generate faster code_" Wrong. The compiler will generate equal, or less optimized (slower) code. – curiousguy Jul 29 '12 at 06:30
  • 2
    @MarcMutz-mmutz : Indeed, you're forgetting about the cases where `print2` is called by a single-threaded caller (actually, technically, with any argument that wasn't obtained via a consume chain); maintaining the dependency chain is now overhead. – ildjarn Aug 14 '15 at 01:51
  • 1
    In practice, can you give an example of what a dependency failure might look like? The value of `int* local` would be undefined at the point of use? Second, I presume `[[carries_dependency]]` does not change the correctness of the generated code (this is in general true of attributes by design, I believe)? – Yakk - Adam Nevraumont Sep 14 '15 at 17:37
  • 1
    Should those `*p` in `print` and `print2` actually be `*val`? – Tom Tanner Jul 13 '17 at 13:10
  • 1
    I recall there being the 'restricted' keyword in C. Does that somehow relate to this issue? – NTAuthority Sep 03 '18 at 13:32
  • @NTAuthority Modern pointer semantic (see all my Q re: pointers) (note that modern ptr semantics implies that no impl w/ flat addressing is conforming), restrict, and consume: they all depend on the way a value is obtained. The meaning of a value is not the mathematical number represented. If you have restrict ptr, you must use one and not the other, even when they have the same representation and "value". Same with values from consume operations, same with pointers that compare equal but don't point to the same object. – curiousguy Dec 10 '19 at 02:53
  • 1
    Note that `memory_order_consume` is temporarily deprecated, and current compilers treat it as `acquire` (losing the efficiency benefit on weakly-ordered ISAs). [\[\[carries\_dependency\]\] what it means and how to implement](https://stackoverflow.com/q/64113244) is my attempt to explain `[[carries_dependency]]` and the underlying hardware feature (dependency ordering) that `mo_consume` is based on. This is critical for understanding the point of the C++11 design with carries_dependency and kill_dependency. – Peter Cordes Oct 13 '20 at 08:23
  • ARM is another good example use-case for dependency ordering: very widespread ISA, and before ARMv8 didn't even have cheap acquire. But yes, POWER works. – Peter Cordes Oct 13 '20 at 08:26
  • Actually [Memory order consume usage in C11](https://stackoverflow.com/q/55741148) is a better explanation of the hardware feature; that other answer I linked was mostly just links and debunking a wrong idea. Not super useful. – Peter Cordes Oct 13 '20 at 08:39
-2

In short, I think that when the [[carries_dependency]] attribute is present, the generated code for the function should be optimized for the case where the actual argument really does come from another thread and carries a dependency. The same applies to a return value. Performance may suffer if that assumption does not hold (for example, in a single-threaded program), but the absence of [[carries_dependency]] may likewise hurt performance in the opposite case. Performance should be the only thing affected.

For example, the result of dereferencing a pointer depends on how the pointer was obtained. If the value of the pointer p comes from another thread (via a "consume" operation), the values that the other thread previously assigned to *p are taken into account and are visible. There may be another pointer q that is equal to p (q==p), but since its value does not come from that other thread, the value seen through *q may differ from the one seen through *p. In fact, *q may provoke a sort of "undefined behavior" (because it accesses the memory location without coordinating with the thread that made the assignment).
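A minimal sketch of that publish/consume pattern, assuming a hypothetical Payload type and the made-up names published and consume_once: one thread fills in the data and publishes a pointer with a release store, the other obtains the pointer with a consume load and reads through it.

```cpp
#include <atomic>
#include <thread>

// Hypothetical payload shared between the two threads.
struct Payload {
    int data;
};

Payload payload{0};
std::atomic<Payload*> published{nullptr};

int consume_once()
{
    published.store(nullptr, std::memory_order_relaxed);

    std::thread producer([] {
        payload.data = 42;                                     // plain write
        published.store(&payload, std::memory_order_release);  // publish
    });

    // Spin until the pointer has been published.
    Payload* p = nullptr;
    while ((p = published.load(std::memory_order_consume)) == nullptr) {
        std::this_thread::yield();
    }

    // The read goes through p, which carries a dependency from the
    // consume load, so the producer's write to data is visible here.
    int seen = p->data;

    producer.join();
    return seen;
}
```

Reading payload.data directly in the consumer, rather than through p, would correspond to the q case above: nothing in the language rules orders that access after the producer's write, even though the address is the same.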

Really, it seems there is some big bug in the functionality of the memory (and the mind) in certain engineering cases.... >:-)