5

Is there a library that provides STL-style functions such as std::sort(), std::binary_search(), std::lower_bound(), and std::upper_bound() but accepts 3-way comparison predicates (returning -1 for less, 0 for equal, 1 for greater) instead of a less predicate (true for less, false for equal or greater)?

Of course, a less predicate can easily be built from an existing 3-way predicate (like [](A a, B b) { return compare3(a,b)<0; }), but this results in extra calls to the predicate.
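
To make the adapter concrete, here is a minimal sketch. The names `compare3`, `g_compare3_calls`, `less_via_compare3`, and `contains` are hypothetical and exist only for this illustration; the call counter is there so the number of predicate calls can be measured rather than guessed.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical call counter, only here to make predicate calls visible.
static std::size_t g_compare3_calls = 0;

// Hypothetical 3-way comparator: negative if a < b, 0 if equal, positive if a > b.
int compare3(const std::string& a, const std::string& b) {
    ++g_compare3_calls;
    return a.compare(b);
}

// Adapter: reduces the 3-way result to the "less" predicate the STL expects.
inline bool less_via_compare3(const std::string& a, const std::string& b) {
    return compare3(a, b) < 0;
}

// Looks up `key` in an ascending sorted vector using the standard
// algorithm together with the adapter.
bool contains(const std::vector<std::string>& sorted, const std::string& key) {
    return std::binary_search(sorted.begin(), sorted.end(), key, less_via_compare3);
}
```

Resetting `g_compare3_calls` before a lookup and reading it afterwards shows how many times the underlying comparison actually ran for a given input.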

littleadv
user222202
    What's the point? You have to test twice the return value anyway. These predicates shouldn't be heavy, they're supposed to be simple and inline, so that repetitive calls won't be penalized. – littleadv May 22 '11 at 06:40
  • 1
    The point is: the result of the predicate is an `int`, which is cheap to test (moreover, checking an int for <0, ==0 and >0 can be optimized by the compiler into one check). The objects being compared can be arbitrarily heavy (strings, complex objects). – user222202 May 22 '11 at 06:50
  • @user222202 - and what algorithm will make use of this triple check? – littleadv May 22 '11 at 07:26
  • @littleadv - `binary_search()` and `*_bound()` can exit their loops immediately when compare3 returns 0. Also, the half-ranges for the next step will be 1 element smaller. It is a small advantage, but it comes for free. – user222202 May 22 '11 at 07:49
  • 1
    @user222202: `*_bound` cannot, because if it did it would return wrong results. Regarding `binary_search`, see Knuth's "The Art of Computer Programming" for a detailed analysis of why this 'optimization' has dubious value. – Yakov Galka May 22 '11 at 08:27
  • @ybungalobill: "`*_bound` cannot because if it did it would return wrong results": in general you are right, it would be wrong if the array has duplicates. But for a duplicate-free array (which is my use case, sorry that I was thinking only of it) it is correct if `lower_bound` returns a pointer to the exact match and `upper_bound` returns (pointer to the exact match)+1. And I would prefer to look at the numbers in the profiler :) – user222202 May 22 '11 at 08:33
  • 1
    @user222202 - you're aware of the fact that STL is a general-purpose library, with generic algorithms that anyone can use and expect reasonable (pre-defined) performance? You can always implement algorithms optimized to your requirements and have them perform better, but you cannot expect it from the generic library, because what's optimization for you is ruining results for someone else you wouldn't care about. Generic libraries cannot do that. – littleadv May 22 '11 at 18:38
  • @littleadv - please reread my question. I am not looking for those functions inside the STL, I am looking for another library which has those STL-like functions. – user222202 May 23 '11 at 09:26
  • If 3-way comparison were so useless and uninteresting, it would not be included in C++20 and be standard in other languages like Java. Among other things, it is useful for chaining comparators. – Nicolas Bousquet Nov 17 '18 at 08:52
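
The early-exit search discussed in the comments above can be sketched as follows. This is only an illustration under the assumption stated in that discussion (an ascending sorted range with no duplicates); `compare3` and `find_index3` are hypothetical names, not functions from any existing library.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical 3-way comparator for ints: -1, 0, or 1.
inline int compare3(int a, int b) { return (a > b) - (a < b); }

// Returns the index of `key` in `sorted`, or -1 if it is absent.
// Assumes `sorted` is in ascending order and contains no duplicates.
template <class T, class Compare3>
std::ptrdiff_t find_index3(const std::vector<T>& sorted, const T& key, Compare3 cmp) {
    std::size_t lo = 0;
    std::size_t hi = sorted.size();              // search the half-open range [lo, hi)
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        int c = cmp(sorted[mid], key);           // one predicate call per step
        if (c == 0) {
            return static_cast<std::ptrdiff_t>(mid);  // exact match: stop early
        }
        if (c < 0) {
            lo = mid + 1;                        // element < key: continue right of mid
        } else {
            hi = mid;                            // element > key: continue left of mid
        }
    }
    return -1;                                   // not found
}
```

With duplicates present this would return an arbitrary matching position, which is why it cannot stand in for a general `lower_bound` or `upper_bound`. Whether the early exit pays off in practice is something only a profiler can answer.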

1 Answer

4

If you look at the implementations of these algorithms, you'll see that lower_bound and upper_bound don't do 3-way branches at all, and binary_search does one only in the last iteration, to check equality. I don't know about sort(), but I'm almost sure it doesn't do 3-way branches either. So your 'optimization' won't give you any boost. On the contrary, your comparisons will be slower.
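
To illustrate the point about lower_bound, here is a simplified sketch of the usual halving loop. It is not copied from any particular standard library, and `lower_bound_index` is a hypothetical name. Each iteration makes exactly one call to the less predicate and branches two ways on it.

```cpp
#include <cstddef>
#include <vector>

// Simplified lower bound over an ascending sorted vector (illustration only).
// Returns the index of the first element that is not less than `key`,
// or sorted.size() if every element is less than `key`.
template <class T, class Less>
std::size_t lower_bound_index(const std::vector<T>& sorted, const T& key, Less less) {
    std::size_t first = 0;
    std::size_t count = sorted.size();           // number of candidates left
    while (count > 0) {
        std::size_t step = count / 2;
        std::size_t mid = first + step;
        if (less(sorted[mid], key)) {            // the only comparison in this iteration
            first = mid + 1;                     // answer lies strictly right of mid
            count -= step + 1;
        } else {
            count = step;                        // answer is mid or left of it
        }
    }
    return first;
}
```

Because the loop never tests for equality, it cannot stop early on a match, but it also never needs a second comparison with swapped arguments.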

Yakov Galka
  • That is why I am looking for an implementation adapted to (and specially optimized for) a 3-way predicate. – user222202 May 22 '11 at 06:57
  • @user222202 let's repeat it in other words: most of the algorithms based on comparisons need either only equality `==` or less-than ` – Yakov Galka May 22 '11 at 07:01
  • @user222202: In fact, for some algorithms (lower_bound) the C++ standard requires only (*iter < value) to be defined; (value < *iter) is not needed. – Yakov Galka May 22 '11 at 07:03
  • @ybungalobill, `binary_search` must make a second call with swapped arguments whenever the comparison returns false (typically 50% of the cases). So must binary tree searches for `map` and `set`. That is because elements that should be *inserted* or *removed* must be *identified*. For `sort()` on the other hand I don't think it matters, because all you need to do is correct where `a[i] < a[j]` but `i > j`. – Jo So May 03 '17 at 20:48
  • @JoSo: this is not true at all. The standard also requires at most `log2(N) + O(1)` comparisons. See [reference implementation here](http://en.cppreference.com/w/cpp/algorithm/lower_bound). Also I remember Knuth described it well in TAOCP 3rd volume. – Yakov Galka May 03 '17 at 20:53
  • @ybungalobill, I stand corrected. That's very sophisticated. It loses the ability to return early, but I think that means on average it makes only one more comparison call. And the code is smaller. – Jo So May 03 '17 at 21:37
  • @ybungalobill, I don't see a translation to binary search trees though. Does my original claim still apply here? – Jo So May 03 '17 at 21:38
  • @JoSo: because `map` and `set` are usually implemented with the same code as `multimap` and `multiset`, and the latter always have to find the lower/upper bounds by spec, I think that the situation there is the same -- i.e. they still do only one comparison per node without the ability to terminate the search early. – Yakov Galka May 03 '17 at 21:46
  • Ok thank you, I know now that it works. The C++ implementation stores pointers to the min and max nodes for O(1) access. The search can be implemented analogous to the lower_bound that you linked. – Jo So May 04 '17 at 17:24
  • You might want to look at my answer http://stackoverflow.com/a/43773235 which contains further interesting information. – Jo So May 04 '17 at 17:24