0

I'm still learning how to use the standard library containers std::map and std::unordered_map. I have recently tried to provide a hash function or comparison function.

There are two things I don't understand:

std::map<int, int, equal_to<int>> myMap1; // Works
std::map<int, int, hash<int>> myMap2; // Works

std::unordered_map<int, int, equal_to<int>> myMap3; // Doesn't work
std::unordered_map<int, int, hash<int>> myMap4; // Works

Why can I provide a hash function for std::map while I can't provide a comparison function to a std::unordered_map?

Also, if I use a custom struct for the comparison function std::equal_to without overloading the == operator I get no errors. Why is that so?

Benjamin Bannier
  • 45,636
  • 10
  • 55
  • 78
Benoît Dubreuil
  • 564
  • 7
  • 22

2 Answers2

2

There is a thing in Standard C++ Library called "concept". Concept is a set of requirements. By using std::unordered_map (or anything that follows a concept, for that matter) you agree that you will adhere to the requirements it states.

By using

std::map<int, int, equal_to<int>> myMap1;
std::map<int, int, hash<int>> myMap2;
std::unordered_map<int, int, equal_to<int>> myMap3;

you violate requirements both for std::map and std::unordered_map - compiler isn't required to check for your mistakes, and myMap1, myMap2 and myMap3 definitely won't work properly.

It just happens that Hash requires that it returns value of type std::size_t. This can be easily checked.

This is different to Compare that state requirements that cannot easily be checked (for example, std::size_t is convertible to bool) - therefore, its usage will most likely fail during runtime.

Also, it seems that you are confused about purpose of hash and comparison functions - they are different.

Hash returns an value of type std::size_t that for any values a, b of type T

Hash(a) != Hash(b) implies a != b
and
a == b implies Hash(a) == Hash(b)

That's it - nothing more, nothing less. It should be faster than a comparison function, so you can easily compare values - if hashes are different, then the values are definitely different, and if hashes are the same, do the full comparison.

milleniumbug
  • 14,714
  • 3
  • 43
  • 70
  • And by the way, if I have a huge struct with like 40 ints in it, how can I combine them into a size_t? Won't there be duplicates? – Benoît Dubreuil May 28 '14 at 19:51
  • Oh never mind. I think I get it now. The hash table in a certain way regroup the keys with the same size_t into buckets (or something like that?). – Benoît Dubreuil May 28 '14 at 19:52
  • 1
    @user2924010: Usually, you hash each int, and use a `hash_combine` to combine them into a single hashvalue. It's ok if two structures have the same hash, but the less it happens the faster the container works. – Mooing Duck May 28 '14 at 19:52
  • Hmmm I'm quite lost here. On [link](www.cplusplus.com), this is the description of size_t: It is a type able to represent the size of any object in bytes: size_t is the type returned by the sizeof operator and is widely used in the standard library to represent sizes and counts. – Benoît Dubreuil May 28 '14 at 19:57
  • So, if I use a integer key, the size_t will always be the same? Or the std::hash<> generate a "random" size_t? – Benoît Dubreuil May 28 '14 at 19:58
  • 1
    @user2924010 You can't simply generate "random" numbers (also, define what do you mean by "random"), because if you simply generate "random" numbers, `Hash(5) == Hash(5)` won't be true anymore (as required by `a == b implies Hash(a) == Hash(b)`) – milleniumbug May 28 '14 at 20:08
  • I meant an unknown algorithm generating a size_t based on a pattern. – Benoît Dubreuil May 28 '14 at 21:22
1

Because the two structures are very different:

  • A hash function makes no sense for a map. A map expects an optional comparison function (it is an ordered structure), your first two examples are not passing a comparison function. It default to less.

To be specific, it is

Binary predicate that, taking two element keys as argument, returns true if the first argument goes before the second argument in the strict weak ordering it defines, and false otherwise. This shall be a function pointer or a function object. Member type key_compare is the internal comparison object type used by the container, defined in map as an alias of its third template parameter (Compare). If key_compare uses the default less (which has no state), this parameter is not relevant.

  • An unordered_map is basically a hashtable (it has no ordered state) and expect a hash function, you are not passing it a hash function.

Comparison function object, that returns true if the two container object keys passed as arguments are to be considered equal. Member type key_equal is defined in unordered_map as an alias of its fourth template parameter (Pred).

Quotes from http://www.cplusplus.com

quantdev
  • 22,595
  • 5
  • 47
  • 84
  • How would I define a hash function then? I have no single idea how. – Benoît Dubreuil May 28 '14 at 19:38
  • For a custom hash function, it has been discussed extensively on SO, for example look here: http://stackoverflow.com/questions/15869066/inserting-into-unordered-set-with-custom-hash-function – quantdev May 28 '14 at 19:40
  • The problem is that it is really complex for me. I have found many posts on stackoverflow about hash functions, but never a basic tutorial for beginner like me. For example, can two different variables return the same size_t for a hash function? – Benoît Dubreuil May 28 '14 at 19:43