Objective: Create a hash map that takes two integer keys (pointers converted to integers by casting to unsigned int, and yes, this works) and maps them to a single value.

Attempted solution: I already have a hash map that successfully maps a single key to a value. I have now extended it to take two keys using a pairing function: I pair the two keys with the Cantor pairing function and then hash the combined key.

Bottleneck: The Cantor pairing function performs a multiplication, which causes integer overflow, so it does not give me unique outputs as it is supposed to mathematically.

Question:

  1. I see that a lot of hash functions do multiplications. Is integer overflow normal in hashing, or is it bad?
  2. I'm also thinking of appending one key to the other to form a new 64-bit integer, like aaaaaaaabbbbbbbb, and then passing that to the hash map. But I fear this might produce abnormal values like NaN because of the floating-point representation, which could be bad.
  3. Any better ideas are welcome.

Please let me know if some parts are unclear.

ulidtko
Kshitij Banerjee
  • *Mathematically*, these kinds of things assume that your integers have unlimited precision, which is why it doesn't work so easily right away. – ulidtko Dec 27 '12 at 09:26

2 Answers

You might want to have a look at boost::hash_combine.

gvd
  • I can't use a third-party library; I need to create one of my own. If you can shed some light on the internals, that'll be great! – Kshitij Banerjee Nov 05 '12 at 07:09
  • Have a look at the Boost code. The combine method is very simple, and you can just use it directly: http://www.boost.org/doc/libs/1_51_0/doc/html/hash/reference.html#boost.hash_combine – gvd Nov 05 '12 at 07:11
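
For reference, the combine step mentioned in these comments can be sketched without Boost. This is only a minimal sketch, assuming C++14: the mixing line follows the formula given in the linked boost::hash_combine reference, while `hash_pair` is a hypothetical helper name, and each 32-bit key is used directly as its own hash value where Boost would first call boost::hash on it.

```cpp
#include <cstddef>
#include <cstdint>

// Mix an already computed element hash `h` into `seed`,
// in the style of boost::hash_combine.
constexpr void hash_combine(std::size_t& seed, std::size_t h) {
    seed ^= h + 0x9e3779b9 + (seed << 6) + (seed >> 2);
}

// Hypothetical helper: one hash value for a pair of 32-bit keys.
// Assumption: each key serves as its own element hash.
constexpr std::size_t hash_pair(std::uint32_t a, std::uint32_t b) {
    std::size_t seed = 0;
    hash_combine(seed, a);
    hash_combine(seed, b);
    return seed;
}
```

The result is a hash, not a unique encoding of the pair, so the map still has to compare both keys on lookup. Because the seed is shifted between steps, `hash_pair(1, 2)` and `hash_pair(2, 1)` come out different.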
  1. Integer overflow is not that bad. True, it can cause collisions, but rare collisions are acceptable for hashes meant for a hash map.

  2. Perhaps a bad idea. It can cause too many collisions.

  3. If your inputs are N bits wide, then your output has to be at least 2N bits wide to stay unique. So to accommodate uint inputs, you need an output of ulong size. If the inputs are wider than that, the result will overflow. If the output is narrower than that, there will be collisions/duplicates.
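
To make points 1 and 3 concrete, here is a minimal C++14 sketch of the Cantor pairing function computed once in 32-bit and once with 64-bit intermediates. The names `cantor32` and `cantor64` are hypothetical, and the sketch assumes a platform where `std::uint32_t` is `unsigned int`, so the 32-bit arithmetic wraps modulo 2^32.

```cpp
#include <cstdint>

// Cantor pairing done entirely in 32-bit unsigned arithmetic.
// Unsigned overflow is well defined (it wraps), but the wrapped
// result is no longer unique.
constexpr std::uint32_t cantor32(std::uint32_t a, std::uint32_t b) {
    return (a + b) * (a + b + 1) / 2 + b;
}

// Same formula with 64-bit intermediates.
// Exact as long as a + b stays below 2^32.
constexpr std::uint64_t cantor64(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t s = static_cast<std::uint64_t>(a) + b;
    return s * (s + 1) / 2 + b;
}

// (65536, 0) wraps in 32 bits and lands on the value of (127, 128).
static_assert(cantor32(65536, 0) == cantor32(127, 128),
              "32-bit wraparound produces a collision");
static_assert(cantor64(65536, 0) != cantor64(127, 128),
              "64-bit intermediates keep the pairs distinct");
```

For a hash map the 32-bit collision is harmless as long as the map compares the original keys, but the wrapped value cannot be relied on as a unique identifier for the pair.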

That said, there aren't many functions that fit within 2N bits. You could try

(ulong)a * ((ulong)uint.MaxValue + 1) + b

Or this

a >= b ? a * a + a + b : a + b * b

Both yield an unsigned long, provided the arithmetic is carried out in ulong. These two are about as space-efficient as it gets.
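
Both ideas can be sketched in C++14 as follows. This is a sketch under assumptions, not a definitive implementation: `pack` and `square_pair` are hypothetical names, `pack` uses a shift to place `a` in the high 32 bits (the bit-level form of multiplying by 2^32), and `square_pair` mirrors the second expression, which is often called Szudzik's pairing function. Everything here is pure integer arithmetic, so no floating-point representation, and hence no NaN, is involved.

```cpp
#include <cstdint>

// Concatenate two 32-bit keys: a in the high half, b in the low half.
constexpr std::uint64_t pack(std::uint32_t a, std::uint32_t b) {
    return (static_cast<std::uint64_t>(a) << 32) | b;
}

// Pairing via squares, widened to 64 bits before any multiplication
// so that the largest inputs still fit.
constexpr std::uint64_t square_pair(std::uint32_t a, std::uint32_t b) {
    const std::uint64_t x = a, y = b;
    return x >= y ? x * x + x + y : x + y * y;
}

// Distinct pairs map to distinct 64-bit values.
static_assert(pack(1, 2) != pack(2, 1), "order is preserved");
static_assert(square_pair(1, 2) != square_pair(2, 1), "order is preserved");
// The largest pair exactly fills the 64-bit range.
static_assert(square_pair(0xFFFFFFFFu, 0xFFFFFFFFu) == 0xFFFFFFFFFFFFFFFFULL,
              "no overflow at the maximum inputs");
```

Either 64-bit value can then be fed to an ordinary single-key hash function.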

See also "Mapping two integers to one, in a unique and deterministic way".

nawfal