4

I'm trying to work out a key for a map<double, double>. The problem is that each key has to be generated from a pair of two numbers. Is there a good function that could generate such a key for pairs like (0, 1), (2, 3), (4, 2), (0, 2), etc.?

Brian Tompsett - 汤莱恩
Chen Li

2 Answers

6

Use a base-N (N-ary) numeral system, where N is one more than the maximum possible value of a number in a pair, so that every a and b satisfies 0 <= a, b < N.

Like this:

hash(a, b) = a + b * N

then

a = hash(a, b) % N
b = hash(a, b) / N

This guarantees that every pair (a, b) gets its own unique hash(a, b). The same thing happens with decimal numbers: imagine that all the numbers from 0 to 99 inclusive (written as 00, 01, 02, ...) are your pairs ab. Then hash(a, b) = a * 10 + b, and vice versa: to obtain the first digit you divide the number by 10 (integer division), and to obtain the second you take it modulo 10. Here the first element is the high digit, the reverse of the formula above, but the principle is the same.

Why can't we pick any N, perhaps one smaller than the largest a or b? The answer is: to avoid collisions.
If the N you pick happens to be smaller than your maximum number, it is quite possible that different pairs of numbers will produce the same hash. For example, if you pick N = 10, the pairs (10, 10) and (0, 11) both hash to 110, which defeats the purpose here.
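The scheme above could be sketched in C++ roughly as follows. This is only an illustration under a few assumptions: the values are non-negative integers (a `map<double, double>` would have to hold integer-valued keys for this to apply), `n` is strictly greater than every value that can occur, and `b * n` fits in 64 bits. The names `encode_pair` and `decode_pair` are hypothetical, not from any library.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical helper: packs (a, b) into one integer, treating a and b
// as the two "digits" of a base-n number.
// Assumes 0 <= a, b < n and that b * n does not overflow 64 bits.
std::uint64_t encode_pair(std::uint64_t a, std::uint64_t b, std::uint64_t n) {
    // Outside this range uniqueness is lost, e.g. with n = 10 both
    // (10, 10) and (0, 11) would give 110.
    assert(a < n && b < n);
    return a + b * n;
}

// Inverse of encode_pair: recovers (a, b) from the key.
std::pair<std::uint64_t, std::uint64_t> decode_pair(std::uint64_t key,
                                                    std::uint64_t n) {
    return {key % n, key / n};
}
```

With n = 5, the pairs from the question, (0, 1), (2, 3), (4, 2) and (0, 2), map to 5, 17, 14 and 10 respectively, so they could serve as keys of a `std::map<std::uint64_t, double>`.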

pravy mravec
dreamzor
0

You should ideally have a KeyValuePair<int, int> as your key. I don't think writing more code than that would be helpful. If you can't have that for some reason, then how to hash the pair into a single key depends on what you're trying to achieve. If the hashes are meant for hash structures like Dictionary, then you have to balance collision rate against hashing speed: a perfect hash with no collisions at all will be more time-consuming to compute, while the fastest hashing algorithm will have relatively more collisions. Finding the right balance is the key here.

You should also take into consideration how large your effective hash can be, and whether the hashed output needs to be reversible to give you back the original inputs. Typically, speeding up pairing/hashing/mapping should take priority over minimizing collision probability (a good hash algorithm will have a low chance of collisions anyway). For perfect hashes, you can see this thread for a plethora of options.
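The answer above names the .NET types KeyValuePair and Dictionary. If the asker is in fact using C++ (an assumption based on the `map<double, double>` in the question), a rough analogue is to use `std::pair<int, int>` as the map key directly, since `std::pair` already provides a lexicographic ordering. The alias `PairKey`, the helper `make_example_map` and the stored values below are made up for illustration.

```cpp
#include <map>
#include <utility>

// std::pair has a built-in lexicographic operator<, so it can be used
// as a std::map key without any hand-written hash or encoding.
using PairKey = std::pair<int, int>;

// Hypothetical helper: builds a map keyed by the pairs from the question.
// The double values are arbitrary placeholders.
std::map<PairKey, double> make_example_map() {
    std::map<PairKey, double> m;
    m[PairKey(0, 1)] = 10.0;
    m[PairKey(2, 3)] = 20.0;
    m[PairKey(4, 2)] = 30.0;
    m[PairKey(0, 2)] = 40.0;
    return m;
}
```

A lookup is then just `m.at(PairKey(4, 2))`. Because `std::map` is an ordered tree rather than a hash table, no hash function is involved at all in this variant.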

Community
nawfal