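As an experiment, the hashes in python2 and python3 seem to be different: the order of tied items from most_common() stays the same across runs in python2 but changes from run to run in python3: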
alvas@ubi:~$ python -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('foobar', 1), ('foo', 1), ('bar', 1), ('barfoo', 1)]
alvas@ubi:~$ python -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('foobar', 1), ('foo', 1), ('bar', 1), ('barfoo', 1)]
alvas@ubi:~$ python -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('foobar', 1), ('foo', 1), ('bar', 1), ('barfoo', 1)]
alvas@ubi:~$ python3 -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('barfoo', 1), ('foobar', 1), ('bar', 1), ('foo', 1)]
alvas@ubi:~$ python3 -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('foo', 1), ('barfoo', 1), ('bar', 1), ('foobar', 1)]
alvas@ubi:~$ python3 -c "from collections import Counter; x = Counter({'foo': 1, 'bar': 1, 'foobar': 1, 'barfoo': 1}); print(x.most_common())"
[('bar', 1), ('barfoo', 1), ('foobar', 1), ('foo', 1)]
And when we look at string hashes, the python3 hashes seem to be dynamic, changing on every run:
alvas@ubi:~$ python -c "print 'abc'.__hash__()"
1453079729188098211
alvas@ubi:~$ python -c "print 'abc'.__hash__()"
1453079729188098211
alvas@ubi:~$ python -c "print 'abc'.__hash__()"
1453079729188098211
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
-4165906745021293940
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
-4676677077013862663
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
5261896652811750722
My question is: why and how are the hashes different?
Which hashing algorithm does each of them use? Where can I find the exact CPython code where the string hashing happens?
Is there a way to unrandomize the hashes?
EDITED
After reading PEP 398, I found that setting PYTHONHASHSEED=0 turns off the random hash, but it's not recommended due to security issues:
alvas@ubi:~$ export PYTHONHASHSEED=0
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
4596069200710135518
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
4596069200710135518
alvas@ubi:~$ python3 -c "print ('abc'.__hash__())"
4596069200710135518
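The same check can be scripted instead of repeated by hand. This is only a minimal sketch, assuming Python 3 is the interpreter running the script; `child_hash` is a hypothetical helper name, not something from the standard library. It starts a fresh interpreter for each call, because the hash seed is fixed at interpreter startup and cannot be changed from inside a running process.

```python
import os
import subprocess
import sys


def child_hash(text, seed=None):
    """Return hash(text) as computed by a freshly started interpreter.

    seed=None removes PYTHONHASHSEED so the child uses its default behaviour;
    any other value is passed to the child via PYTHONHASHSEED.
    """
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)
    if seed is not None:
        env["PYTHONHASHSEED"] = str(seed)
    out = subprocess.check_output(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
    )
    return int(out.decode().strip())


# With a fixed seed, every fresh interpreter should report the same hash.
fixed = {child_hash("abc", seed=0) for _ in range(3)}
print("distinct hashes with PYTHONHASHSEED=0:", len(fixed))

# Without a seed, python3 normally reports a different hash on each run.
default = {child_hash("abc") for _ in range(3)}
print("distinct hashes with default settings:", len(default))
```

Setting the variable only for the child process keeps the seed change out of the current shell, unlike `export PYTHONHASHSEED=0`.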