I want to build a Scrabble cheater that stores Strings in an array of linked lists. Ideally, each linked list would contain only words that are permutations of the same letters (POOL and LOOP, for example). The user would enter a String such as OLOP, and the matching linked list would be printed out.
I want the task to be explicitly solved using hashing.
I have built a stringHashFunction() for that (Java code):
    public int stringHashFunction(String wordToHash) {
        int hashKeyValue = 7;
        // convert to lowercase and sort the letters alphabetically
        wordToHash = normalize(wordToHash);
        for (int i = 0; i < wordToHash.length(); i++) {
            // map 'a'..'z' to 1..26
            int charCode = wordToHash.charAt(i) - 96;
            // calculate the hash key using the 26 letters
            hashKeyValue = (hashKeyValue * 26 + charCode) % hashTable.length;
        }
        return hashKeyValue;
    }
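The normalize() helper is not shown above. As a minimal sketch only, assuming it does exactly what the comment says (lowercase the word, then sort its letters), it might look like the following. The class name NormalizeSketch is hypothetical and exists only to make the example runnable.

```java
import java.util.Arrays;

public class NormalizeSketch {
    // Hypothetical stand-in for the normalize() helper referenced above:
    // lowercases the word and sorts its letters alphabetically, so that
    // all permutations of the same letters map to one canonical String.
    public static String normalize(String word) {
        char[] letters = word.toLowerCase().toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    public static void main(String[] args) {
        System.out.println(normalize("POOL")); // prints "loop"
        System.out.println(normalize("OLOP")); // prints "loop"
    }
}
```

With a helper like this, POOL, LOOP and OLOP all normalize to the same String, which is why they always receive the same hash value.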
Does this look like a reasonable hash function? I realize that it's far from a perfect hash, but how could I improve it?
Overall my code works, but these are the statistics I get for now:
- Number of buckets: 24043
- Total number of items: 24043
- Largest bucket: 11 items
- Empty buckets: 10264
- Average number of items per non-empty bucket: 1.7449016619493432
Is it possible to avoid collisions entirely, so that each bucket (linked list) contains only words that are permutations of the same letters? With a whole dictionary loaded, that would be useful: I would not have to run an isPermutation() method on every word in a bucket each time I want the possible permutations of a String.
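To make the current situation concrete, here is a minimal sketch of the lookup as it has to work while buckets can still hold unrelated words: hash the query, walk one bucket, and keep only the entries whose normalized form matches. The class AnagramTableSketch and the methods add() and anagramsOf() are hypothetical names for illustration, and the normalize() shown is an assumed implementation; only stringHashFunction() follows the code above.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

public class AnagramTableSketch {
    private final LinkedList<String>[] hashTable;

    @SuppressWarnings("unchecked")
    public AnagramTableSketch(int numberOfBuckets) {
        hashTable = new LinkedList[numberOfBuckets];
        for (int i = 0; i < numberOfBuckets; i++) {
            hashTable[i] = new LinkedList<>();
        }
    }

    // Assumed behavior of normalize(): lowercase, then sort the letters.
    public static String normalize(String word) {
        char[] letters = word.toLowerCase().toCharArray();
        Arrays.sort(letters);
        return new String(letters);
    }

    // Same calculation as stringHashFunction() in the question.
    public int stringHashFunction(String wordToHash) {
        int hashKeyValue = 7;
        String key = normalize(wordToHash);
        for (int i = 0; i < key.length(); i++) {
            int charCode = key.charAt(i) - 96;
            hashKeyValue = (hashKeyValue * 26 + charCode) % hashTable.length;
        }
        return hashKeyValue;
    }

    public void add(String word) {
        hashTable[stringHashFunction(word)].add(word);
    }

    // Walks a single bucket and filters out words that merely collided.
    public List<String> anagramsOf(String query) {
        String key = normalize(query);
        List<String> result = new ArrayList<>();
        for (String candidate : hashTable[stringHashFunction(query)]) {
            if (normalize(candidate).equals(key)) {
                result.add(candidate);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A single bucket forces every word to collide.
        AnagramTableSketch table = new AnagramTableSketch(1);
        for (String word : new String[] {"POOL", "CAT", "LOOP", "POLO"}) {
            table.add(word);
        }
        System.out.println(table.anagramsOf("OLOP")); // prints [POOL, LOOP, POLO]
    }
}
```

The comparison of normalized Strings inside anagramsOf() plays the role of the isPermutation() check, and it is exactly the per-bucket work I would like to get rid of.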