My understanding of hash tables is that they use a hash function to map keys to locations in memory, with a fixed number of "buckets" pre-allocated in memory. The goal is to have enough buckets that I rarely have to fall back on chaining, which slows my ideal O(1) access time to roughly (n/m) × O(1), where n is the number of unique keys to store and m is the number of buckets.
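To make the n/m relationship concrete, here is a minimal sketch of a hash table with separate chaining. It is an illustration only: the class name `ChainedHashTable` and its methods are hypothetical, Python lists stand in for the linked-list chains, and Python's built-in `hash` plays the role of the hash function.

```python
class ChainedHashTable:
    """Toy hash table using separate chaining (hypothetical, for illustration)."""

    def __init__(self, num_buckets):
        self.num_buckets = num_buckets
        # One Python list per bucket stands in for a linked-list chain.
        self.buckets = [[] for _ in range(num_buckets)]
        self.size = 0

    def _index(self, key):
        # Map the key's hash onto one of the m buckets.
        return hash(key) % self.num_buckets

    def put(self, key, value):
        chain = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(chain):
            if existing_key == key:
                chain[i] = (key, value)  # overwrite an existing key
                return
        chain.append((key, value))
        self.size += 1

    def get(self, key):
        # Lookup cost is proportional to the length of this one chain.
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)

    def load_factor(self):
        # n / m: the average chain length over all buckets.
        return self.size / self.num_buckets

    def longest_chain(self):
        return max(len(chain) for chain in self.buckets)


table = ChainedHashTable(num_buckets=1000)
for i in range(1000):
    table.put(f"key-{i}", i)

print("load factor n/m:", table.load_factor())
print("longest chain:", table.longest_chain())
```

The load factor here is exactly n/m, while the longest chain varies from run to run because Python randomizes string hashes per process by default.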
So if I have 1000 unique items to store, I'll want no fewer than 1000 buckets, and perhaps many more, to minimize the probability of having to use my chained linked lists. Otherwise, we'd expect the average hash table to have many, many collisions. Now, if I've got 1000 pre-allocated buckets, that means, as I understand it, that I have at least 1000 bytes of allocated memory, scattered around my RAM. Thus every single unique key in my hash table results in its own fragment of memory, fragmenting my RAM.
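The collision intuition for 1000 keys can be checked with a small simulation. This is a sketch under a strong assumption: an idealized hash function that sends each key to a uniformly random bucket. The function name `simulate_collisions` and the fixed seed are illustrative choices. It counts collisions only and says nothing about how the buckets are laid out in memory.

```python
import random


def simulate_collisions(num_keys, num_buckets, seed=0):
    """Count keys that land in a bucket that is already occupied.

    Assumes an idealized hash: every key picks a bucket uniformly at random.
    """
    rng = random.Random(seed)
    occupied = set()
    collisions = 0
    for _ in range(num_keys):
        bucket = rng.randrange(num_buckets)
        if bucket in occupied:
            collisions += 1  # this key would need chaining
        else:
            occupied.add(bucket)
    return collisions


# Same 1000 keys, increasingly generous bucket counts.
for m in (1000, 2000, 10000):
    print(f"buckets={m}: colliding keys={simulate_collisions(1000, m)}")
```

Under this model, with as many buckets as keys, the expected number of colliding keys is a bit over a third of them, so "no fewer than 1000 buckets" does not by itself avoid chaining.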
Does this mean that using a hash table is basically guaranteed to cause an amount of fragmentation proportional to the number of unique keys? Further, this seems to indicate that if you know in advance how many unique keys there will be, you can use some statistics to pick the number of buckets and greatly reduce fragmentation. Is this the case?
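As a rough sketch of the kind of statistics I have in mind: if each of n keys is equally likely to land in any of m buckets, the expected number of occupied buckets is m × (1 − (1 − 1/m)^n), so the expected number of keys that collide is n minus that. The helper names below are hypothetical, and rounding up to a power of two is just an assumption made for the example. Note that this estimates collisions, not fragmentation itself.

```python
def expected_collisions(num_keys, num_buckets):
    """Expected number of keys that hash to an already occupied bucket,
    assuming each key is equally likely to land in any bucket."""
    expected_occupied = num_buckets * (1 - (1 - 1 / num_buckets) ** num_keys)
    return num_keys - expected_occupied


def buckets_for_target(num_keys, max_collision_fraction):
    """Smallest power-of-two bucket count whose expected fraction of
    colliding keys is at or below max_collision_fraction."""
    m = 1
    while expected_collisions(num_keys, m) / num_keys > max_collision_fraction:
        m *= 2
    return m


n = 1000
m = buckets_for_target(n, max_collision_fraction=0.05)
print(f"{n} keys -> {m} buckets, "
      f"expected colliding keys: {expected_collisions(n, m):.1f}")
```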