1

Suppose the number of hash-table slots (say m) is proportional to the number of elements in the table (say n). Then n = O(m), and the load factor l = n/m = O(m)/m = O(1). So, under the assumption of Simple Uniform Hashing, searching takes constant time on average: the average search time is proportional to the length of a slot's linked list, which is the same for all slots and hence constant. But what about the worst-case running time under the assumption of Simple Uniform Hashing? Is it also constant, or is it O(1 + l)? Please explain, I'm confused. [Reference: CLRS, page 260]

Will the worst-case time for an unsuccessful search under the assumption of Simple Uniform Hashing be the same as the average-case time? And will the worst-case time for a successful search under the assumption of Simple Uniform Hashing be different from the average-case time?

Sonali
  • Uniform hashing is not enough to give you good worst case bounds. A hash family can be uniform while a specific function still hashes every key to the same bucket. If you can get hold of a universal hash function, you can get `O(log n)` bounds with high probability: http://stackoverflow.com/questions/4553624/hashmap-get-put-complexity/23954819#23954819 – Thomas Ahle May 30 '14 at 13:00

1 Answer

4

Under the assumption of Simple Uniform Hashing (i.e. that a hypothetical hash function will evenly distribute items into the slots of a hash table), I believe the worst-case performance of a lookup operation would be the same as the average case (for an unsuccessful lookup): Θ(n/m + 1) (the average case as per Wikipedia).

Why? Well, consider that, under the above assumption, each slot in the table will have the same number of elements in its chain. Because of this, both the average case and the worst case will involve looking through all the elements in any of the chains.

This is, of course, a pretty optimistic assumption - in practice we can rarely (if ever) predetermine a hash function that will evenly distribute some unknown set of data (and we rarely build hash functions specifically for our data sets), but, at the same time, we're unlikely to hit the true worst case.
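For a rough feel of the numbers, here is a minimal simulation sketch (Python; the sizes n = 10,000 and m = 1,000 are made-up values, and drawing a uniformly random slot for each element merely stands in for simple uniform hashing). The average chain length comes out to exactly the load factor n/m, which is what an unsuccessful lookup examines on average - the n/m term in Θ(n/m + 1) above.

```python
import random

# Hypothetical sizes for illustration: n elements hashed into m slots.
n, m = 10_000, 1_000

# Model simple uniform hashing by assigning each element a slot
# uniformly at random, then tally the chain lengths.
chains = [0] * m
for _ in range(n):
    chains[random.randrange(m)] += 1

print("load factor n/m      =", n / m)            # 10.0
print("average chain length =", sum(chains) / m)  # also 10.0: an unsuccessful
                                                   # lookup walks ~n/m elements
```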

In general, the worst-case running time of a lookup or remove operation for a hash table using chaining is Θ(n).

In both cases, insert can still be implemented as Θ(1), since you can just insert at the front of the chain. That is, if we allow duplicates (as Jim mentioned in the comments); if not, we first have to check whether the key is already there (i.e. do a lookup).

The worst case happens when all the elements hash to the same slot: you'd have one really long chain, essentially turning your data structure into a linked list (illustrated below; a small code sketch follows the diagram).

|--------|
|element1| -> element2 -> element3 -> element4 -> element5
|--------|
|  null  |
|--------|
|  null  |
|--------|
|  null  |
|--------|
|  null  |
|--------|
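To make the chaining mechanics concrete, here is a minimal Python sketch (the class and names are made up for illustration, not CLRS's pseudocode): insert prepends to a slot's chain in Θ(1), lookup walks a single chain, and a deliberately degenerate hash function reproduces the one-long-chain worst case from the diagram.

```python
class ChainedHashTable:
    """Minimal sketch of hashing with chaining; names and structure are
    illustrative only."""

    def __init__(self, num_slots, hash_fn=hash):
        self.slots = [None] * num_slots  # each slot holds the head of a singly linked chain
        self.hash_fn = hash_fn

    def insert(self, key, value):
        # Θ(1) if duplicates are allowed: just prepend to the slot's chain.
        i = self.hash_fn(key) % len(self.slots)
        self.slots[i] = (key, value, self.slots[i])  # new head -> old chain

    def lookup(self, key):
        # Cost is proportional to the length of one chain: Θ(1 + chain length).
        i = self.hash_fn(key) % len(self.slots)
        node = self.slots[i]
        while node is not None:
            k, v, rest = node
            if k == key:
                return v
            node = rest
        return None


# Worst case: a degenerate hash function sends every key to slot 0, so the
# whole table collapses into one chain and lookup degrades to Θ(n).
table = ChainedHashTable(5, hash_fn=lambda key: 0)
for x in range(1, 6):
    table.insert(f"element{x}", x)
print(table.lookup("element1"))  # walks the entire 5-element chain, prints 1
```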
Bernhard Barker
  • Insert can be implemented as O(1)? Only if you allow duplicates. – Jim Mischel Nov 27 '13 at 19:37
  • @JimMischel Oh, right. Thanks. I forgot about that. Added a note. – Bernhard Barker Nov 27 '13 at 19:49
  • "under the above assumption, each slot in the table will have the same number of elements in its chain" - completely wrong. that's not what Simple Uniform Hashing means. Just because the hash function evenly distributes the *whole* set of elements, it doesn't mean it will do that for a specific subset. Worst time, as you said, is O(n). – Karoly Horvath Dec 06 '13 at 09:31