
I would like to know the complexity, in big-O notation, of the STL `multiset`, `map`, and `hash_map` classes when:

  • inserting entries
  • accessing entries
  • retrieving entries
  • comparing entries

1 Answer


`map`, `set`, `multimap`, and `multiset`

These are typically implemented as red-black trees, a type of self-balancing binary search tree. They have the following asymptotic running times:

Insertion: O(log n)
Lookup: O(log n)
Deletion: O(log n)
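
For concreteness, here's a minimal sketch of those operations with `std::map` and `std::multiset` (the keys and values are purely illustrative):

```cpp
#include <iostream>
#include <map>
#include <set>
#include <string>

int main() {
    // std::map: typically a red-black tree keyed on std::string.
    std::map<std::string, int> ages;
    ages.insert({"alice", 30});            // insertion: O(log n)
    ages["bob"] = 25;                      // insert-or-assign: O(log n)

    auto it = ages.find("alice");          // lookup: O(log n)
    if (it != ages.end())
        std::cout << it->first << " is " << it->second << '\n';

    ages.erase("bob");                     // deletion: O(log n)

    // std::multiset allows duplicate keys; the bounds are the same,
    // plus the number of matches for operations that touch them all.
    std::multiset<int> scores{10, 20, 20, 30};
    std::cout << scores.count(20) << '\n'; // O(log n + k), k = matches
}
```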

`hash_map`, `hash_set`, `hash_multimap`, and `hash_multiset`

These are implemented using hash tables. (The `hash_*` containers are pre-standard SGI STL extensions; C++11 standardized them as the `unordered_*` containers.) They have the following running times:

Insertion: O(1) expected, O(n) worst case
Lookup: O(1) expected, O(n) worst case
Deletion: O(1) expected, O(n) worst case
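
As a minimal sketch, the same operations with `std::unordered_map`, the standardized C++11 equivalent (nothing here is tied to a particular implementation):

```cpp
#include <iostream>
#include <string>
#include <unordered_map>

int main() {
    std::unordered_map<std::string, int> ages;
    ages.insert({"alice", 30});        // insertion: O(1) expected
    ages["bob"] = 25;

    auto it = ages.find("alice");      // lookup: O(1) expected
    if (it != ages.end())
        std::cout << it->second << '\n';

    ages.erase("bob");                 // deletion: O(1) expected
}
```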

If you use a proper hash function, you'll almost never see the worst-case behavior, but it is something to keep in mind; see "Denial of Service via Algorithmic Complexity Attacks" by Crosby and Wallach for an example of how it can be exploited.
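
To see how the worst case can be forced, consider a hypothetical, deliberately bad hash function (here called `BadHash`) that sends every key to the same bucket; every operation then degenerates into a linear scan of one chain:

```cpp
#include <cstddef>
#include <iostream>
#include <unordered_set>

// Hypothetical worst-case hash: every key collides, so the table
// degenerates into a single bucket holding one long chain.
struct BadHash {
    std::size_t operator()(int) const { return 0; }
};

int main() {
    std::unordered_set<int, BadHash> degenerate;
    for (int i = 0; i < 10000; ++i)
        degenerate.insert(i);          // each insert scans the chain: O(n)

    // All 10000 elements share one bucket, so find() is a linear walk.
    std::cout << "bucket holds "
              << degenerate.bucket_size(degenerate.bucket(0))
              << " of " << degenerate.size() << " elements\n";
}
```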

Adam Rosenfield
  • Does everything you say about `hash_*` also apply to the C++11 unordered and Boost.Unordered containers? – myWallJSON Dec 11 '11 at 10:15
  • The `hash_*` class templates are part of the [Silicon Graphics STL](https://www.sgi.com/tech/stl/). These were incorporated into the C++11 revision under the `unordered_*` names (`unordered_map`, `unordered_set`, etc.). They have also been included in libstdc++, Visual C++, and the Boost C++ Libraries. – milpita Sep 19 '16 at 21:24
  • @CEOatApartico: Fixed the dead link – Adam Rosenfield Nov 29 '18 at 04:24
  • I don't understand the "expected O and worst-case O". Big-O is by definition "worst case". – Paulius Liekis Jan 02 '21 at 14:06
  • @PauliusLiekis You don't know what you are talking about. Big-O is, by definition, "upper bound", which has nothing to do with worst case, avg. case, best case. – ypnos Jan 03 '21 at 10:25
  • To explain it: the Landau notation describes the growth of a function. If you discern between cases, as with the hash map, you are dealing with different complexity functions, and each of them has its own growth and its own bound. – ypnos Jan 03 '21 at 10:34
  • @ypnos Yes, I admit, I do not understand. It's actually bothering me :) But I do not understand your explanation either :/ Let me rephrase the question: let's say we have std::hash_map - I can construct such an object where all keys will live in the same bucket, thus finding entries will take O(N) or O(log N) depending on implementation. So how can one claim that finding entries is O(1)? I honestly want to understand. – Paulius Liekis Jan 03 '21 at 14:17
  • I see the rationale as this. Given a hash map with an appropriate hash function and size, you _expect_ the bucket size not to grow with n, and you obtain the average case of O(1). Picking an appropriate hash function is the developer's responsibility. To guarantee an appropriate size (load factor), rehashing may be triggered on insertion, and we obtain worst case O(n) + O(1) = O(n). Landau symbols do not cover such a distinction between two different algorithms! So all you have left is to specify two different measures for average and worst case, and both may use O(), o(), Θ(), etc. – ypnos Jan 05 '21 at 09:56
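
To illustrate the rehashing point from the comment above, here is a small sketch with `std::unordered_map` that reports when an insertion triggers a rehash (bucket counts and growth policy are implementation-defined, so the exact output will vary):

```cpp
#include <cstddef>
#include <iostream>
#include <unordered_map>

int main() {
    std::unordered_map<int, int> m;
    m.max_load_factor(1.0f);           // rehash once size/buckets exceeds 1.0

    for (int i = 0; i < 100; ++i) {
        std::size_t before = m.bucket_count();
        m[i] = i;                      // usually O(1); O(n) when it rehashes
        if (m.bucket_count() != before)
            std::cout << "rehash at size " << m.size() << ": "
                      << before << " -> " << m.bucket_count() << " buckets\n";
    }
}
```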