60

Is it a fair assumption that in the V8 implementation, retrieval/lookup is O(1)?

(I know that the standard doesn't guarantee that)

Uri
  • On average? Or worst case? – Oriol Nov 09 '15 at 15:24
  • The standard [does guarantee sub-linear complexity](http://stackoverflow.com/a/31092145/1048572), btw. – Bergi Nov 09 '15 at 15:25
  • @Oriol both would be interesting to know. – Uri Nov 09 '15 at 15:33
  • Not sure what V8 uses, but the [spec](http://www.ecma-international.org/ecma-262/6.0/index.html#sec-map-objects) mentions hash tables as a possibility. [Hash tables](https://en.wikipedia.org/wiki/Hash_table) have constant time operations on average, and linear on worst case. – Oriol Nov 09 '15 at 15:39
  • Another good answer to this can be found [here](http://stackoverflow.com/questions/34328136/are-javascript-map-objects-indexed-to-optimize-map-get) – Lonnie Best Dec 20 '15 at 12:31

5 Answers

63

Is it a fair assumption that in the V8 implementation, retrieval/lookup is O(1)?

Yes. V8 uses a variant of hash tables that generally have O(1) complexity for these operations.

For details, you might want to have a look at https://codereview.chromium.org/220293002/ where OrderedHashTable is implemented based on https://wiki.mozilla.org/User:Jorend/Deterministic_hash_tables.

Bergi
  • To extend a bit on that, Map and Set in V8 were recently reimplemented in JavaScript: https://codereview.chromium.org/947683002. This is common in V8; implementing new features in JavaScript lets the JIT (Crankshaft/Turbofan) optimize runtime code. – Diego Pino Nov 09 '15 at 17:27
  • @DiegoPino: Thanks. I somehow thought the `OrderedHashTable` implementation was still current because I found it in the [trunk](https://code.google.com/p/v8/source/browse/trunk/src/objects.cc)… – Bergi Nov 09 '15 at 17:36
  • Looks like the implementation has since moved from `src/collections.js` to `src/builtins/builtins-collections-gen.cc`: https://github.com/v8/v8/blob/master/src/builtins/builtins-collections-gen.cc#L1424 – protoEvangelion Jan 26 '19 at 23:04
20

For people who don't want to dig too deep into the rabbit hole:

1. We can assume that good hash table implementations have practically O(1) time complexity.
2. The V8 team published a blog post explaining how some memory optimizations were made to its hash table implementation for Map, Set, WeakSet, and WeakMap: Optimizing hash tables: hiding the hash code

Based on 1 and 2: the time complexity of get, set, add, and has on V8's Map and Set is practically O(1).
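
For reference, these are the operations in question; under the assumption above, each of them should run in (amortized) constant time:

const map = new Map();
map.set("answer", 42);   // insert/update
map.get("answer");       // lookup -> 42
map.has("answer");       // membership test -> true

const set = new Set();
set.add("answer");       // insert
set.has("answer");       // membership test -> true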

David Guan
4

The original question is already answered, but O(1) doesn't tell us much about how efficient the implementation actually is.

First of all, we need to understand what variation of hash table is used for Maps. "Classical" hash tables won't do, as they do not provide any iteration-order guarantees, while the ES6 spec requires iteration to follow insertion order. So Maps in V8 are built on top of so-called deterministic hash tables. The idea is similar to the classical algorithm, but there is another layer of indirection for buckets, and all entries are inserted and stored in a contiguous array of fixed size. The deterministic hash table algorithm indeed guarantees O(1) time complexity for basic operations like set or get.
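
To make the idea concrete, below is a minimal sketch of a deterministic hash table in plain JavaScript. It only illustrates the algorithm (no deletion, no growing, placeholder hash function) and is not V8's actual code:

class DeterministicHashTableSketch {
    constructor(numBuckets = 4) {
        this.buckets = new Array(numBuckets).fill(-1); // each bucket holds the index of its chain head, or -1
        this.entries = [];                             // { key, value, chain } records, in insertion order
    }

    _bucketFor(key) {
        // Placeholder hash; a real implementation uses a proper hash function.
        const hash = typeof key === "number" ? Math.abs(key) : String(key).length;
        return hash % this.buckets.length;
    }

    set(key, value) {
        const b = this._bucketFor(key);
        // Walk the bucket's chain to see whether the key already exists.
        for (let i = this.buckets[b]; i !== -1; i = this.entries[i].chain) {
            if (this.entries[i].key === key) {
                this.entries[i].value = value;
                return this;
            }
        }
        // Append to the flat entry array (preserving insertion order) and link into the chain.
        this.entries.push({ key, value, chain: this.buckets[b] });
        this.buckets[b] = this.entries.length - 1;
        return this;
    }

    get(key) {
        const b = this._bucketFor(key);
        for (let i = this.buckets[b]; i !== -1; i = this.entries[i].chain) {
            if (this.entries[i].key === key) return this.entries[i].value;
        }
        return undefined;
    }

    // Iteration simply walks the entry array, which is why the order is deterministic.
    *[Symbol.iterator]() {
        for (const { key, value } of this.entries) yield [key, value];
    }
}

const t = new DeterministicHashTableSketch();
t.set("a", 1).set("b", 2);
console.log(t.get("a"), [...t]); // 1 [ [ 'a', 1 ], [ 'b', 2 ] ]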

Next, we need to know the initial size of the hash table, the load factor, and how (and when) it grows and shrinks. The short answer: the initial size is 4, the load factor is 2, the table (i.e. the Map) grows 2x as soon as its capacity is reached, and shrinks as soon as more than 1/2 of the entries are deleted.
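
As a rough, back-of-the-envelope illustration of that policy (the numbers 4 and 2x come from the description above and may differ between V8 versions), here is how many rehashes a sequence of insertions into an initially empty table would trigger:

function rehashCountSketch(insertions, initialCapacity = 4) {
    let capacity = initialCapacity;
    let rehashes = 0;
    for (let size = 1; size <= insertions; size++) {
        if (size > capacity) { // capacity was reached by the previous insert
            capacity *= 2;     // grow 2x, which implies a full rehash
            rehashes++;
        }
    }
    return { capacity, rehashes };
}

console.log(rehashCountSketch(1000)); // { capacity: 1024, rehashes: 8 }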

Let's consider the worst case, when the table has N out of N entries (it's full), all entries belong to a single bucket, and the required entry is at the tail of the chain. In such a scenario, a lookup requires N moves through the chain elements.

On the other hand, in the best possible scenario when the table is full, but each bucket has 2 entries, a lookup will require up to 2 moves.

It is a well-known fact that while individual operations in hash tables are “cheap”, rehashing is not. Rehashing has O(N) time complexity and requires allocation of the new hash table on the heap. Moreover, rehashing is performed as a part of insertion or deletion operations, when necessary. So, for instance, a map.set() call could be more expensive than you would expect. Luckily, rehashing is a relatively infrequent operation.
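
One crude way to observe this is to time each individual set() call and look at the slowest one: most inserts are fast, while the one that happens to trigger a large rehash pays for copying the whole table. The exact numbers depend on the engine, machine, and timer resolution:

const m = new Map();
let slowest = { index: -1, ms: 0 };

for (let i = 0; i < 1e6; i++) {
    const t0 = performance.now();
    m.set(i, i); // occasionally this single call also performs a rehash
    const ms = performance.now() - t0;
    if (ms > slowest.ms) slowest = { index: i, ms };
}

console.log(`slowest single set() was #${slowest.index}: ${slowest.ms.toFixed(3)} ms`);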

Apart from that, such details as memory layout or hash function also matter, but I'm not going to go into these details here. If you're curious how V8 Maps work under the hood, you may find more details here. I was interested in the topic some time ago and tried to share my findings in a readable manner.

3
// Rough benchmark: compare Map set/get against plain-object property writes/reads for integer keys.
let map = new Map();
let obj = {};

const benchMarkMapSet = size => {
    console.time("benchMarkMapSet");
    for (let i = 0; i < size; i++) {
        map.set(i, i);
    }
    console.timeEnd("benchMarkMapSet");
};

const benchMarkMapGet = size => {
    console.time("benchMarkMapGet");
    for (let i = 0; i < size; i++) {
        let x = map.get(i);
    }
    console.timeEnd("benchMarkMapGet");
};

const benchMarkObjSet = size => {
    console.time("benchMarkObjSet");
    for (let i = 0; i < size; i++) {
        obj[i] = i;
    }
    console.timeEnd("benchMarkObjSet");
};

const benchMarkObjGet = size => {
    console.time("benchMarkObjGet");
    for (let i = 0; i < size; i++) {
        let x = obj[i];
    }
    console.timeEnd("benchMarkObjGet");
};

let size = 2e6;

benchMarkMapSet(size);
benchMarkObjSet(size);
benchMarkMapGet(size);
benchMarkObjGet(size);

benchMarkMapSet: 382.935ms
benchMarkObjSet: 76.077ms
benchMarkMapGet: 125.478ms
benchMarkObjGet: 2.764ms

benchMarkMapSet: 373.172ms
benchMarkObjSet: 77.192ms
benchMarkMapGet: 123.035ms
benchMarkObjGet: 2.638ms

0

Why don't we just test?

var size_1 = 1000,
    size_2 = 1000000,
    map_sm = new Map(Array.from({length: size_1}, (_,i) => [++i,i])),
    map_lg = new Map(Array.from({length: size_2}, (_,i) => [++i,i])),
    i      = size_1,
    j      = size_2,
    s;

s = performance.now();
while (i) map_sm.get(i--);
console.log(`Size ${size_1} map returns an item in average ${(performance.now()-s)/size_1}ms`);
s = performance.now();
while (j) map_lg.get(j--);
console.log(`Size ${size_2} map returns an item in average ${(performance.now()-s)/size_2}ms`);

So to me it seems like it converges to O(1) as the size grows. Well, let's call it O(1) then.

Redu