196

ES6 (Harmony) introduces a new Set object. The identity algorithm used by Set is similar to the === operator, so it is not well suited to comparing objects:

var set = new Set();
set.add({a:1});
set.add({a:1});
console.log([...set.values()]); // Array [ Object, Object ]

How can I customize equality for Set objects in order to do deep object comparison? Is there anything like Java's equals(Object)?

czerny
  • 4
    What do you mean by "customize equality"? Javascript does not allow for operator overloading so there is no way to overload the `===` operator. The ES6 set object does not have any compare methods. The `.has()` method and `.add()` method work only off it being the same actual object or same value for a primitive. – jfriend00 Apr 20 '15 at 22:33
  • 15
    By "customize equality" I mean any way a developer can define whether a given pair of objects is considered equal. – czerny Apr 20 '15 at 22:46
  • Also https://stackoverflow.com/q/10539938/632951 – Pacerier Sep 19 '17 at 00:14
  • This [could be part of the *collection normalization* TC39 proposal](https://github.com/tc39/proposal-collection-normalization/issues/18) – Bergi Jul 28 '20 at 20:43

8 Answers

117

The ES6 Set object does not have any compare methods or custom compare extensibility.

The .has(), .add() and .delete() methods match an element only if it is the same actual object (or the same value, for a primitive), and there is no means to plug into or replace just that logic.

You could presumably derive your own object from a Set and replace .has(), .add() and .delete() methods with something that did a deep object comparison first to find if the item is already in the Set, but the performance would likely not be good since the underlying Set object would not be helping at all. You'd probably have to just do a brute force iteration through all existing objects to find a match using your own custom compare before calling the original .add().
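
A minimal sketch of that brute-force idea, to make the cost visible. The names DeepSet and deepEqual are hypothetical, and deepEqual here only understands primitives, arrays and plain objects; every has(), add() and delete() may scan all stored elements.

```javascript
// Hypothetical helper: structural equality for primitives, arrays and plain objects
function deepEqual(a, b) {
  if (Object.is(a, b)) return true;
  if (typeof a !== 'object' || typeof b !== 'object' || a === null || b === null) return false;
  if (Array.isArray(a) !== Array.isArray(b)) return false;
  const keysA = Object.keys(a);
  const keysB = Object.keys(b);
  if (keysA.length !== keysB.length) return false;
  return keysA.every(k => Object.prototype.hasOwnProperty.call(b, k) && deepEqual(a[k], b[k]));
}

class DeepSet extends Set {
  // Linear scan: returns the stored element that is deep-equal to item, or undefined
  find(item) {
    for (const existing of this) {
      if (deepEqual(existing, item)) return existing;
    }
    return undefined;
  }
  has(item) {
    // Fast path for identical references and primitives, then the slow scan
    return super.has(item) || this.find(item) !== undefined;
  }
  add(item) {
    if (!this.has(item)) super.add(item);
    return this;
  }
  delete(item) {
    if (super.delete(item)) return true;
    const match = this.find(item);
    return match !== undefined ? super.delete(match) : false;
  }
}

const items = new DeepSet();
items.add({a: 1}).add({a: 1}).add({a: 2});
console.log(items.size);           // 2
console.log(items.has({a: 2}));    // true
console.log(items.delete({a: 1})); // true
console.log(items.size);           // 1
```

The native Set only helps on the fast path here; everything else is the linear scan described above.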

Here's some info from this article and discussion of ES6 features:

5.2 Why can’t I configure how maps and sets compare keys and values?

Question: It would be nice if there were a way to configure what map keys and what set elements are considered equal. Why isn’t there?

Answer: That feature has been postponed, as it is difficult to implement properly and efficiently. One option is to hand callbacks to collections that specify equality.

Another option, available in Java, is to specify equality via a method that objects implement (equals() in Java). However, this approach is problematic for mutable objects: In general, if an object changes, its “location” inside a collection has to change, as well. But that’s not what happens in Java. JavaScript will probably go the safer route of only enabling comparison by value for special immutable objects (so-called value objects). Comparison by value means that two values are considered equal if their contents are equal. Primitive values are compared by value in JavaScript.

jfriend00
  • 4
    Added article reference about this particular issue. It looks like the challenge is how to deal with an object that was the exact same as another at the time it was added to the set, but has now been changed and is no longer the same as that object. Is it in the `Set` or not? – jfriend00 Apr 20 '15 at 22:49
  • 4
    Why not implement a simple GetHashCode or similar? – Jamby Sep 24 '16 at 15:12
  • @Jamby - That would be an interesting project to make a hash that handles all types of properties and hashes properties in the right order and deals with circular references and so on. – jfriend00 Sep 24 '16 at 20:32
  • 1
    @Jamby Even with a hash function you still have to deal with collisions. You're just deferring the equality problem. – mpen Feb 01 '17 at 22:03
  • 8
    @mpen That's not right. I'm allowing the developer to manage their own hash function for their specific classes, which in almost every case prevents the collision problem, since the developer knows the nature of the objects and can derive a good key. In any other case, fall back to the current comparison method. [Lot](https://msdn.microsoft.com/en-us/library/system.object.gethashcode(v=vs.100).aspx) [of](https://en.wikipedia.org/wiki/Java_hashCode()) [languages](https://docs.ruby-lang.org/en/2.0.0/Hash.html) [already](https://docs.python.org/2/reference/datamodel.html#object.__hash__) do that; JS does not. – Jamby Feb 03 '17 at 15:26
  • Is this answer still accurate in 2017? – Jonah Jul 17 '17 at 17:52
  • @Jonah - As far as i know, this is still the case. Equality is based on being the actual same object. – jfriend00 Jul 17 '17 at 20:37
  • @jfriend00, Is it postponed to ES7 or ES8? – Pacerier Sep 18 '17 at 23:17
  • @Pacerier - I personally haven't seen any evidence that anyone is working on this. Equality for Javascript objects is based on being the actual same object, not based on property by property quality and maps and sets are based on equality. Plus, the same issues that they were concerned about in the original design (like objects being modified after the fact) are still problem issues. – jfriend00 Sep 19 '17 at 02:12
  • 1
    @Jamby coz js-programmers don't know about hash-code :( – Peter Dec 02 '18 at 22:26
  • I added an [answer below](https://stackoverflow.com/a/56353815/278488) explaining a simple solution using immutable-js (which also solves the mutability problem mentioned here). – Russell Davis May 29 '19 at 05:51
  • @Jamby, he's saying whenever you modify any key, the set needs to be refreshed. But I don't see why the contract needs to be so and practically speaking, no one has yet shot their toe this way in Java. – Pacerier Jun 28 '20 at 12:56
  • @Pacerier - Well, a Set is supposed to only ever contain one of any kind of item - no duplicates. If you could modify the object after adding it to the Set, thus turning it into a duplicate of something else in the Set, you'd have broken the contract for the Set. You may as well be using an Array or some other type of collection at that point since you can now contain multiple values that you have defined as equal which is not what a Set is. If you want to make the objects in the Set immutable, you can work-around that issue. – jfriend00 Jun 28 '20 at 17:02
  • I don't see how the possibility of mutation is any more of a problem than coding with objects ever is. I mean `const obj={}` does not make the object immutable, yet typical code will (and in practice must) often assume that a referenced object does not change between particular points in the program, even when those points are not in the same execution slice. And then there is always deep-copy to hedge bets, on a case-by-case basis. – Craig Hicks Jan 11 '21 at 20:16
  • @CraigHicks - The point is that if you added an object to a Set and the object was considered unique based on its content and the Set was only evaluating it based on content, not on the object reference itself (which is how they work today) and then you mutated the object to a point where it was no longer unique in the Set, then you'd violate the principle of the Set - it could contain duplicates. So, it being in a Set which is supposed to not have duplicates makes it different than other cases where you mutate an object. Anyway, it's all moot because a Set in JS doesn't work this way. – jfriend00 Jan 12 '21 at 00:14
  • @jfriend00 - Of course I see your point given the criterion that you are asserting (that hashes of object in the Set are immutable - equivalently that the objects themselves are immutable). What I am saying is that code is frequently written that depends on objects deep content (or at least portions of it) remaining constant between certain non trivially distant points in the code execution. Without trusting those assumptions, programming with JS (and other languages) would not be possible. That same trust, and the risks that trust entails can be extended to sets with deep objects. – Craig Hicks Jan 13 '21 at 01:52
  • @jfriend00 - If the application defines "equivalence" as being the same object (===) then it is trivial to add an uuid to each object and use that as the index. In the non-trivial case non-identical (!==) objects can be defined as "equivalent" by the application as needed. That definition of equivalence is application specific and it is therefore impossible to define it a priori at the language level. – Craig Hicks Jan 13 '21 at 02:19
  • @CraigHicks - Yep, and that is why Javascript only uses obj1 === obj2 as equivalence for a `Set`. As soon as you try to use some other definition of equivalence, all sorts of issues pop up. I'm not sure what you're arguing here. A uid does not solve the issue of two objects having identical properties but not being the same actual object, which is what this whole side discussion was about, because their uids would not be the same as they aren't the same physical object. – jfriend00 Jan 13 '21 at 02:21
  • @jfriend00 - My argument is that your quote *"However, this approach is problematic for mutable objects: In general, if an object changes, its “location” inside a collection has to change, as well."* is irrelevant to the reason why a generalized Set for deep objects has not been implemented at the JS level. Focusing on an irrelevant reason takes attention away from the critical reasons. – Craig Hicks Jan 13 '21 at 03:03
30

As mentioned in jfriend00's answer, customization of the equality relation is probably not possible.

The following code presents an outline of a computationally efficient (but memory expensive) workaround:

class GeneralSet {

    constructor() {
        this.map = new Map();
        this[Symbol.iterator] = this.values;
    }

    add(item) {
        this.map.set(item.toIdString(), item);
    }

    values() {
        return this.map.values();
    }

    delete(item) {
        return this.map.delete(item.toIdString());
    }

    // ...
}

Each inserted element has to implement a toIdString() method that returns a string. Two objects are considered equal if and only if their toIdString() methods return the same value.
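
A short usage sketch, assuming a hypothetical Point element type. The GeneralSet below is a condensed copy of the outline with has() and size added for the demonstration; building the id string with JSON.stringify is one possible choice, not a requirement.

```javascript
// Condensed copy of the GeneralSet outline, with has() and size added for this demo
class GeneralSet {
  constructor() {
    this.map = new Map(); // id string -> item
    this[Symbol.iterator] = this.values;
  }
  add(item) { this.map.set(item.toIdString(), item); return this; }
  has(item) { return this.map.has(item.toIdString()); }
  values() { return this.map.values(); }
  delete(item) { return this.map.delete(item.toIdString()); }
  get size() { return this.map.size; }
}

// Hypothetical element type: two points are equal when both coordinates match
class Point {
  constructor(x, y) { this.x = x; this.y = y; }
  toIdString() { return JSON.stringify([this.x, this.y]); }
}

const points = new GeneralSet();
points.add(new Point(1, 2)).add(new Point(1, 2)).add(new Point(3, 4));
console.log(points.size);                 // 2
console.log(points.has(new Point(3, 4))); // true
console.log([...points].length);          // 2
```

Note that adding an "equal" item replaces the stored one, because Map.prototype.set overwrites the value for an existing key.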

czerny
  • 2
    You could also have the constructor take a function that compares items for equality. This is good if you want this equality to be a feature of the set, rather than of the objects used in it. – Ben J Mar 02 '16 at 09:37
  • 3
    @BenJ The point of generating a string and putting it in a Map is that your Javascript engine can then do a ~O(1) lookup in native code on the key derived from your object, whereas accepting an equality function would force a linear scan of the set, checking every element. – Jamby Sep 24 '16 at 15:20
  • 3
    One challenge with this method is that I think it assumes that the value of `item.toIdString()` is invariant and cannot change. Because if it can, then the `GeneralSet` can easily become invalid with "duplicate" items in it. So, a solution like that would be restricted to certain situations, likely where the objects themselves are not changed while using the set or where a set that becomes invalid is not of consequence. All of these issues probably further explain why the ES6 Set does not expose this functionality: it really only works in certain circumstances. – jfriend00 Jan 20 '17 at 22:02
  • Is it possible to add the correct implementation of `.delete()` to this answer? – jlewkovich Jan 13 '19 at 17:02
  • 1
    @JLewkovich sure – czerny Jan 14 '19 at 01:31
  • The problem with this answer is that (from outward appearance) `item.toIdString()` computes the id string independent of the contents of the instance of the GeneralSet into which it will be inserted. That precludes the possibility of a hash function - therefore validating your statement about being "memory expensive". Passing the GeneralSet as a parameter - `item.toIdString(gs:GeneralSet)` enables hashes to be used. Practically speaking that's the only way to do it in the "general" case (due to memory limitations) although it is obviously more work to manage the hashing. – Craig Hicks Jan 13 '21 at 02:49
  • Actually, I take back the statement that the general set "must" be checked for collisions. With a suitable `toIdString()` string function and a suitable hash function `hashOfIdString()`, the chance of collision is low enough that it may be ignored. And the memory usage is low, which makes your statement about "memory expensive" incorrect. – Craig Hicks Jan 13 '21 at 03:20
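
The invariance concern raised in the comments above can be reproduced with a small sketch. The helper toId and the Map byId are hypothetical stand-ins for toIdString() and the map inside GeneralSet.

```javascript
// Hypothetical stand-ins for toIdString() and the internal map of a GeneralSet
const toId = o => JSON.stringify([o.x, o.y]);
const byId = new Map();

const a = {x: 1, y: 2};
const b = {x: 3, y: 4};
byId.set(toId(a), a);
byId.set(toId(b), b);

// Mutate b after insertion so that it now has the same content as a
b.x = 1;
b.y = 2;

console.log(byId.size);           // 2
console.log(toId(a) === toId(b)); // true: the collection now holds two "equal" items
console.log(byId.has('[3,4]'));   // true: b is still filed under its stale key
```

Freezing the elements (or only ever inserting copies) is one way to rule this out.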
14

As the top answer mentions, customizing equality is problematic for mutable objects. The good news is (and I'm surprised no one has mentioned this yet) there's a very popular library called immutable-js that provides a rich set of immutable types which provide the deep value equality semantics you're looking for.

Here's your example using immutable-js:

const { Map, Set } = require('immutable');
var set = new Set();
set = set.add(Map({a:1}));
set = set.add(Map({a:1}));
console.log([...set.values()]); // [Map {"a" => 1}]

Russell Davis
5

To add to the answers here, I went ahead and implemented a Map wrapper that takes a custom hash function, a custom equality function, and stores distinct values that have equivalent (custom) hashes in buckets.

Predictably, it turned out to be slower than czerny's string concatenation method.

Full source here: https://github.com/makoConstruct/ValueMap
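
For readers who want the general shape of the idea, here is a minimal sketch of a set built on hash buckets. It is not the code from the linked repository; the class name HashedSet and its parameters hashFn and equalsFn are hypothetical.

```javascript
class HashedSet {
  // hashFn(item) returns a primitive hash; equalsFn(a, b) decides real equality
  constructor(hashFn, equalsFn) {
    this.hashFn = hashFn;
    this.equalsFn = equalsFn;
    this.buckets = new Map(); // hash -> array of distinct items sharing that hash
    this.size = 0;
  }
  add(item) {
    const h = this.hashFn(item);
    const bucket = this.buckets.get(h);
    if (!bucket) {
      this.buckets.set(h, [item]);
      this.size++;
    } else if (!bucket.some(x => this.equalsFn(x, item))) {
      bucket.push(item);
      this.size++;
    }
    return this;
  }
  has(item) {
    const bucket = this.buckets.get(this.hashFn(item));
    return bucket !== undefined && bucket.some(x => this.equalsFn(x, item));
  }
  delete(item) {
    const h = this.hashFn(item);
    const bucket = this.buckets.get(h);
    if (!bucket) return false;
    const i = bucket.findIndex(x => this.equalsFn(x, item));
    if (i === -1) return false;
    bucket.splice(i, 1);
    if (bucket.length === 0) this.buckets.delete(h);
    this.size--;
    return true;
  }
  *[Symbol.iterator]() {
    for (const bucket of this.buckets.values()) yield* bucket;
  }
}

// Deliberately weak hash (x + y) so that collisions actually occur
const pts = new HashedSet(p => p.x + p.y, (a, b) => a.x === b.x && a.y === b.y);
pts.add({x: 1, y: 2}).add({x: 2, y: 1}).add({x: 1, y: 2});
console.log(pts.size);                // 2
console.log(pts.has({x: 2, y: 1}));   // true
console.log(pts.has({x: 3, y: 0}));   // false: same hash, but not equal
```

The equality function is only consulted inside a bucket, so lookups stay close to O(1) as long as the hash spreads items well.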

mako
  • “string concatenation”? Isn’t his method more like “string surrogating” (if you’re going to give it a name)? Or is there a reason you use the word “concatenation”? I’m curious ;-) – binki Jun 25 '17 at 01:54
  • @binki This is a good question and I think the answer brings up a good point that it took me a while to grasp. Typically, when computing a hash code, one does something like [HashCodeBuilder](http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/builder/HashCodeBuilder.html) which multiplies the hash codes of individual fields and is not guaranteed to be unique (hence the need for a custom equality function). However, when generating an id string you concatenate the id strings of individual fields which IS guaranteed to be unique (and thus no equality function needed) – Pace Oct 17 '18 at 12:29
  • 1
    So if you have a `Point` defined as `{ x: number, y: number }` then your `id string` is probably `x.toString() + ',' + y.toString()`. – Pace Oct 17 '18 at 12:30
  • Making your equality comparison build some value which is guaranteed to vary only when things should be considered non-equal is a strategy I have used before. It’s easier to think about things that way sometimes. In that case, you’re generating **keys** rather than **hashes**. As long as you have a key deriver which outputs a key in a form that existing tools support with value-style equality, which almost always ends up being `String`, then you can skip the whole hashing and bucketing step as you said and just directly use a `Map` or even old-style plain object in terms of the derived key. – binki Oct 17 '18 at 15:45
  • 2
    One thing to be careful of if you actually use string concatenation in your implementation of a key deriver is that string properties may need special treatment if they are allowed to take on any value. For example, if you have `{x: '1,2', y: '3'}` and `{x: '1', y: '2,3'}`, then `String(x) + ',' + String(y)` will output the same value for both objects. A safer option, assuming you can count on `JSON.stringify()` being deterministic, is to take advantage of its string escaping and use `JSON.stringify([x, y])` instead. – binki Oct 17 '18 at 15:58
  • Nice! Your map is more generally applicable than czerny's approach, so it's fine that it's slower. What's strange is that you use the hash itself as a key, which means your buckets nearly always have a length of one. Your benchmark compares two semantically different things, as your `moahash` is not injective, i.e., it can't be used as `toIdString`. – maaartinus Mar 31 '20 at 01:34
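
The delimiter pitfall described in the comments above can be checked with a few lines. The helper names naiveKey and jsonKey are hypothetical.

```javascript
const p = {x: '1,2', y: '3'};
const q = {x: '1', y: '2,3'};

// Naive key: joins the fields with a delimiter that may also occur in the data
const naiveKey = o => String(o.x) + ',' + String(o.y);
// Safer key: JSON.stringify quotes and escapes each field separately
const jsonKey = o => JSON.stringify([o.x, o.y]);

console.log(naiveKey(p) === naiveKey(q)); // true: both produce "1,2,3"
console.log(jsonKey(p) === jsonKey(q));   // false: '["1,2","3"]' vs '["1","2,3"]'
```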
4

Comparing objects directly does not seem possible, but comparing their JSON.stringify output works as long as the keys are sorted. As I pointed out in a comment, key order matters:

JSON.stringify({a:1, b:2}) !== JSON.stringify({b:2, a:1});

But we can work around that with a custom stringify method. First we write the method:

Custom Stringify

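// Note: extending Object.prototype affects every object in the program;
// a standalone function is the safer choice in shared code.
Object.prototype.stringifySorted = function () {
    // Recursively copy a value, inserting object keys in sorted order
    const sortKeys = (value) => {
        if (value === null || typeof value !== 'object') {
            return value; // primitives and null are kept as-is
        }
        if (Array.isArray(value)) {
            return value.map(sortKeys); // arrays keep their element order
        }
        const sorted = {};
        for (const key of Object.keys(value).sort()) {
            sorted[key] = sortKeys(value[key]);
        }
        return sorted;
    };
    // Stringify once at the end so nested objects are not double-encoded
    return JSON.stringify(sortKeys(this));
}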

The Set

Now we use a Set, but a Set of strings instead of objects:

let set = new Set()
set.add({a:1, b:2}.stringifySorted());

set.has({b:2, a:1}.stringifySorted());
// returns true

Get all the values

After creating the set and adding the values, we can get all values back with:

// Each stored value is a JSON string; parse it to get an object back
for (const str of set) {
  console.log(JSON.parse(str));
}

Here's a link with all in one file http://tpcg.io/FnJg2i

relief.melone
2

Maybe you can try to use JSON.stringify() to do deep object comparison.

For example:

const arr = [
  {name:'a', value:10},
  {name:'a', value:20},
  {name:'a', value:20},
  {name:'b', value:30},
  {name:'b', value:40},
  {name:'b', value:40}
];

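// Strings of the objects seen so far
const seen = new Set();
const result = arr.filter(item => {
  const key = JSON.stringify(item);
  if (seen.has(key)) return false; // duplicate: drop it
  seen.add(key);
  return true;
});

console.log(result); // four objects remain: a/10, a/20, b/30, b/40

GuaHsu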
  • 2
    This can work, but it is not guaranteed to, since JSON.stringify({a:1,b:2}) !== JSON.stringify({b:2,a:1}). If all objects are created by your program with their properties in the same order, you're safe, but it is not a safe solution in general. – relief.melone Nov 18 '18 at 10:48
  • 1
    Ah yes, "convert it to a string". Javascript's answer for everything. – Timmmm Dec 20 '19 at 14:59
2

For TypeScript users, the answers by others (especially czerny) can be generalized to a nice type-safe and reusable base class:

/**
 * Map that stringifies the key objects in order to leverage
 * the javascript native Map and preserve key uniqueness.
 */
abstract class StringifyingMap<K, V> {
    private map = new Map<string, V>();
    private keyMap = new Map<string, K>();

    has(key: K): boolean {
        let keyString = this.stringifyKey(key);
        return this.map.has(keyString);
    }
    get(key: K): V | undefined {
        let keyString = this.stringifyKey(key);
        return this.map.get(keyString);
    }
    set(key: K, value: V): StringifyingMap<K, V> {
        let keyString = this.stringifyKey(key);
        this.map.set(keyString, value);
        this.keyMap.set(keyString, key);
        return this;
    }

    /**
     * Puts new key/value if key is absent.
     * @param key key
     * @param defaultValue default value factory
     */
    putIfAbsent(key: K, defaultValue: () => V): boolean {
        if (!this.has(key)) {
            let value = defaultValue();
            this.set(key, value);
            return true;
        }
        return false;
    }

    keys(): IterableIterator<K> {
        return this.keyMap.values();
    }

    keyList(): K[] {
        return [...this.keys()];
    }

    delete(key: K): boolean {
        let keyString = this.stringifyKey(key);
        let flag = this.map.delete(keyString);
        this.keyMap.delete(keyString);
        return flag;
    }

    clear(): void {
        this.map.clear();
        this.keyMap.clear();
    }

    size(): number {
        return this.map.size;
    }

    /**
     * Turns the `key` object to a primitive `string` for the underlying `Map`
     * @param key key to be stringified
     */
    protected abstract stringifyKey(key: K): string;
}

An example implementation is then simple: just override the stringifyKey method. In my case I stringify a uri property.

class MyMap extends StringifyingMap<MyKey, MyValue> {
    protected stringifyKey(key: MyKey): string {
        return key.uri.toString();
    }
}

Usage is then the same as for a regular Map<K, V>.

const key1 = new MyKey(1);
const key2 = new MyKey(1); // distinct object, assumed to stringify to the same key
const value1 = new MyValue(1);
const value2 = new MyValue(2);

const myMap = new MyMap();
myMap.set(key1, value1);
myMap.set(key2, value2); // a native Map would add a second key/value pair here

myMap.size(); // returns 1, not 2

Jan Dolejsi
-3

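For anyone who found this question on Google (as I did) wanting to get a value from a Map using an object as the key:

Warning: this answer will not work with all objects. The key depends on property insertion order, and JSON.stringify drops undefined and function-valued properties.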

var map = new Map<string,string>();

map.set(JSON.stringify({"A":2} /*string of object as key*/), "Worked");

console.log(map.get(JSON.stringify({"A":2}))||"Not worked");

Output:

Worked
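
To make the warning concrete, here is a small sketch of cases where a JSON.stringify key misbehaves. The variable names are only for illustration.

```javascript
// 1. Property order matters, so "equal" objects can produce different keys
const sameKeyDespiteOrder =
  JSON.stringify({a: 1, b: 2}) === JSON.stringify({b: 2, a: 1});
console.log(sameKeyDespiteOrder); // false

// 2. undefined (and function) properties are dropped, so different objects collide
const undefinedCollides = JSON.stringify({a: undefined}) === JSON.stringify({});
console.log(undefinedCollides); // true

// 3. NaN and Infinity are serialized as null
const nanCollides = JSON.stringify({a: NaN}) === JSON.stringify({a: null});
console.log(nanCollides); // true

// 4. Circular references cannot be serialized at all
const loop = {};
loop.self = loop;
let circularThrows = false;
try {
  JSON.stringify(loop);
} catch (e) {
  circularThrows = e instanceof TypeError;
}
console.log(circularThrows); // true
```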

WiseTap