1
#include <unordered_set>
#include <iostream>

class edge{
    public:
        float a1;
        float a2;

};

struct comp{
    bool operator()(const edge& e1, const edge& e2) const {
        return true;
        return (
            (e1.a1==e2.a1 && e1.a2==e2.a2) ||
            (e1.a1==e2.a2 && e1.a2==e2.a1)
        );
    };
};
struct hash{
    size_t operator()(const edge& e1) const {
        // return std::hash<float>()(e1.a1+e1.a2);
        return std::hash<float>()(e1.a1+e1.a2*2);
    };
};


int main() {
    std::unordered_set<edge,hash,comp> s1;
    s1.insert(edge{1.1,2.2});
    s1.insert(edge{2.2,1.1});
    for( auto& it : s1 ) {
        std::cout << it.a1 << " " << it.a2 << "\n";
    }
    std::cout << "s1.size " << s1.size() << "\n";
}

I realize that if different element has same hash value, then they are considered equal, but I just want this unordered_set use comparator that I define, just ignore hash?

How to achieve that?

I know i can use set, but using set need to consider order, if a < b is true and b < a is also true, then this element will not be inserted successfully, Sometimes, It is hard to provide order.

If anyone can help, much appreciated


edited: My intention is to let two edges called e1,e2, they are same if (e1.a1==e2.a1&&e1.a2==e2.a2)or(e1.a1==e2.a2 && e1.a2==e2.a1) as I provided in struct comp. but when i test. it seems hash function can change the comparison too. Someone says the way I define hash and comparator result in undefined behaviour. Is that true? why? if true, how to solve this? I just want comparator decide which one is satisfied to be inserted in unordered_set without duplicate. And really do not care about hash.

BTW, thanks for some many people replying

Athos
  • 115
  • 1
  • 10
  • Does `std::unordered_multiset`fit your requirements? – Mestkon Dec 03 '19 at 11:40
  • 1
    Ordering is usually not hard if you use `std::tie`. – Ted Lyngmo Dec 03 '19 at 11:45
  • 1
    Your intention is to have `size() == 1`? – Evg Dec 03 '19 at 11:46
  • 2
    *"I realize that if different element has same hash value, then they are considered equal"* -not true. The point of unordered set hash+comparator is to provide constant time hashing and linear subtime (k) comparison. A unordered_set is still a *set* (crazy, right?). The elements are unique. That uniqueness is determined by finding the proper collision bucket in the current hash table via the `hash`, then searching for a matching element in the collision bucket via `compare`. If that isn't the behavior you want, then stop trying to shove a square peg in a round hole; use a different container. – WhozCraig Dec 03 '19 at 11:47
  • I believe it's not much faster to compare hashes than to compare a pair of `float`s. And your hash looks kinda weak anyway. So I'd consider using `std::set` instead. – Denis Sheremet Dec 03 '19 at 11:52
  • @Evg yeah, i want size() == 1 – Athos Dec 03 '19 at 12:35

2 Answers2

3

If you want to handle edge.a1 and edge.a2 interchangeably, you have to implement a hashing function that returns the same value even when they are swapped. I advise against using addition, because addition may not be commutative for floats, but you could sort them by size and combine the hashes afterwards:

struct hash {
  size_t operator()(const edge& e1) const {
    auto h1 = std::hash<float>{}(std::min(e1.a1, e1.a2));
    auto h2 = std::hash<float>{}(std::max(e1.a1, e1.a2));
    return h1 ^ (h2 << 1)
  };
};

This only makes sense for pretty large sets of floats, because otherwise the hashing overhead probably exceeds the benefit of using a hashed data structure in the first place.

Old answer for reference:

Objects with the same hash are not considered equal in unordered_set. They are just stored in the same bucket. There is a KeyEqual template parameter for the comparison, which by default uses the operator== of your Key. So your main problem is, that comp should implement e1 == e2 and not e1 < e2 (and should probably be called equal).

The hash is just used to speed up the search, insertion, and removal of elements.

On another note, you may want to use the hashes of the member variables instead of the values themselves to compute the hash of edge:

struct hash {
  size_t operator()(const edge& e1) const {
    auto h1 = std::hash<float>{}(e1.a1);
    auto h2 = std::hash<float>{}(e1.a2);
    return h1 ^ (h2 << 1)
  };
};

This way, you won't get the same hash for two edges with swapped coordinates. This way of combining hashes is suggested here (but is not a good way to combine more than two).

eike
  • 969
  • 5
  • 17
  • If I understood the question correctly, the intention was to have `comp` ignore the order of `a1` and `a2`. And then the problem is that the OP's hash function is not correct. – Evg Dec 03 '19 at 12:15
  • @Evg Well if OP implements an equals operation instead of a comparison for `KeyEqual` that solves the problem. – eike Dec 03 '19 at 12:25
  • @eike the intention is to ignore order of a1 and a2 since for edge, it does not matter which one is first – Athos Dec 03 '19 at 12:49
  • @Athos Ok, that explains a lot. I completely misunderstood your question. Editing now... – eike Dec 03 '19 at 13:01
1

You don't have to use the members of edge to provide the hash. It is only required that equal values have equal hashes. A "always valid" hash is

struct hash{
    size_t operator()(const edge& e1) const {
        return 0;
    };
};

But it seems your original attempt is better

struct hash{
    size_t operator()(const edge& e1) const {
        return std::hash<float>{}(e1.a1 + e1.a2); // + is commutative
    };
};
Caleth
  • 35,377
  • 2
  • 31
  • 53
  • That would render every benefit of `unordered_set` nullified. Why not just use `set` then? – eike Dec 03 '19 at 11:59
  • @eike Not every benefit. OP can still use their *KeyEqual* (confusingly called `comp`) – Caleth Dec 03 '19 at 12:00
  • But his implementation, as it seems, is not supposed to implement an equals operation. – eike Dec 03 '19 at 12:03
  • With your `hash`, `size()` returns 1, with OP's `hash`, `size()` returns 2. Why? – Evg Dec 03 '19 at 12:04
  • 1
    @Evg OPs `hash` + `comp` gives undefined behaviour – Caleth Dec 03 '19 at 12:05
  • Different hashes for same values... Got it. – Evg Dec 03 '19 at 12:06
  • @Evg because for OPs version of comp: `comp(edge{1.1, 2.2}, edge{2.2, 1.1}) == true` – eike Dec 03 '19 at 12:07
  • @Evg No, the hash for two equal values is always the same, even with OPs implementation. – eike Dec 03 '19 at 12:08
  • 2
    @eike, "same" as defined by `comp`. – Evg Dec 03 '19 at 12:09
  • @Evg Oh sorry, misunderstood that. – eike Dec 03 '19 at 12:11
  • @Caleth Now that I understand the question (and hence your answer), I have to advise against addition, because that is not commutative for `float`. – eike Dec 03 '19 at 13:06
  • @eike it's commutative, but not associative – Caleth Dec 03 '19 at 13:08
  • @Caleth According to https://stackoverflow.com/a/24446382/4253931 it isn't, according to other sources it is. I don't know who to trust on this one. In general IEEE754 float addition is commutative, but I haven't found a quote from the C++ the standard about this. – eike Dec 03 '19 at 13:21