26

I have the following code to make an unordered_set<Interval>. This compiles fine.

struct Interval {
  unsigned int begin;
  unsigned int end;
  bool updated;   //true if concat.  initially false
  int patternIndex;  //pattern index. valid for single pattern
  int proteinIndex;   //protein index.  for retrieving the pattern
};

struct Hash {
  size_t operator()(const Interval &interval);
};

size_t Hash::operator()(const Interval &interval){
  string temp = to_string(interval.begin) + to_string(interval.end) + to_string(interval.proteinIndex);
  return hash<string>()(temp);
}

unordered_set<Interval, string, Hash> test;

However, I cannot compile when I try to insert using this code:

for(list<Interval>::iterator i = concat.begin(); i != concat.end(); ++i){
  test.insert((*i));
}

Moreover, I cannot determine what the problem is from the error messages, for example:

note: candidate is:
note: size_t Hash::operator()(const Interval&)
note:   candidate expects 1 argument, 2 provided  

I thought I only provided 1 argument...

What is the problem with my insertion code?


Here's the new instantiation code: unordered_set<Interval, Hash> test; However, I'm still receiving a slew of error messages, for example:

note: candidate is:
note: size_t Hash::operator()(const Interval&) <near match>
note:   no known conversion for implicit ‘this’ parameter from ‘const Hash*’ to ‘Hash*’
honk
  • 7,217
  • 11
  • 62
  • 65
user2052561
  • 1,209
  • 2
  • 12
  • 18

2 Answers2

40

First problem:

You are passing string as the second template argument for your instantiation of the unordered_set<> class template. The second argument should be the type of your hasher functor, and std::string is not a callable object.

Perhaps meant to write:

unordered_set<Interval, /* string */ Hash> test;
//                      ^^^^^^^^^^^^
//                      Why this?

Also, I would suggest using names other than begin and end for your (member) variables, since those are names of algorithms of the C++ Standard Library.

Second problem:

You should keep in mind, that the hasher function should be qualified as const, so your functor should be:

struct Hash {
   size_t operator() (const Interval &interval) const {
   //                                           ^^^^^
   //                                           Don't forget this!
     string temp = to_string(interval.b) + 
                   to_string(interval.e) + 
                   to_string(interval.proteinIndex);
     return (temp.length());
   }
};

Third problem:

Finally, if you want std::unordered_set to be able to work with objects of type Interval, you need to define an equality operator consistent with your hash function. By default, if you do not specify any type argument as the third parameter of the std::unordered_set class template, operator == will be used.

You currently do not have any overload of operator == for your class Interval, so you should provide one. For example:

inline bool operator == (Interval const& lhs, Interval const& rhs)
{
    return (lhs.b == rhs.b) && 
           (lhs.e == rhs.e) && 
           (lhs.proteinIndex == rhs.proteinIndex); 
}

Conclusion:

After all the above modifications, you can see your code compiling in this live example.

Andy Prowl
  • 114,596
  • 21
  • 355
  • 432
  • Is there a way to wrap `operator ==` in a `struct`? As it is, I get errors about multiple definitions of `==`. If I declare it in a `struct` in my header file and define it in my .cpp file (as I'm currently doing for other operators), it gives me errors about it requiring exactly one argument. Ex: `bool IntervalEquality::operator==(const Interval&, const Interval&)’ must take exactly one argument` – user2052561 Apr 08 '13 at 00:54
  • @user2052561: The problem with multiple definition is probably happens because you have that function definition in one header and you are importing it in multiple translation units (i.e. `#include`ing it from several `.cpp` files). Just use the `inline` keyword to suppress this problem (as I show in my updated answer). – Andy Prowl Apr 08 '13 at 07:28
  • 1
    @user2052561: Alternatively, if you want to overload it as a member function, then you should remove the `lhs` parameter, which is implicitly. The `this` pointer will refer to the left hand `Interval` object. – Andy Prowl Apr 08 '13 at 07:29
  • It's worth noting that this hash function can produce lots of easily-avoidable collisions – an `Interval` with `b == 12` and `e == 3` will produce the same hash as an `Interval` with `b == 1` and `e == 23`. – ildjarn Apr 17 '13 at 23:22
  • @ildjarn: Sure, that's true. My hash function does not have any ambition to be a good hash function - I just wanted to point out what was missing conceptually and provide a minimal working filler. – Andy Prowl Apr 17 '13 at 23:32
  • Understood – I guess my comment was directed more towards the OP than your answer, sorry for posting it in the wrong place. :-P – ildjarn Apr 17 '13 at 23:33
  • @AndyProwl, I am not very sure whether the hash function and equal function are defined in the customer's class as what java does. Is there a better way that makes these functions related with the class closer ? – city Dec 20 '13 at 06:01
  • This answer is awesome, I had to look through like 12 SO questions before finding the thing about the inline operator. Thank you so much for answering this. – Onir Sep 27 '17 at 23:00
  • 1
    @Andy Prowl - The "live example" link is not alive; it is a dead link: http://liveworkspace.org/code/4yc7Hy%243799. Can you paste the working code directly into the answer? – David J. Dec 07 '17 at 19:19
0

I think, Andy Prowl perfectly fixed the problems with your code. However, I would add the following member function to your Interval, which describes what makes two intervals identical:

std::string getID() const { return std::to_string(b) + " " + std::to_string(e) + " " + std::to_string(proteinIndex); }

Please note, that I also followed Andy Prowl's suggestion and renamed the members begin to b and end to e. Next, you can easily define the hash and comparison functions by using lambda expressions. As a result, you can define your unordered_set as follows:

auto hash = [](const Interval& i){ return std::hash<std::string>()(i.getID()); };
auto equal = [](const Interval& i1, const Interval& i2){ return i1.getID() == i2.getID(); };
std::unordered_set<Interval, decltype(hash), decltype(equal)> test(8, hash, equal);

Finally, for reasons of readability, I converted your for loop into a range-based for loop:

std::list<Interval> concat {{1, 2, false, 3, 4}, {2, 3, false, 4, 5}, {1, 2, true, 7, 4}};

for (auto const &i : concat)
    test.insert(i);

for (auto const &i : test)
    std::cout << i.b << ", " << i.e << ", " << i.updated << std::endl;

Output (I just printed first three members of each Interval):

2, 3, 0
1, 2, 0

As you can see, there are only two intervals printed. The third one ({1, 2, true, 7, 4}) was not inserted to concat, because its b, e, and proteinIndexare equal to that of the first interval ({1, 2, false, 3, 4}).

Code on Ideone

honk
  • 7,217
  • 11
  • 62
  • 65