molbdnilo's answer is great and simple.

Here is one overkill way to do it (C++11 only). It should perform better on huge datasets, and the code stays nicely small. You can also change what counts as "similar" elements by providing a comparison function other than std::less (see the sketch after the code).
#include <iostream>
#include <string>
#include <set>
#include <unordered_set>
#include <list>
#include <algorithm>
#include <functional> // for std::reference_wrapper and std::less
using namespace std;
// A sorted set of references to the original strings, ordered with std::less by default.
template <typename T> using intersection_set = set<reference_wrapper<T const>, less<T>>;
list<string> compare_list_warning(list<string> const& list_one, list<string> const& list_two)
{
    // Sets of references: no string is copied, and duplicates are dropped.
    intersection_set<string> set_one, set_two;
    set_one.insert(begin(list_one), end(list_one));
    set_two.insert(begin(list_two), end(list_two));

    list<string> output;
    set_intersection(begin(set_one), end(set_one), begin(set_two), end(set_two),
                     back_inserter(output), less<string>());
    return output;
}
int main() {
    list<string> common_names = compare_list_warning(
        {"Roger", "Marcel", "Camille", "Hubert"},
        {"Huguette", "Cunegond", "Marcelle", "Camille"});

    for (string const& common : common_names) {
        std::cout << "Common element: " << common << "\n";
    }
    std::cout << std::flush;
    return 0;
}
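For example, here is what a case-insensitive notion of "similar" could look like (a minimal sketch of mine, not part of the code above; case_insensitive_less is a made-up name):

#include <cctype> // needed on top of the headers above, for std::tolower

// Hypothetical comparator: orders strings case-insensitively, so
// "Camille" and "CAMILLE" end up being treated as the same element.
struct case_insensitive_less {
    bool operator()(string const& a, string const& b) const {
        return lexicographical_compare(
            begin(a), end(a), begin(b), end(b),
            [](char x, char y) {
                return tolower(static_cast<unsigned char>(x))
                     < tolower(static_cast<unsigned char>(y));
            });
    }
};

// Plug it in instead of less<T>:
using ci_set = set<reference_wrapper<string const>, case_insensitive_less>;

The same comparator object then replaces less<string>() in the set_intersection call.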
EDIT
AK_'s answer is interesting in terms of complexity: hashing brings the average cost down to O(n + m), versus O((n + m) log(n + m)) for building the sorted sets above. Though I would have written it like this:
list<string> compare_list_warning(list<string> const& list_one, list<string> const& list_two)
{
    // Hash the smaller list, then scan the bigger one.
    list<string> const& small_list = list_one.size() < list_two.size() ? list_one : list_two;
    list<string> const& big_list   = list_one.size() < list_two.size() ? list_two : list_one;

    list<string> duplicates;
    unordered_set<string> duplicate_finder;
    duplicate_finder.reserve(small_list.size());
    duplicate_finder.insert(begin(small_list), end(small_list));

    copy_if(begin(big_list), end(big_list), back_inserter(duplicates),
            [&](string const& v) { return duplicate_finder.find(v) != end(duplicate_finder); });
    return duplicates;
}
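One behavioral difference worth noting: unlike the set_intersection version, this one copies a common element once per occurrence in big_list. If you want each common element reported only once, erasing on the first hit is a small tweak (my variation, not part of AK_'s answer):

copy_if(begin(big_list), end(big_list), back_inserter(duplicates),
        // erase() returns the number of elements removed (0 or 1), so each
        // common element passes the filter exactly once.
        [&](string const& v) { return duplicate_finder.erase(v) > 0; });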
EDIT 2
AK_'s algorithm is the fastest: in my performance tests, it runs about 10 times faster than mine. You can find the full code for the performance test here. If you need to avoid the hash-collision risk, a set can be used instead of the unordered_set, and it is still a bit faster than my version (see the sketch below).
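The swap itself is tiny (a sketch; note that std::set has no reserve(), so that line goes away):

// Ordered set instead of hash set: no collision risk, O(log n) lookups.
set<string> duplicate_finder;
duplicate_finder.insert(begin(small_list), end(small_list));

The rest of the function is unchanged.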