-1

In C++, what's the quickest way (or decent way) to check each element in a string vector of size approx. 800,000 to see if it's in another string vector of approx. size 200,000? My goal is to push all the strings of the first that are found in the second into a third.

My beginner attempt is never going to stop running:

vector<string> combosVsWords(vector<string> words, vector<string> lettercombos)
{
    vector<string> firstwords;

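    for (size_t i = 0; i != lettercombos.size(); i++)
    {
        // Linear scan of words for every combo: roughly 800,000 x 200,000 comparisons
        if (find(words.begin(), words.end(), lettercombos[i]) != words.end())
            firstwords.push_back(lettercombos[i]);
    }

    return firstwords;
}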
Austin
  • Could you possibly put each vector into its own STL set and create a third set using the STL intersection function? – Jacob Calvert Jul 05 '15 at 04:05
  • To be honest I have no idea what that means I'm very new to C++ and programming in general really. edit: oh, standard library, hmm let me research that a bit. – Austin Jul 05 '15 at 04:06
  • If the vectors you're passing in hold that many strings, you should pass them to your function by const reference, not by value. Second, is it OK to have your words and lettercombos sorted? If so, then the first suggestion of using `std::set_intersection` would be an option. – PaulMcKenzie Jul 05 '15 at 04:08
  • do I just put `const` in front? Hopefully that's not a pointer sort of thing because I haven't got to those yet. Yep sorting is okay. I'll check out the intersection function. – Austin Jul 05 '15 at 04:09
  • @AustinMW `Hopefully that's not a pointer sort of thing` You asked "what is the quickest way". Nothing else states what you can or cannot use. Anyway, I posted an answer that shows, if it is OK to sort the vectors, a way to get the intersection of the two vectors. – PaulMcKenzie Jul 05 '15 at 04:18

2 Answers

2

If the vectors can be sorted, then the following should work using std::set_intersection:

#include <algorithm>
#include <vector>
#include <string>
#include <iterator>
//...
using namespace std;

vector<string> combosVsWords(vector<string>& words, 
                             vector<string>& lettercombos)
{
    vector<string> firstwords;

    // Sort the vectors 
    sort(words.begin(), words.end());
    sort(lettercombos.begin(), lettercombos.end());

    // get the set intersection of the vectors and place
    // the result in firstwords
    set_intersection(words.begin(), words.end(), lettercombos.begin(), 
                     lettercombos.end(), back_inserter(firstwords));

    return firstwords;
}
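
If sorting the caller's vectors in place is not acceptable, or the result should keep the order of lettercombos, one possible variation is sketched below. It is not the function above: the name combosInWords is made up for illustration, it takes both vectors by const reference, sorts a private copy of words, and looks each combo up with std::binary_search. It assumes a C++11 compiler for the range-based for loop.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical variant (name chosen for illustration): returns every string
// of lettercombos that also occurs in words, in lettercombos order, without
// modifying either input vector.
std::vector<std::string> combosInWords(const std::vector<std::string>& words,
                                       const std::vector<std::string>& lettercombos)
{
    // Sort a private copy of words so it can be binary-searched.
    std::vector<std::string> sortedWords(words);
    std::sort(sortedWords.begin(), sortedWords.end());

    std::vector<std::string> found;
    for (const std::string& combo : lettercombos)
    {
        // binary_search requires a sorted range; each lookup is O(log n)
        // instead of the O(n) scan that std::find performs.
        if (std::binary_search(sortedWords.begin(), sortedWords.end(), combo))
            found.push_back(combo);
    }
    return found;
}
```

Unlike set_intersection, this keeps a combo as many times as it appears in lettercombos, so pick whichever behaviour matches what you need.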
PaulMcKenzie
1

What you could do is put each vector into a set, like:

std::set<std::string> setA (vectorA.begin(), vectorA.end()), setB (vectorB.begin(), vectorB.end());

Then get the sets' intersection like:

std::set<std::string> intersect;

std::set_intersection(setA.begin(), setA.end(), setB.begin(), setB.end(),
                      std::inserter(intersect, intersect.begin()));

The values in intersect will be the overlapping values from setA and setB.
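
Putting the pieces together, a minimal self-contained sketch might look like the following. The function name commonStrings is an assumption made for illustration; vectorA and vectorB stand in for your two vectors.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (name chosen for illustration): returns the strings
// that appear in both vectors, as a sorted set with no duplicates.
std::set<std::string> commonStrings(const std::vector<std::string>& vectorA,
                                    const std::vector<std::string>& vectorB)
{
    // Building a std::set sorts the strings and drops duplicates.
    std::set<std::string> setA(vectorA.begin(), vectorA.end());
    std::set<std::string> setB(vectorB.begin(), vectorB.end());

    // set_intersection walks both sorted ranges once and copies the
    // elements present in both into intersect.
    std::set<std::string> intersect;
    std::set_intersection(setA.begin(), setA.end(), setB.begin(), setB.end(),
                          std::inserter(intersect, intersect.begin()));
    return intersect;
}
```

Note that because sets hold unique values, a string that occurs several times in either vector shows up only once in the result.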

These questions might also help:

  • How to convert a vector to a set
  • How to get set intersection

Jacob Calvert