molbdnilo's answer is great and simple.

Here is one overkill way to do it (C++11 only). It should perform better on huge datasets, and the code stays nicely small. You can also change what counts as "similar" elements by providing a comparison function other than std::less (see the sketch after the code).
#include <iostream>
#include <string>
#include <set>
#include <unordered_set>
#include <list>
#include <algorithm>
#include <functional> // for std::reference_wrapper and std::less
using namespace std;
// A sorted set of references to the original strings, ordered with std::less by default.
template <typename T> using intersection_set = set<reference_wrapper<T const>, less<T>>;
list<string> compare_list_warning(list<string> const& list_one, list<string> const& list_two)
{
    // Sets of references: no string is copied, and duplicates are dropped.
    intersection_set<string> set_one, set_two;
    set_one.insert(begin(list_one), end(list_one));
    set_two.insert(begin(list_two), end(list_two));

    list<string> output;
    set_intersection(begin(set_one), end(set_one), begin(set_two), end(set_two),
                     back_inserter(output), less<string>());
    return output;
}
int main() {
    list<string> common_names = compare_list_warning(
        {"Roger", "Marcel", "Camille", "Hubert"},
        {"Huguette", "Cunegond", "Marcelle", "Camille"});

    for (string const& common : common_names) {
        std::cout << "Common element: " << common << "\n";
    }
    std::cout << std::flush;
    return 0;
}
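For example, here is what a case-insensitive notion of "similar" could look like (a minimal sketch of mine, not part of the code above; case_insensitive_less is a made-up name):

#include <cctype> // needed on top of the headers above, for std::tolower

// Hypothetical comparator: orders strings case-insensitively, so
// "Camille" and "CAMILLE" end up being treated as the same element.
struct case_insensitive_less {
    bool operator()(string const& a, string const& b) const {
        return lexicographical_compare(
            begin(a), end(a), begin(b), end(b),
            [](char x, char y) {
                return tolower(static_cast<unsigned char>(x))
                     < tolower(static_cast<unsigned char>(y));
            });
    }
};

// Plug it in instead of less<T>:
using ci_set = set<reference_wrapper<string const>, case_insensitive_less>;

The same comparator object then replaces less<string>() in the set_intersection call.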
EDIT
AK_'s answer is interesting in terms of complexity: hashing brings the average cost down to O(n + m), versus O((n + m) log(n + m)) for building the sorted sets above. Though I would have written it like this:
list<string> compare_list_warning(list<string> const& list_one, list<string> const& list_two)
{
    // Hash the smaller list, then scan the bigger one.
    list<string> const& small_list = list_one.size() < list_two.size() ? list_one : list_two;
    list<string> const& big_list   = list_one.size() < list_two.size() ? list_two : list_one;

    list<string> duplicates;
    unordered_set<string> duplicate_finder;
    duplicate_finder.reserve(small_list.size());
    duplicate_finder.insert(begin(small_list), end(small_list));

    copy_if(begin(big_list), end(big_list), back_inserter(duplicates),
            [&](string const& v) { return duplicate_finder.find(v) != end(duplicate_finder); });
    return duplicates;
}
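One behavioral difference worth noting: unlike the set_intersection version, this one copies a common element once per occurrence in big_list. If you want each common element reported only once, erasing on the first hit is a small tweak (my variation, not part of AK_'s answer):

copy_if(begin(big_list), end(big_list), back_inserter(duplicates),
        // erase() returns the number of elements removed (0 or 1), so each
        // common element passes the filter exactly once.
        [&](string const& v) { return duplicate_finder.erase(v) > 0; });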
EDIT 2
AK_'s algorithm is the fastest: in my performance tests, it runs about 10 times faster than mine. You can find the full code for the performance test here. If you need to avoid the hash-collision risk, a set can be used instead of the unordered_set, and it is still a bit faster than my version (see the sketch below).
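The swap itself is tiny (a sketch; note that std::set has no reserve(), so that line goes away):

// Ordered set instead of hash set: no collision risk, O(log n) lookups.
set<string> duplicate_finder;
duplicate_finder.insert(begin(small_list), end(small_list));

The rest of the function is unchanged.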