10

I have a situation where I will be receiving 2+ ArrayList<Widget> and I need to be able to merge all the lists and remove any duplicate Widget so that I wind up with only 1 ArrayList<Widget> that contains all Widgets from all the merged lists, but without any duplicates.

Assume Widget has an overridden equals method that can be used for determining whether two Widgets are duplicates, although there may be a better way:

public ArrayList<Widget> mergeAndRemoveDupes(ArrayList<Widget>... widgets) {
    // ???
}

Looking for the most algorithmically efficient way of accomplishing this. I am happy to use Apache Commons or any other open source libs that would help me out too! Thanks in advance!

IAmYourFaja

3 Answers

11

Add the contents of each ArrayList<Widget> to a single Set<Widget> using addAll. Use a HashSet if Widgets are hashable, or a TreeSet if they can be ordered in some way. A Set never contains duplicates, so adding an element that is already present is simply ignored.

You can convert this Set back into an (Array)List if you need to at the end.

Note you will need to implement hashCode for your Widget class if you decide to use a HashSet, but if you have an overridden equals, you should do this anyway.
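As a rough sketch of the approach described above (not the only way to do it), the method from the question could look something like this. The nested Widget class is only a stand-in for your real class, and the choice of LinkedHashSet (rather than a plain HashSet) is my own assumption so that the merged list keeps first-seen order; a HashSet works the same way if order does not matter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

public class WidgetMerger {

    // Minimal stand-in for the real Widget class (an assumption for this sketch).
    public static final class Widget {
        private final String name;
        private final int id;

        public Widget(String name, int id) {
            this.name = name;
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof Widget)) return false;
            Widget w = (Widget) o;
            return id == w.id && name.equals(w.name);
        }

        // Must agree with equals: equal Widgets produce equal hash codes.
        @Override
        public int hashCode() {
            return Objects.hash(name, id);
        }
    }

    // Merges any number of lists, dropping Widgets that are equal to one already seen.
    @SafeVarargs
    public static ArrayList<Widget> mergeAndRemoveDupes(List<Widget>... lists) {
        Set<Widget> unique = new LinkedHashSet<>(); // keeps first-seen order
        for (List<Widget> list : lists) {
            unique.addAll(list); // duplicates are silently ignored
        }
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Widget> first = Arrays.asList(new Widget("a", 1), new Widget("b", 2));
        List<Widget> second = Arrays.asList(new Widget("b", 2), new Widget("c", 3));
        System.out.println(mergeAndRemoveDupes(first, second).size()); // prints 3
    }
}
```

With a hash-based set, each insertion is expected constant time, so the whole merge is roughly linear in the total number of Widgets, assuming hashCode spreads values reasonably well.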

Edit: Here's an example:

//Either the class itself needs to implement Comparable<T>, or a
//Comparator needs to be passed into the TreeSet
public class Widget implements Comparable<Widget>
{
    private final String name;
    private final int id;

    Widget(String n, int i)
    {
        name = n;
        id = i;
    }

    public String getName()
    {
        return name;
    }

    public int getId()
    {
        return id;
    }

    //Something like this already exists in your class
    @Override
    public boolean equals(Object o)
    {
        if(o != null && (o instanceof Widget)) {
            return ((Widget)o).getName().equals(name) &&
                   ((Widget)o).getId() == id;
        }
        return false;
    }

    //This is required for HashSet
    //Note that if you override equals, you should override this
    //as well. See: http://stackoverflow.com/questions/27581/overriding-equals-and-hashcode-in-java
    @Override 
    public int hashCode()
    {
        return ((Integer)id).hashCode() + name.hashCode();
    }

    //This is required for TreeSet
    @Override
    public int compareTo(Widget w)
    {
        if(id < w.getId()) return -1;
        else if(id > w.getId()) return 1;
        return name.compareTo(w.getName());
    }

    @Override 
    public String toString()
    {
        return "Widget: " + name + ", id: " + id;
    }
}

If you want to use a TreeSet but don't want to implement Comparable<T> on your Widget class, you can give the set itself a Comparator object:

private Set<Widget> treeSet;
....
treeSet = new TreeSet<Widget>(new Comparator<Widget>() {
    @Override
    public int compare(Widget w1, Widget w2)
    {
        if(w1.getId() < w2.getId()) return -1;
        else if(w1.getId() > w2.getId()) return 1;
        return w1.getName().compareTo(w2.getName());
    }
});
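If you are on Java 8 or later (an assumption; the question does not say which version is in use), the same ordering can be sketched more compactly with the Comparator factory methods. The Widget class here is a bare stand-in with only the two getters, and the names WidgetOrdering, BY_ID_THEN_NAME and newWidgetSet are made up for illustration.

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class WidgetOrdering {

    // Bare stand-in for Widget: no equals, hashCode or compareTo needed here.
    public static final class Widget {
        private final String name;
        private final int id;

        public Widget(String name, int id) {
            this.name = name;
            this.id = id;
        }

        public String getName() { return name; }
        public int getId() { return id; }
    }

    // Orders by id first, then by name, like the anonymous Comparator above.
    public static final Comparator<Widget> BY_ID_THEN_NAME =
            Comparator.comparingInt(Widget::getId).thenComparing(Widget::getName);

    // A TreeSet treats two elements as duplicates when compare(...) returns 0.
    public static Set<Widget> newWidgetSet() {
        return new TreeSet<>(BY_ID_THEN_NAME);
    }
}
```

Note that a TreeSet decides what counts as a duplicate from the comparator alone, not from equals, so the comparator should return 0 exactly when two Widgets are meant to be the same.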
Yuushi
  • Wow thanks @Yuushi (+1) - will I get a runtime exception if I try to add a dupe to the set? Or will Java just ignore the added dupe (which is what I want). Thanks again! – IAmYourFaja May 09 '13 at 02:18
  • Java will just ignore the added duplicate – Zim-Zam O'Pootertoot May 09 '13 at 02:19
  • @IAmYourFaja It will simply ignore the dupe. – Yuushi May 09 '13 at 02:19
  • Assuming that each list is already mathematically a set, it would be faster to use Set.addAll instead of adding one element at a time. – Chris Pitman May 09 '13 at 02:20
  • Chris, addAll takes a collection, not only a set. Therefore any List is also a valid candidate for [addAll](http://docs.oracle.com/javase/6/docs/api/java/util/Set.html#addAll(java.util.Collection)) provided its containing class responds to hashCode – hd1 May 09 '13 at 02:23
  • Last followup @Yuushi (+1 again) - remember, these individual lists are actually coming from totally separate sources before they get passed to the `mergeAndRemoveDupes` method. So I *need* to use `equals()` (or something similar) to guarantee dupes don't wind up in the final (merged) list. So I ask: how does `Set` know to ignore dupes? `equals()`? Something else? Thanks again! – IAmYourFaja May 09 '13 at 02:35
  • @IAmYourFaja `HashSet` uses `hashCode` - two things that hash to the same value are considered equal. `TreeSet` uses a comparator - thus expecting your class to be of the form `Widget implements Comparable<Widget>`, or, when you construct the `TreeSet`, to be given a `Comparator` (for example, `new TreeSet<Widget>(new Comparator<Widget>() {...})`). – Yuushi May 09 '13 at 02:45
  • @IAmYourFaja I've added an example of what you (may) need to add to your `Widget` class to allow addition into a `Set`. – Yuushi May 09 '13 at 03:08
  • @Yuushi Two items with the same hash code are considered *possibly equal*. Set will still call equals after a hash match to check if the items are actually the same! – Chris Pitman May 09 '13 at 13:42
  • @ChrisPitman Thanks...I've been in C++ land for too long, I haven't touched Java for a while, so I'm forgetting some of the intricacies. – Yuushi May 10 '13 at 01:29
  • Can this affect performance when we are dealing with thousands of objects? – Genjuro Jul 16 '13 at 15:55
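To make the points from the comments concrete, here is a small sketch of how HashSet treats duplicates and hash collisions. The Key class is hypothetical and deliberately returns the same hash code for every instance, which is something you would never do in real code; it is only there to force a collision.

```java
import java.util.HashSet;
import java.util.Set;

public class HashSetDuplicateDemo {

    // Hypothetical element type whose hash codes always collide.
    public static final class Key {
        private final String value;

        public Key(String value) {
            this.value = value;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).value.equals(value);
        }

        @Override
        public int hashCode() {
            return 42; // every Key lands in the same bucket
        }
    }

    public static void main(String[] args) {
        Set<Key> set = new HashSet<>();
        System.out.println(set.add(new Key("a"))); // true: new element
        System.out.println(set.add(new Key("b"))); // true: same hash, but equals says different
        System.out.println(set.add(new Key("a"))); // false: duplicate ignored, no exception
        System.out.println(set.size());            // 2
    }
}
```

So add never throws for a duplicate; it just returns false. As for performance, lookups in a HashSet are expected constant time when hash codes are well distributed, so thousands of objects should normally not be a problem, whereas a degenerate hashCode like the one above makes every insertion slower.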
9

I would do it this way

Set<Widget> set = new HashSet<>(list1);
set.addAll(list2);
List<Widget> mergeList = new ArrayList<>(set);
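For an arbitrary number of lists, one possible alternative on Java 8 or later (again an assumption about the Java version) is a stream with distinct(), which also relies on equals and hashCode. The class and method names below are made up for this sketch, and the method is generic so it is not tied to Widget.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class StreamMerge {

    // Flattens all lists into one stream and keeps the first occurrence of each element.
    @SafeVarargs
    public static <T> List<T> mergeDistinct(List<T>... lists) {
        return Stream.of(lists)
                .flatMap(List::stream)
                .distinct() // uses equals/hashCode, like a HashSet
                .collect(Collectors.toList());
    }
}
```

Calling `StreamMerge.mergeDistinct(list1, list2)` should give the same elements as the HashSet version above, with the difference that the elements stay in the order in which they were first encountered.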
Hovercraft Full Of Eels
Evgeniy Dorofeev
2

Use a Set:

// Collect everything into one list first
ArrayList<Widget> mergeList = new ArrayList<Widget>();
mergeList.addAll(widgets1);
mergeList.addAll(widgets2);
// Copying into a HashSet drops the duplicates (requires equals and hashCode on Widget)
Set<Widget> set = new HashSet<Widget>(mergeList);
ArrayList<Widget> mergeListWithoutDuplicates = new ArrayList<Widget>();
mergeListWithoutDuplicates.addAll(set);
return mergeListWithoutDuplicates;

Here the Set discards all duplicate values from the merged ArrayList.

buptcoder
  • Thanks @buptcoder (+1) - please see my last question to Yuushi in his/her answer above - I have the same question for you! – IAmYourFaja May 09 '13 at 02:36