88

I have a list of lists and want to find the intersection of all of them, like this:

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };

// expected intersection is List<int>() { 3 };

Is there some way to do this with IEnumerable.Intersect()?

EDIT: I should have been clearer about this: I really do have a list of lists, and I don't know how many there will be. The three lists above were just an example; what I actually have is an IEnumerable<IEnumerable<SomeClass>>.

SOLUTION

Thanks for all the great answers. It turned out there were four options for solving this: List+aggregate (@Marcel Gosselin), List+foreach (@JaredPar, @Gabe Moothart), HashSet+aggregate (@jesperll) and HashSet+foreach (@Tony the Pony). I did some performance testing on these solutions (varying the number of lists, the number of elements in each list and the maximum size of the random numbers).

It turns out that in most situations the HashSet performs better than the List (except with large lists and a small random number range, because of the nature of HashSet, I guess). I couldn't find any real difference between the foreach method and the aggregate method (the foreach method performs slightly better).
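The harness used for these measurements is not shown. A rough sketch of how such a comparison could be set up is below; the sizes, the random seed, and the helper names `ListAggregate` and `HashSetAggregate` are assumptions chosen for illustration, and single-run `Stopwatch` timings like these are only indicative.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

// Assumed test parameters; vary these to explore the trade-offs described above.
const int listCount = 10;
const int elementsPerList = 10_000;
const int maxValue = 1_000;

var random = new Random(42); // fixed seed so runs are repeatable
var lists = Enumerable.Range(0, listCount)
    .Select(_ => Enumerable.Range(0, elementsPerList)
        .Select(__ => random.Next(maxValue))
        .ToList())
    .ToList();

var stopwatch = Stopwatch.StartNew();
var viaList = ListAggregate(lists);
Console.WriteLine($"List + Aggregate:    {stopwatch.Elapsed.TotalMilliseconds} ms");

stopwatch.Restart();
var viaHashSet = HashSetAggregate(lists);
Console.WriteLine($"HashSet + Aggregate: {stopwatch.Elapsed.TotalMilliseconds} ms");

// Both strategies compute the same set of common values, so this prints True.
Console.WriteLine(viaHashSet.SetEquals(viaList));

// List-based variant: each step materializes a new list via Enumerable.Intersect.
static List<int> ListAggregate(List<List<int>> source) =>
    source.Aggregate((previous, next) => previous.Intersect(next).ToList());

// HashSet-based variant: one set, narrowed in place.
static HashSet<int> HashSetAggregate(List<List<int>> source) =>
    source.Skip(1).Aggregate(
        new HashSet<int>(source[0]),
        (set, next) => { set.IntersectWith(next); return set; });
```

For numbers worth trusting, each variant would need warm-up runs and many repetitions; this sketch only shows the shape of the comparison.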

To me, the aggregate method is really appealing (and I'm going with that as the accepted answer), but I wouldn't say it's the most readable solution. Thanks again, all!

Oskar

8 Answers

77

How about:

var intersection = listOfLists
    .Skip(1)
    .Aggregate(
        new HashSet<T>(listOfLists.First()),
        (h, e) => { h.IntersectWith(e); return h; }
    );

That way it's optimized by using the same HashSet throughout and still in a single statement. Just make sure that the listOfLists always contains at least one list.
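For a concrete run, here is a minimal sketch of the same idea with the question's three lists in place of `T`. The `IntersectAll` helper name and the guard for an empty outer list are additions for illustration, not part of the statement above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listOfLists = new List<List<int>>
{
    new List<int> { 1, 2, 3 },
    new List<int> { 2, 3, 4 },
    new List<int> { 3, 4, 5 },
};

var intersection = IntersectAll(listOfLists);
Console.WriteLine(string.Join(", ", intersection)); // prints "3"

// Hypothetical helper: seeds one HashSet from the first list and narrows it in place.
static HashSet<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    // Materialize once so a lazy outer sequence is not enumerated twice.
    var materialized = lists.ToList();
    if (materialized.Count == 0)
    {
        return new HashSet<T>(); // assumption: empty input yields an empty result
    }

    return materialized
        .Skip(1)
        .Aggregate(
            new HashSet<T>(materialized[0]),
            (set, next) => { set.IntersectWith(next); return set; });
}
```

Returning the `HashSet<T>` itself avoids an extra copy; callers that need a list can append `.ToList()`.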

Kevin Babcock
Jesper Larsen-Ledet
  • Wow, there's no way I could have thought of this solution myself. Once you have the solution, it seems obvious... hmm, no, I will leave a comment just to be sure my coworkers don't think that I smoke too much weed :) – Samuel Jan 19 '16 at 22:05
  • functional paradigm wins ) – anatol Mar 15 '18 at 04:53
  • why is there a need for the Skip? Asking because I don't know – Issa Fram Apr 30 '18 at 15:02
  • Skip is there because the first element is used for the initial populate of the hashset. You must do this, because otherwise it's a bunch of intersections with an empty set. – SirPentor May 18 '18 at 01:16
  • I understand the solution. I guess e stands for enumerator? Can I also ask what h stands for? I guess h stands for HashSet? – Quan Feb 04 '20 at 19:14
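The point about `Skip` in the comments can be checked directly. The sketch below (variable names are illustrative) seeds the accumulator once with an empty set and once with the first list. With an empty seed the result stays empty, because intersecting an empty set with anything yields an empty set.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var listOfLists = new List<List<int>>
{
    new List<int> { 1, 2, 3 },
    new List<int> { 2, 3, 4 },
    new List<int> { 3, 4, 5 },
};

// Seeded with an empty set: every IntersectWith call leaves it empty.
var emptySeed = listOfLists.Aggregate(
    new HashSet<int>(),
    (h, e) => { h.IntersectWith(e); return h; });

// Seeded with the first list, then folded over the remaining lists.
var firstSeed = listOfLists.Skip(1).Aggregate(
    new HashSet<int>(listOfLists.First()),
    (h, e) => { h.IntersectWith(e); return h; });

Console.WriteLine(emptySeed.Count); // prints 0
Console.WriteLine(firstSeed.Count); // prints 1 (only the value 3 survives)
```

Dropping `Skip(1)` while keeping the first-list seed would still give the right answer, since intersecting the first list with itself changes nothing; it would just do one redundant pass.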
64

You can indeed use Intersect twice. However, I believe this will be more efficient:

HashSet<int> hashSet = new HashSet<int>(list1);
hashSet.IntersectWith(list2);
hashSet.IntersectWith(list3);
List<int> intersection = hashSet.ToList();

Not an issue with small sets of course, but if you have a lot of large sets it could be significant.

Basically Enumerable.Intersect needs to create a set on each call - if you know that you're going to be doing more set operations, you might as well keep that set around.

As ever, keep a close eye on performance vs readability - the method chaining of calling Intersect twice is very appealing.

EDIT: For the updated question:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = null;
    foreach (var list in lists)
    {
        if (hashSet == null)
        {
            hashSet = new HashSet<T>(list);
        }
        else
        {
            hashSet.IntersectWith(list);
        }
    }
    return hashSet == null ? new List<T>() : hashSet.ToList();
}

Or if you know it won't be empty, and that Skip will be relatively cheap:

public List<T> IntersectAll<T>(IEnumerable<IEnumerable<T>> lists)
{
    HashSet<T> hashSet = new HashSet<T>(lists.First());
    foreach (var list in lists.Skip(1))
    {
        hashSet.IntersectWith(list);
    }
    return hashSet.ToList();
}
Jon Skeet
  • Yeah, the foreach makes sense. Any performance difference with this compared to the Aggregate method in Marcel's answer? – Oskar Nov 04 '09 at 16:31
  • @Oskar: Yes, my answer uses a single hashset instead of creating a new one each time. However, you could still use Aggregate with a set... will edit. – Jon Skeet Nov 04 '09 at 17:32
  • Ick... just tried to work out an Aggregate solution, and it's icky because HashSet.IntersectWith returns void :( – Jon Skeet Nov 04 '09 at 17:37
  • Hi. One question regarding your `IntersectAll()` method (which is handy): is there a simple way to add a selector as a parameter, to compare values (e.g. `Func<T, TKey> selector`), and still use `IntersectWith()`? – tigrou Aug 06 '14 at 13:28
  • @tigrou: Not terribly easily - because you'd still want to return a `List<T>` rather than a `List<TKey>`, right? The best approach would probably be to create an `EqualityComparer<T>` which was implemented by projecting to `TKey`. – Jon Skeet Aug 06 '14 at 13:29
  • @JonSkeet Do you mean something like this: http://stackoverflow.com/questions/5909259/generic-iequalitycomparert-and-gethashcode ? (and passing the `GenericEqualitycomparer` instance to the `HashSet` constructor) – tigrou Aug 06 '14 at 14:44
  • @tigrou: That's not how I'd build it - I'd use a `ProjectionEqualityComparer<T, TKey>` that took a `Func<T, TKey>` and used that for equality and hashing... but yes, that's the general shape. – Jon Skeet Aug 06 '14 at 14:45
  • One question: is returning a `List<T>` done on purpose (to fit the OP's example or to avoid some unexpected behavior)? Would it be a good idea to return an `IEnumerable<T>` instead and yield the results from the HashSet? – tigrou Oct 28 '14 at 13:41
  • @tigrou: I honestly can't remember why I happened to do it that way five years ago... – Jon Skeet Oct 28 '14 at 20:57
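The selector idea discussed in these comments can be sketched as follows. `IntersectAllBy` is a hypothetical name, and the comparer is built with `EqualityComparer<T>.Create`, which assumes .NET 8 or later. On earlier frameworks a small hand-written `IEqualityComparer<T>` class that projects to the key would play the same role.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var lists = new[]
{
    new[] { "apple", "Banana", "cherry" },
    new[] { "APPLE", "banana" },
    new[] { "Apple", "date" },
};

// Compare strings by their upper-cased form (the projected key).
var common = IntersectAllBy(lists, s => s.ToUpperInvariant());
Console.WriteLine(string.Join(", ", common)); // prints "apple"

// Hypothetical helper: intersects sequences of T, treating items as equal when their keys match.
static List<T> IntersectAllBy<T, TKey>(
    IEnumerable<IEnumerable<T>> lists,
    Func<T, TKey> selector)
{
    var keyComparer = EqualityComparer<TKey>.Default;
    // Assumes .NET 8+: EqualityComparer<T>.Create is not available on older frameworks.
    // Equality and hashing both go through the projected key.
    var comparer = EqualityComparer<T>.Create(
        (a, b) => keyComparer.Equals(selector(a), selector(b)),
        item => keyComparer.GetHashCode(selector(item)));

    HashSet<T> hashSet = null;
    foreach (var list in lists)
    {
        if (hashSet == null)
        {
            hashSet = new HashSet<T>(list, comparer);
        }
        else
        {
            hashSet.IntersectWith(list);
        }
    }
    return hashSet == null ? new List<T>() : hashSet.ToList();
}
```

The result still has element type `T`, and the surviving items are the ones taken from the first sequence, because `IntersectWith` only removes elements from the existing set.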
30

Try this, it works but I'd really like to get rid of the .ToList() in the aggregate.

var list1 = new List<int>() { 1, 2, 3 };
var list2 = new List<int>() { 2, 3, 4 };
var list3 = new List<int>() { 3, 4, 5 };
var listOfLists = new List<List<int>>() { list1, list2, list3 };
var intersection = listOfLists.Aggregate((previousList, nextList) => previousList.Intersect(nextList).ToList());

Update:

Following a comment from @pomber, it is possible to get rid of the ToList() inside the Aggregate call and move it outside so that it executes only once. I have not tested whether the previous code is faster than the new one. The change needed is to specify the generic type parameter of the Aggregate method, as in the code below:

var intersection = listOfLists.Aggregate<IEnumerable<int>>(
   (previousList, nextList) => previousList.Intersect(nextList)
   ).ToList();
Marcel Gosselin
  • Thanks, I just tried that out and it works! Haven't used Aggregate() before but I guess it was something like this I was looking for. – Oskar Nov 04 '09 at 16:25
  • As I have specified as a comment on Tony's answer, I believe his solution will perform better. – Marcel Gosselin Nov 04 '09 at 17:08
  • You can get rid of the .ToList() in the aggregate if you use Aggregate<IEnumerable<int>> – pomber Jul 26 '16 at 20:18
  • @pomber, I can't believe your comment has gone 3 years without an upvote. Well today is your day my friend. – Sean May 13 '19 at 14:12
5

You could do the following

var result = list1.Intersect(list2).Intersect(list3).ToList();
JaredPar
5

This is my version of the solution with an extension method that I called IntersectMany.

public static IEnumerable<TResult> IntersectMany<TSource, TResult>(this IEnumerable<TSource> source, Func<TSource, IEnumerable<TResult>> selector)
{
    using (var enumerator = source.GetEnumerator())
    {
        if(!enumerator.MoveNext())
            return new TResult[0];

        var ret = selector(enumerator.Current);

        while (enumerator.MoveNext())
        {
            ret = ret.Intersect(selector(enumerator.Current));
        }

        return ret;
    }
}

So the usage would be something like this:

var intersection = (new[] { list1, list2, list3 }).IntersectMany(l => l).ToList();
gigi
2

This is my one-line solution for a List of Lists (ListOfLists) without using the Intersect function:

var intersect = ListOfLists.SelectMany(x=>x).Distinct().Where(w=> ListOfLists.TrueForAll(t=>t.Contains(w))).ToList();

This should work for .NET 4 (or later).

Sergey
0

After searching the 'net and not really coming up with something I liked (or that worked), I slept on it and came up with this. Mine uses a class (SearchResult) which has an EmployeeId in it and that's the thing I need to be common across lists. I return all records that have an EmployeeId in every list. It's not fancy, but it's simple and easy to understand, just what I like. For small lists (my case) it should perform just fine—and anyone can understand it!

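private List<SearchResult> GetFinalSearchResults(IEnumerable<IEnumerable<SearchResult>> lists)
{
    // Candidates that have appeared in every list seen so far, keyed by EmployeeId.
    // Note: First() throws if lists is empty, and ToDictionary throws on duplicate EmployeeIds.
    Dictionary<int, SearchResult> oldList = lists.First().ToDictionary(x => x.EmployeeId, x => x);
    Dictionary<int, SearchResult> newList = new Dictionary<int, SearchResult>();

    // Iterate as IEnumerable<SearchResult>; casting to List<SearchResult> would throw for other sequence types.
    foreach (IEnumerable<SearchResult> list in lists.Skip(1))
    {
        foreach (SearchResult emp in list)
        {
            if (oldList.ContainsKey(emp.EmployeeId))
            {
                // The indexer tolerates a repeated EmployeeId within one list; Add would throw.
                newList[emp.EmployeeId] = emp;
            }
        }

        oldList = new Dictionary<int, SearchResult>(newList);
        newList.Clear();
    }

    return oldList.Values.ToList();
}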

Here's an example just using a list of ints, not a class (this was my original implementation).

static List<int> FindCommon(List<List<int>> items)
{
    // Distinct() guards against duplicate values, which would make ToDictionary throw.
    Dictionary<int, int> oldList = items[0].Distinct().ToDictionary(x => x, x => x);
    Dictionary<int, int> newList = new Dictionary<int, int>();

    foreach (List<int> list in items.Skip(1))
    {
        foreach (int i in list)
        {
            if (oldList.ContainsKey(i))
            {
                // The indexer tolerates repeated values within one list; Add would throw.
                newList[i] = i;
            }
        }

        oldList = new Dictionary<int, int>(newList);
        newList.Clear();
    }

    return oldList.Values.ToList();
}
birdus
-1

This is a simple solution if your lists are all small. If you have larger lists, it won't perform as well as a HashSet-based approach:

public static IEnumerable<T> IntersectMany<T>(this IEnumerable<IEnumerable<T>> input)
{
    if (!input.Any())
        return new List<T>();

    return input.Aggregate(Enumerable.Intersect);
}
harakim