33

I have a List<Message> whose items have a Guid property and a DateTime property (as well as other properties). I would like to remove every item whose Guid and DateTime both match those of another item, keeping one item per pair. Sometimes two items will share those two properties but differ in their other properties, so I can't just use .Distinct().

List<Message> messages = GetList();
//The list now contains many objects, it is ordered by the DateTime property

messages = from p in messages.Distinct(  what goes here? ); 

This is what I have right now, but it seems like there ought to be a better way

List<Message> messages = GetList();

// No increment clause: i only advances when the next item is not a duplicate.
// Stop at Count - 1 because the last item has nothing after it to compare to.
for (int i = 0; i < messages.Count - 1; )
{
    if (messages[i].id == messages[i + 1].id && messages[i].date == messages[i + 1].date)
    {
        messages.RemoveAt(i + 1);
    }
    else
    {
        i++;
    }
}
Uwe Keim
user1304444
  • 1
    http://stackoverflow.com/questions/489258/linq-distinct-on-a-particular-property – Shyju Aug 04 '12 at 18:39
  • Thanks. I don't know why I couldn't find that when I searched. – user1304444 Aug 04 '12 at 19:34
  • I'm glad Jon's answer worked for you. Just a note of caution: your "currently used method" doesn't compile, and (after fixing the compile errors) it will not work in all cases - depending on the order of your elements, you'd get different **(wrong)** results (after all, you're only comparing adjacent elements with each other). – Adam Aug 04 '12 at 19:59
  • thanks for the heads-up. GetList() returns an ordered List. I've tested different cases, and I get the result I need. – user1304444 Aug 06 '12 at 15:35
  • Possible duplicate of [LINQ's Distinct() on a particular property](https://stackoverflow.com/questions/489258/linqs-distinct-on-a-particular-property) – Jim G. Jul 22 '18 at 15:29
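Adam's caution about comparing only adjacent elements can be illustrated with a small sketch. This is not the asker's code: the tuple-based list and the helper name `RemoveAdjacentDuplicates` are assumptions made for illustration. It shows that the loop only removes duplicates that sit next to each other, which is why it depends on the list being ordered.

```csharp
using System;
using System.Collections.Generic;

public static class AdjacentDedupDemo
{
    // Hypothetical helper: drops an item when it equals the item directly before it.
    public static List<(Guid id, DateTime date)> RemoveAdjacentDuplicates(
        IEnumerable<(Guid id, DateTime date)> items)
    {
        var result = new List<(Guid id, DateTime date)>(items);
        for (int i = 0; i < result.Count - 1; )
        {
            if (result[i].Equals(result[i + 1]))
                result.RemoveAt(i + 1); // stay on i: the new neighbour may match too
            else
                i++;
        }
        return result;
    }

    public static void Main()
    {
        var a = (id: Guid.NewGuid(), date: new DateTime(2012, 8, 4));
        var b = (id: Guid.NewGuid(), date: new DateTime(2012, 8, 5));

        // Duplicates next to each other: one copy of a is removed.
        Console.WriteLine(RemoveAdjacentDuplicates(new[] { a, a, b }).Count); // 2

        // Same items with the duplicates separated: both copies of a survive.
        Console.WriteLine(RemoveAdjacentDuplicates(new[] { a, b, a }).Count); // 3
    }
}
```

If the list is sorted so that equal (id, date) pairs end up adjacent, the adjacent comparison is enough; otherwise a key-based approach is the safer choice.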

5 Answers

84

LINQ to Objects doesn't provide this functionality easily in a built-in way, but MoreLINQ has a handy DistinctBy method:

messages = messages.DistinctBy(m => new { m.id, m.date }).ToList();
Simon
Jon Skeet
  • I'm assuming MoreLINQ is free to use? I don't see that explicitly written anywhere on the page. – user1304444 Aug 04 '12 at 19:32
  • 2
    @user1304444: It's an open source library - see the "Apache License 2.0" link on the left of the page. – Jon Skeet Aug 04 '12 at 19:33
  • 3
    For anyone else viewing this question, the link Shyju mentioned above seems to be a great answer also. http://stackoverflow.com/questions/489258/linq-distinct-on-a-particular-property – user1304444 Aug 04 '12 at 19:43
  • 4
    @user1304444: Yeah, I think it was around the time of writing that answer that I decided to start MoreLINQ :) – Jon Skeet Aug 04 '12 at 19:56
  • @JonSkeet Does this perform an 'either' property is distinct or 'together' (treating both as one property) they are distinct? – Paul Zahra May 06 '15 at 12:40
  • 1
    @PaulZahra: Together. Two items a and b will only be seen as equal if `a.id == b.id && a.date == b.date`. – Jon Skeet May 06 '15 at 13:16
  • Having discovered this answer, I now cannot live without `DistinctBy`! – Andrew Webb Oct 07 '20 at 16:39
  • [Here](https://github.com/morelinq/MoreLINQ/blob/e384ba07f13d7fd15609da5348dbf4523f93c1e1/MoreLinq/DistinctBy.cs#L65) is the source code of the `DistinctBy` method. It is small and copy-pastable. – Theodor Zoulias Nov 08 '20 at 07:44
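For readers on .NET 6 or later, `System.Linq` itself ships an `Enumerable.DistinctBy` method that takes a key selector in the same way, so the call from this answer can be written without an extra package. The sketch below is only an illustration: the `Message` class is a hypothetical stand-in (only `id` and `date` come from the question; `body` is invented), and `Dedupe` is an assumed helper name. It also shows the "together" semantics described in the comments: two messages count as duplicates only when both `id` and `date` match.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's Message type.
public class Message
{
    public Guid id { get; set; }
    public DateTime date { get; set; }
    public string body { get; set; }
}

public static class BuiltInDistinctByDemo
{
    // Uses Enumerable.DistinctBy from System.Linq (.NET 6+).
    // The anonymous type acts as a composite key: both members must be equal.
    public static List<Message> Dedupe(IEnumerable<Message> messages) =>
        messages.DistinctBy(m => new { m.id, m.date }).ToList();

    public static void Main()
    {
        var id = Guid.NewGuid();
        var date = new DateTime(2012, 8, 4);
        var messages = new List<Message>
        {
            new Message { id = id, date = date, body = "first" },
            new Message { id = id, date = date, body = "second" },           // same id and date
            new Message { id = id, date = date.AddDays(1), body = "third" }, // same id, other date
        };

        Console.WriteLine(Dedupe(messages).Count); // 2
    }
}
```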
17

Jon Skeet's DistinctBy is definitely the way to go. However, if you are interested in defining your own extension method, you might like this more concise version:

public static IEnumerable<TSource> DistinctBy<TSource, TKey>
(this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
{
    var known = new HashSet<TKey>();
    return source.Where(element => known.Add(keySelector(element)));
}

which has the same signature:

messages = messages.DistinctBy(x => new { x.id, x.date }).ToList();
Adam
  • 6
    I know this is old, but please note that you have to call `ToList()` or `ToArray()` after calling `DistinctBy()`. If you work directly on the `IEnumerable` and enumerate it multiple times it won't work, since the items are added to the `HashSet` while going through the `IEnumerable` the first time and won't be returned a second time, as shown in this [.NET Fiddle](https://dotnetfiddle.net/5PUJxl). – fknx Jul 18 '17 at 10:00
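The re-enumeration problem described in this comment comes from the `HashSet` being created once, when the method is called, and then shared by every enumeration of the returned sequence. One possible way around it, sketched here under the assumption that you are writing your own helper, is an iterator block, whose body (including the `HashSet` allocation) runs again on each enumeration. The name `DistinctByKey` is made up for this example to avoid clashing with other `DistinctBy` methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DistinctBySketch
{
    // Iterator block: nothing below runs until enumeration starts, and a
    // fresh HashSet is created for every enumeration of the result.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source, Func<TSource, TKey> keySelector)
    {
        var known = new HashSet<TKey>();
        foreach (var element in source)
        {
            // HashSet.Add returns false when the key has already been seen.
            if (known.Add(keySelector(element)))
                yield return element;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var words = new[] { "apple", "avocado", "banana" };
        var firstPerLetter = words.DistinctByKey(w => w[0]);

        Console.WriteLine(firstPerLetter.Count()); // 2
        Console.WriteLine(firstPerLetter.Count()); // 2 again: second pass is unaffected
    }
}
```

With this shape, calling `ToList()` is still useful to avoid repeating the work, but it is no longer required for correctness.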
1

You can check out my PowerfulExtensions library. It is still at a very early stage, but you can already use methods like Distinct, Union, Intersect, and Except on any number of properties.

This is how you use it:

using PowerfulExtensions.Linq;
...
var distinct = myArray.Distinct(x => x.A, x => x.B);
Andrzej Gis
1

Try this:

messages = (from g1 in messages.GroupBy(s => s.id)
            from g2 in g1.GroupBy(s => s.date)
            select g2.First()).ToList();
0

What about this?

var messages = messages
               .GroupBy(m => m.id)
               .GroupBy(m => m.date)
               .Select(m => m.First());
Andrew Church
  • 1
    does not compile... Remember that GroupBy returns an `IGrouping`. – Adam Aug 04 '12 at 18:51
  • This approach is valid if HashSet is not available on the platform you are developing for, such as Silverlight. – Thomas Apr 05 '16 at 22:29
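As the first comment notes, chaining two GroupBy calls like this hands the second call groups rather than messages. One way to make the idea compile, sketched here rather than taken from the answer, is to group once on a composite key and keep the first item of each group. The `Message` class and the `Dedupe` helper are hypothetical; only the `id` and `date` members come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the asker's Message type.
public class Message
{
    public Guid id { get; set; }
    public DateTime date { get; set; }
    public string body { get; set; }
}

public static class GroupByDedupDemo
{
    public static List<Message> Dedupe(IEnumerable<Message> messages) =>
        messages
            .GroupBy(m => new { m.id, m.date }) // one group per (id, date) pair
            .Select(g => g.First())             // keep the first message of each group
            .ToList();

    public static void Main()
    {
        var id = Guid.NewGuid();
        var date = new DateTime(2012, 8, 4);
        var messages = new List<Message>
        {
            new Message { id = id, date = date, body = "first" },
            new Message { id = id, date = date, body = "second" },
            new Message { id = Guid.NewGuid(), date = date, body = "third" },
        };

        var result = Dedupe(messages);
        Console.WriteLine(result.Count);   // 2
        Console.WriteLine(result[0].body); // first
    }
}
```

GroupBy buffers the whole input before yielding groups, so for large sequences a HashSet-based DistinctBy is likely to be the lighter option where it is available.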