I've got a data set like this:

GroupName   GroupValue   MemberName   MemberValue
'Group1'    1            'Member1'    1
'Group1'    1            'Member2'    2
'Group2'    2            'Member3'    3
'Group2'    2            'Member4'    2
'Group3'    2            'Member5'    4
'Group3'    2            'Member6'    1

What I want to select is the rows that have the maximum MemberValue per GroupName, but only for those GroupNames that have the largest GroupValue, and pass them into a delegate function. Like this:

'Group2'    2            'Member3'    3
'Group3'    2            'Member5'    4

So far I've tried this format...

data.Where(maxGroupValue => 
    maxGroupValue.GroupValue == data.Max(groupValue => groupValue.GroupValue))
.Select(FunctionThatTakesData)

...but that just gives me every member of Group2 and Group3. I've tried putting a GroupBy() before the Select(), but that turns the output into an IGrouping<string, DataType> so FunctionThatTakesData() doesn't know what to do with it, and I can't do another Where() to filter out only the maximum MemberValues.

What can I do to get this data set properly filtered and passed into my function?

JAF

2 Answers

You can do that with the following LINQ query.

var results = data.GroupBy(r => r.GroupValue)
    .OrderByDescending(g => g.Key)
    .FirstOrDefault()
    ?.GroupBy(r => r.GroupName)
    .Select(g => g.OrderByDescending(r => r.MemberValue).First());

First, group on GroupValue, order the groups in descending order by Key (which is the GroupValue), and take the first one. That gives you all the rows with the maximum GroupValue. Then group those rows on GroupName and, within each group, order by MemberValue descending and take the First row, which is the row with the maximum MemberValue for that GroupName.

I'm also using the C# 6 null-conditional operator ?. after FirstOrDefault in case data is empty; in that case results is null. If you're not using C# 6, you'll need to handle the empty case up front, and then you can just use First instead.
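To make this easier to try out, here is a minimal sketch that runs the query against the sample rows from the question. The `Row` class, the `TopRowsExample` class, and the `SampleData` and `TopRows` method names are assumptions made up for illustration (the question only calls the element type DataType), and falling back to an empty list when data is empty is one possible choice, not the only one.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the question's DataType.
public class Row
{
    public string GroupName { get; set; }
    public int GroupValue { get; set; }
    public string MemberName { get; set; }
    public int MemberValue { get; set; }
}

public static class TopRowsExample
{
    // The sample rows from the question.
    public static List<Row> SampleData()
    {
        return new List<Row>
        {
            new Row { GroupName = "Group1", GroupValue = 1, MemberName = "Member1", MemberValue = 1 },
            new Row { GroupName = "Group1", GroupValue = 1, MemberName = "Member2", MemberValue = 2 },
            new Row { GroupName = "Group2", GroupValue = 2, MemberName = "Member3", MemberValue = 3 },
            new Row { GroupName = "Group2", GroupValue = 2, MemberName = "Member4", MemberValue = 2 },
            new Row { GroupName = "Group3", GroupValue = 2, MemberName = "Member5", MemberValue = 4 },
            new Row { GroupName = "Group3", GroupValue = 2, MemberName = "Member6", MemberValue = 1 },
        };
    }

    // Row with the max MemberValue per GroupName, restricted to the max GroupValue.
    public static List<Row> TopRows(IEnumerable<Row> data)
    {
        var results = data.GroupBy(r => r.GroupValue)
            .OrderByDescending(g => g.Key)
            .FirstOrDefault()
            ?.GroupBy(r => r.GroupName)
            .Select(g => g.OrderByDescending(r => r.MemberValue).First());

        // results is null when data is empty, so fall back to an empty list.
        return (results ?? Enumerable.Empty<Row>()).ToList();
    }

    public static void Main()
    {
        foreach (var row in TopRows(SampleData()))
        {
            Console.WriteLine(row.GroupName + " " + row.GroupValue + " " + row.MemberName + " " + row.MemberValue);
        }
        // prints:
        // Group2 2 Member3 3
        // Group3 2 Member5 4
    }
}
```

Because the result is an ordinary sequence of rows rather than an IGrouping, it can be passed on with `.Select(FunctionThatTakesData)` as in the question.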

juharr
  • Thank you! Now, suppose each grouping can have multiple members with the same value... should I just use a `Where()` in place of the `First()`? – JAF May 05 '17 at 13:24
  • In that case you'd want `.SelectMany(g => g.GroupBy(r => r.MemberValue).OrderByDescending(sg => sg.Key).First())` to replace the last line. – juharr May 05 '17 at 15:16
  • Genius! Thank you! – JAF May 05 '17 at 17:24
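
The tie-handling variant from the comments can be sketched as a complete program. The `Row` and `TiesExample` names are hypothetical, and `Member7` is an extra row invented here (it is not in the question's data) so that Group3 has two members tied for the largest MemberValue.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical row type standing in for the question's DataType.
public class Row
{
    public string GroupName { get; set; }
    public int GroupValue { get; set; }
    public string MemberName { get; set; }
    public int MemberValue { get; set; }
}

public static class TiesExample
{
    // Question's data shortened, plus an invented Member7 that ties with Member5.
    public static List<Row> SampleData()
    {
        return new List<Row>
        {
            new Row { GroupName = "Group1", GroupValue = 1, MemberName = "Member1", MemberValue = 1 },
            new Row { GroupName = "Group2", GroupValue = 2, MemberName = "Member3", MemberValue = 3 },
            new Row { GroupName = "Group2", GroupValue = 2, MemberName = "Member4", MemberValue = 2 },
            new Row { GroupName = "Group3", GroupValue = 2, MemberName = "Member5", MemberValue = 4 },
            new Row { GroupName = "Group3", GroupValue = 2, MemberName = "Member6", MemberValue = 1 },
            new Row { GroupName = "Group3", GroupValue = 2, MemberName = "Member7", MemberValue = 4 },
        };
    }

    // Keeps every row that ties for the max MemberValue within its GroupName,
    // considering only rows that carry the max GroupValue.
    public static List<Row> TopRowsWithTies(IEnumerable<Row> data)
    {
        var results = data.GroupBy(r => r.GroupValue)
            .OrderByDescending(g => g.Key)
            .FirstOrDefault()
            ?.GroupBy(r => r.GroupName)
            .SelectMany(g => g.GroupBy(r => r.MemberValue)
                .OrderByDescending(sg => sg.Key)
                .First());

        // results is null when data is empty, so fall back to an empty list.
        return (results ?? Enumerable.Empty<Row>()).ToList();
    }

    public static void Main()
    {
        foreach (var row in TopRowsWithTies(SampleData()))
        {
            Console.WriteLine(row.GroupName + " " + row.MemberName + " " + row.MemberValue);
        }
        // prints:
        // Group2 Member3 3
        // Group3 Member5 4
        // Group3 Member7 4
    }
}
```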

So basically, what you want is to divide your data elements into groups with the same value for GroupName. From every group you want to take one element, namely the one with the largest value for property MemberValue.

Whenever you have a sequence of items and you want to divide it into groups based on the value of one or more properties of those items, you use Enumerable.GroupBy.

GroupBy takes your sequence as input plus one extra parameter: a function that selects the property (or properties) of each item that decides which group the item ends up in.

In your case, you want to divide your sequence into groups where all elements in a group have the same GroupName.

var groups = mySequence.GroupBy(element => element.GroupName);

It takes the GroupName property from every element in mySequence and puts the element into the group of elements that share that GroupName value.

Using your example data, you'll have three groups:

  • The group with all elements with GroupName == "Group1". The first two elements of your sequence will be in this group
  • The group with all elements with GroupName == "Group2". The third and fourth element of your sequence will be in this group
  • The group with all elements with GroupName == "Group3". The last two elements of your sequence will be in this group

Each group has a property Key, containing your selection value. This key identifies the group and is guaranteed to be unique within your collection of groups. So you'll have a group with Key == "Group1", a group with Key == "Group2", etc.

Besides the Key, every group is a sequence of the elements in the group (note: the group IS an enumerable sequence; it does not merely HAVE one).
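
As a rough illustration of the Key property and of a group being a sequence in its own right, the sketch below lists the members of each group. The trimmed-down `Row` class, the `GroupByExample` class, and the `DescribeGroups` method are hypothetical names introduced only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down row type: only the properties used here.
public class Row
{
    public string GroupName { get; set; }
    public string MemberName { get; set; }
}

public static class GroupByExample
{
    public static List<Row> SampleData()
    {
        return new List<Row>
        {
            new Row { GroupName = "Group1", MemberName = "Member1" },
            new Row { GroupName = "Group1", MemberName = "Member2" },
            new Row { GroupName = "Group2", MemberName = "Member3" },
            new Row { GroupName = "Group2", MemberName = "Member4" },
            new Row { GroupName = "Group3", MemberName = "Member5" },
            new Row { GroupName = "Group3", MemberName = "Member6" },
        };
    }

    // Builds one line of text per group: "<Key>: <member names>".
    public static List<string> DescribeGroups(IEnumerable<Row> mySequence)
    {
        var groups = mySequence.GroupBy(element => element.GroupName);
        var lines = new List<string>();
        foreach (var grp in groups)
        {
            // grp.Key is the GroupName; grp itself enumerates the rows in the group.
            var members = string.Join(", ", grp.Select(element => element.MemberName));
            lines.Add(grp.Key + ": " + members);
        }
        return lines;
    }

    public static void Main()
    {
        foreach (var line in DescribeGroups(SampleData()))
        {
            Console.WriteLine(line);
        }
        // prints:
        // Group1: Member1, Member2
        // Group2: Member3, Member4
        // Group3: Member5, Member6
    }
}
```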

Your second step would be to take from every group the element in the group with the largest value for MemberValue. For this you would order the elements in the group by descending value for property MemberValue and take the first one.

var myResult = mySequence.GroupBy(element => element.GroupName)
    // intermediate result: groups where all elements have the same GroupName
    .Select(group => group.OrderByDescending(groupElement => groupElement.MemberValue)
        // intermediate result: the group's elements ordered by descending MemberValue
        .First());

Result: from every group, ordered by descending MemberValue, the first element is taken, which is the one with the largest MemberValue.

Ordering the complete group is not very efficient if you only want the element with the largest value for MemberValue. The answer for this can be found here on StackOverflow.
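
One way to avoid the sort, not necessarily the approach the linked answer describes, is to scan each group once with Enumerable.Aggregate and keep the best element seen so far. This is only a sketch: the trimmed-down `Row` class, the `SinglePassExample` class, and the `MaxPerGroup` method are hypothetical names, and on ties it keeps the first element encountered.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down row type: only the properties used here.
public class Row
{
    public string GroupName { get; set; }
    public string MemberName { get; set; }
    public int MemberValue { get; set; }
}

public static class SinglePassExample
{
    public static List<Row> SampleData()
    {
        return new List<Row>
        {
            new Row { GroupName = "Group1", MemberName = "Member1", MemberValue = 1 },
            new Row { GroupName = "Group1", MemberName = "Member2", MemberValue = 2 },
            new Row { GroupName = "Group2", MemberName = "Member3", MemberValue = 3 },
            new Row { GroupName = "Group2", MemberName = "Member4", MemberValue = 2 },
            new Row { GroupName = "Group3", MemberName = "Member5", MemberValue = 4 },
            new Row { GroupName = "Group3", MemberName = "Member6", MemberValue = 1 },
        };
    }

    // For each GroupName, finds the row with the largest MemberValue in one pass.
    public static List<Row> MaxPerGroup(IEnumerable<Row> mySequence)
    {
        return mySequence.GroupBy(element => element.GroupName)
            // A group is never empty, so Aggregate without a seed is safe here.
            .Select(grp => grp.Aggregate((best, next) =>
                next.MemberValue > best.MemberValue ? next : best))
            .ToList();
    }

    public static void Main()
    {
        foreach (var row in MaxPerGroup(SampleData()))
        {
            Console.WriteLine(row.GroupName + " " + row.MemberName + " " + row.MemberValue);
        }
        // prints:
        // Group1 Member2 2
        // Group2 Member3 3
        // Group3 Member5 4
    }
}
```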

Harald Coppoolse
  • That gives you each `GroupName` with the max `MemberValue`. The OP wants only the rows with the max `GroupValue` and then each `MemberName` with the max `MemberValue`. – juharr May 05 '17 at 13:00