Assume we have these records:

NID    CId    PushedAt
120    796    2015-09-04 18:00:53.6012627 +00:00
120    967    2015-09-04 18:00:51.9891748 +00:00
119    669    2015-09-04 17:45:56.8179094 +00:00
119    955    2015-09-04 17:45:55.2078154 +00:00
119    100    2015-09-04 17:45:53.5867187 +00:00
116    384    2015-09-04 17:01:01.5375630 +00:00
116    155    2015-09-04 17:00:59.9284665 +00:00
116    517    2015-09-04 17:00:58.3193725 +00:00
113    109    2015-09-04 16:00:53.5269438 +00:00
113    111    2015-09-04 16:00:51.9168442 +00:00
107    603    2015-09-04 13:45:59.9994496 +00:00

I want to group them by time range, not by an exact time. If I group them by the minute:

var grouped = list.GroupBy(t => new {
    t.PushedAt.Year, 
    t.PushedAt.Month, 
    t.PushedAt.Day, 
    t.PushedAt.Hour, 
    t.PushedAt.Minute
});

then I'll miss groups whose rows have different Minute values but actually belong together. For example, these rows:

116    384    2015-09-04 17:01:01.5375630 +00:00
116    155    2015-09-04 17:00:59.9284665 +00:00
116    517    2015-09-04 17:00:58.3193725 +00:00

will go to these groups:

// group 1:
116    384    2015-09-04 17:01:01.5375630 +00:00
// group 2:
116    155    2015-09-04 17:00:59.9284665 +00:00
116    517    2015-09-04 17:00:58.3193725 +00:00

But what I'm looking for is this group:

// group 1:
116    384    2015-09-04 17:01:01.5375630 +00:00
116    155    2015-09-04 17:00:59.9284665 +00:00
116    517    2015-09-04 17:00:58.3193725 +00:00

That is, these 3 rows should be grouped together. Say, all rows that fall within a 5-minute range should be grouped together. The full output would look something like this:

// group 1:
120    796    2015-09-04 18:00:53.6012627 +00:00
120    967    2015-09-04 18:00:51.9891748 +00:00
// group 2:
119    669    2015-09-04 17:45:56.8179094 +00:00
119    955    2015-09-04 17:45:55.2078154 +00:00
119    100    2015-09-04 17:45:53.5867187 +00:00
// group 3:
116    384    2015-09-04 17:01:01.5375630 +00:00
116    155    2015-09-04 17:00:59.9284665 +00:00
116    517    2015-09-04 17:00:58.3193725 +00:00
// group 4:
113    109    2015-09-04 16:00:53.5269438 +00:00
113    111    2015-09-04 16:00:51.9168442 +00:00
// group 5:
107    603    2015-09-04 13:45:59.9994496 +00:00

Do you have any idea?

Note: The NID field is NOT group-able.

UPDATE:

I know I can solve the problem by iterating over the items (as juharr said in a comment). But I'm looking for a LINQ solution, if there is one. Thanks.

amiry jd
  • I don't understand how 13:45.59/.../96 qualifies for the last group? Is that intentional? – Daniel Hoffmann-Mitscherling Sep 04 '15 at 18:42
  • Seems like you'd first order by the date time then loop through and set the first range as starting with the first entry's time to its time + 5 mins. Then keep pulling until an entry is out of the range and define the next range as starting with that entry's time and repeat. – juharr Sep 04 '15 at 18:48
  • @DanielHoffmann-Mitscherling no it was my mistake. I updated question. Thanks for the point ;) – amiry jd Sep 04 '15 at 18:49
  • @juharr yes that is the way; but I'm looking for a LINQ solution. I'll mention that in question. Thanks to the point. – amiry jd Sep 04 '15 at 18:50
  • @Javad_Amiry I don't think there is a LINQ solution. `GroupBy` needs a key that can be defined for each entry, but your "key" changes based on the set of entries. LINQ is not always the solution. – juharr Sep 04 '15 at 18:53
  • Unfortunately, your specification is ambiguous. For example, are you trying to partition the records into five-minute intervals? Or do you want any record within five minutes of another to be in the same group with that other? If the latter, do you care that following simply that could (depending on the data) result in _all_ records winding up in the same group, even those days apart? You may be able to get LINQ to group the records, but first you have to define a precise specification that allows you to compute a unique, deterministic key for each record based on the group it should be in. – Peter Duniho Sep 04 '15 at 19:12
  • You can create a class for the grouped data and implement a custom IEqualityComparer and pass it to the GroupBy. In the IEqualityComparer compare the delta of the minutes `Math.Abs(a.Minutes - b.Minutes) <= 5` as part of the equality condition. This way minutes in the same *range* will be both put in the same group. – keenthinker Sep 04 '15 at 19:18
  • Do you mean you want to group all items within discrete five minute boundaries (i.e. 17:00 -17:04:59.999, 17:05-17:09:59.999, etc) or group values that occurred within 5 minutes of another value? The latter one could have a group of values that span a time period greater than 5 minutes because of many overlapping records. – Enigmativity Sep 05 '15 at 07:13
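
The loop juharr describes in the comments above could be sketched roughly as follows. This is a minimal sketch, not code from the thread: the names `TimeWindowGrouping` and `GroupByWindow` are hypothetical, `PushedAt` is assumed to be a `DateTime`, and the window is assumed to be half-open (an item exactly five minutes after the first item of a group starts a new group).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimeWindowGrouping
{
    // Hypothetical helper: sorts by timestamp, then starts a new group whenever
    // an item falls outside [first item of current group, first item + window).
    public static List<List<T>> GroupByWindow<T>(
        IEnumerable<T> source, Func<T, DateTime> timeOf, TimeSpan window)
    {
        var groups = new List<List<T>>();
        var windowStart = default(DateTime);

        foreach (var item in source.OrderBy(timeOf))
        {
            var time = timeOf(item);

            // The first item, or an item past the current window, opens a new group
            // whose window starts at that item's own timestamp.
            if (groups.Count == 0 || time - windowStart >= window)
            {
                groups.Add(new List<T>());
                windowStart = time;
            }

            groups[groups.Count - 1].Add(item);
        }

        return groups;
    }
}
```

Called as `TimeWindowGrouping.GroupByWindow(list, t => t.PushedAt, TimeSpan.FromMinutes(5))` on the eleven sample rows from the question, this should produce five groups of 1, 2, 3, 3 and 2 rows, ordered oldest first. It still uses LINQ only for the sort; the grouping itself is a plain loop, because the group boundaries depend on the data rather than on each row alone.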

3 Answers

I think something like this may help:

var list = new List<myClass>();
list.Add(new myClass(120, 796, new DateTime(2015, 09, 04, 18, 00, 53)));
list.Add(new myClass(120, 967, new DateTime(2015, 09, 04, 18, 03, 51)));
list.Add(new myClass(119, 669, new DateTime(2015, 09, 04, 17, 45, 56)));
list.Add(new myClass(119, 955, new DateTime(2015, 09, 04, 17, 42, 55)));
list.Add(new myClass(119, 100, new DateTime(2015, 09, 04, 17, 41, 53)));
list.Add(new myClass(116, 384, new DateTime(2015, 09, 04, 17, 01, 01)));
list.Add(new myClass(116, 155, new DateTime(2015, 09, 04, 17, 00, 59)));
list.Add(new myClass(116, 517, new DateTime(2015, 09, 04, 17, 00, 58)));
list.Add(new myClass(113, 109, new DateTime(2015, 09, 04, 16, 02, 53)));
list.Add(new myClass(113, 111, new DateTime(2015, 09, 04, 16, 00, 51)));
list.Add(new myClass(107, 603, new DateTime(2015, 09, 04, 13, 45, 59)));

// Key = date and hour, plus the index (00-11) of the 5-minute slot within the hour.
var grouped = list.GroupBy(t =>
    t.PushedAt.ToString("yyyyMMddHH") +
    (t.PushedAt.Minute / 5).ToString("00")
);

foreach (var g in grouped) {
    Console.WriteLine(g.Key);
    foreach (var itm in g) {
        Console.WriteLine(String.Format("{0}\t{1}\t{2}", itm.CId, itm.NID, itm.PushedAt));
    }
}

Console result:

201509041800
796     120     9/4/2015 6:00:53 PM
967     120     9/4/2015 6:03:51 PM
201509041709
669     119     9/4/2015 5:45:56 PM
201509041708
955     119     9/4/2015 5:42:55 PM
100     119     9/4/2015 5:41:53 PM
201509041700
384     116     9/4/2015 5:01:01 PM
155     116     9/4/2015 5:00:59 PM
517     116     9/4/2015 5:00:58 PM
201509041600
109     113     9/4/2015 4:02:53 PM
111     113     9/4/2015 4:00:51 PM
201509041309
603     107     9/4/2015 1:45:59 PM
Nildarar

This is really quite easy to do using the .Ticks property.

If you start with the input from your question:

var records = new[]
{
    new { NID = 120, PID = 796, PushedAt = DateTime.Parse("2015-09-04 18:00:53.6012627") },
    new { NID = 120, PID = 967, PushedAt = DateTime.Parse("2015-09-04 18:00:51.9891748") },
    new { NID = 119, PID = 669, PushedAt = DateTime.Parse("2015-09-04 17:45:56.8179094") },
    new { NID = 119, PID = 955, PushedAt = DateTime.Parse("2015-09-04 17:45:55.2078154") },
    new { NID = 119, PID = 100, PushedAt = DateTime.Parse("2015-09-04 17:45:53.5867187") },
    new { NID = 116, PID = 384, PushedAt = DateTime.Parse("2015-09-04 17:01:01.5375630") },
    new { NID = 116, PID = 155, PushedAt = DateTime.Parse("2015-09-04 17:00:59.9284665") },
    new { NID = 116, PID = 517, PushedAt = DateTime.Parse("2015-09-04 17:00:58.3193725") },
    new { NID = 113, PID = 109, PushedAt = DateTime.Parse("2015-09-04 16:00:53.5269438") },
    new { NID = 113, PID = 111, PushedAt = DateTime.Parse("2015-09-04 16:00:51.9168442") },
    new { NID = 107, PID = 603, PushedAt = DateTime.Parse("2015-09-04 13:45:59.9994496") },
};

Then here is how to do the grouping:

var results =
    records
        .GroupBy(x => x.PushedAt.Ticks / TimeSpan.TicksPerMinute / 5);

I get these results:

[image: results]
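
To make the grouping key easier to read, the same integer-division idea can return the start of each five-minute bucket instead of a raw tick quotient. This is a sketch under assumptions: `FixedBuckets` and `BucketStart` are hypothetical names, not part of the answer above.

```csharp
using System;

public static class FixedBuckets
{
    // Hypothetical helper: start of the fixed-size bucket that contains `time`.
    public static DateTime BucketStart(DateTime time, TimeSpan size)
    {
        // Integer division drops the remainder, so every time inside the same
        // bucket maps to the same multiple of size.Ticks.
        long bucketIndex = time.Ticks / size.Ticks;
        return new DateTime(bucketIndex * size.Ticks, time.Kind);
    }
}
```

With this helper the grouping becomes `records.GroupBy(x => FixedBuckets.BucketStart(x.PushedAt, TimeSpan.FromMinutes(5)))`, which should give the same groups as the tick-based key. Note that the buckets are fixed to the clock (17:00, 17:05, and so on), so 17:04:59 and 17:05:01 land in different groups even though they are only two seconds apart. Whether that is acceptable depends on which interpretation of the question is intended.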

Enigmativity

Based on pasty's comment, I figured out that I can pass an IEqualityComparer<> to the GroupBy method. So, I did this:

var grouped = list.GroupBy(t => t.PushedAt, new MyComparer());

With this comparer:

internal class MyComparer : IEqualityComparer<DateTime> {

    private static readonly TimeSpan Span = TimeSpan.FromMinutes(5);

    public bool Equals(DateTime x, DateTime y){
        return (x - y).Duration() <= Span;
    }

    public int GetHashCode(DateTime obj) {
        return obj.Year.GetHashCode() ^ obj.Month.GetHashCode() ^ obj.Day.GetHashCode();
    }

}

This gives me exactly what I'm looking for.

amiry jd
  • No, this is not correct. The rules for IEqualityComparer are that if two items are equal, they have the same hash code. You cannot ensure this here. That this works in some cases does not mean it is actually a useful solution. You are just setting yourself up for a really hard to find bug later down the road when your data is a bit different. You need to, as I mentioned in my comment on the question, go back to the drawing board and figure out an actual _specification_ to describe the behavior you want. At the moment, you don't have a specific enough problem statement. – Peter Duniho Sep 04 '15 at 22:25
  • @PeterDuniho Yep :( you're right. How about returning 0 (or any other consts) from all GetHashCode? `public int GetHashCode(DateTime obj) { return 0; }` – amiry jd Sep 05 '15 at 06:35
  • You can, but then you'll have awful hash table performance. Also, that only fixes the hash code aspect. Another rule for equality is that if A == B and B == C, then A == C. Your comparer here also cannot guarantee that. The bottom line is that until you improve the question such that you've provided a precise specification, you're not going to get anywhere with a solution. Note that you've got two different suggestions interpreting your question one particular way (i.e. partition into 5-minute intervals), but it's not clear that way is what you want. – Peter Duniho Sep 05 '15 at 07:29
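
The two objections in the comments above can be checked with a few concrete timestamps. The sketch below copies the `Equals` and `GetHashCode` logic from `MyComparer` into a hypothetical static class (`ComparerCheck`, with illustrative method names) so the results are easy to inspect; the sample times are made up for the demonstration.

```csharp
using System;

public static class ComparerCheck
{
    private static readonly TimeSpan Span = TimeSpan.FromMinutes(5);

    // Same test as MyComparer.Equals in the answer above.
    public static bool Near(DateTime x, DateTime y)
    {
        return (x - y).Duration() <= Span;
    }

    // Same computation as MyComparer.GetHashCode in the answer above.
    public static int Hash(DateTime obj)
    {
        return obj.Year.GetHashCode() ^ obj.Month.GetHashCode() ^ obj.Day.GetHashCode();
    }

    // Three times spaced 4 minutes apart: a~b and b~c hold, but a~c does not.
    public static bool[] TransitivityDemo()
    {
        var a = new DateTime(2015, 9, 4, 17, 0, 0);
        var b = a.AddMinutes(4);
        var c = a.AddMinutes(8);
        return new[] { Near(a, b), Near(b, c), Near(a, c) };
    }

    // Two "equal" times on either side of midnight get different hash codes.
    public static bool MidnightHashesDiffer()
    {
        var before = new DateTime(2015, 9, 4, 23, 58, 0);
        var after = new DateTime(2015, 9, 5, 0, 1, 0);
        return Near(before, after) && Hash(before) != Hash(after);
    }
}
```

`TransitivityDemo()` returns `true, true, false`, so "equality" here is not transitive, and `MidnightHashesDiffer()` returns `true`, so equal items can hash differently. Which group a row ends up in then depends on the order in which `GroupBy` happens to compare items, which is the hard-to-find bug the comments warn about.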