42

I'd like to partition a list into a list of lists, by specifying the number of elements in each partition.

For instance, suppose I have the list {1, 2, ... 11}, and would like to partition it such that each set has 4 elements, with the last set filling as many elements as it can. The resulting partition would look like {{1..4}, {5..8}, {9..11}}

What would be an elegant way of writing this?

Chris Gerken
David Hodgson

11 Answers

59

Here is an extension method that will do what you want:

public static IEnumerable<List<T>> Partition<T>(this IList<T> source, Int32 size)
{
    for (int i = 0; i < (source.Count / size) + (source.Count % size > 0 ? 1 : 0); i++)
        yield return new List<T>(source.Skip(size * i).Take(size));
}

Edit: Here is a much cleaner version of the function:

public static IEnumerable<List<T>> Partition<T>(this IList<T> source, Int32 size)
{
    for (int i = 0; i < Math.Ceiling(source.Count / (Double)size); i++)
        yield return new List<T>(source.Skip(size * i).Take(size));
}
Andrew Hare
  • 3
    for (int i = 0; i < source.Count; i += size) { /* ... */ } – Roger Lipscombe Dec 30 '09 at 13:03
  • 1
    An unfortunate effect of this method is that the result is not accessible by index. There's a method here that returns a List instead: http://www.vcskicks.com/partition-list.php – George Oct 19 '10 at 13:00
  • 7
    Be aware that in the actual LINQ implementation, `Skip` and `Take` simply loop over the given sequence; there is no check or optimization for a source that implements `IList` and could be accessed by index. Because of that they are `O(m)` (where `m` is the number of elements you want to skip or take), and this `Partition()` extension might not give the expected performance. – tigrou Jun 26 '15 at 08:07
  • @George: (At least now) you can call `.ToList()` on the enumerable to get an indexable list. – mklement0 Nov 14 '18 at 20:58
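
As a rough sketch of what the comments above suggest (stepping by `size` and reading through the `IList<T>` indexer instead of calling `Skip`/`Take`), something like the following could work. The class and method names (`ListPartitionSketch`, `PartitionByIndex`) are made up for illustration and are not part of the answer; the `size < 1` check is also an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class ListPartitionSketch
{
    // Hypothetical helper: partitions by index so each element is read exactly once.
    public static IEnumerable<List<T>> PartitionByIndex<T>(this IList<T> source, int size)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

        // Step through the list one partition at a time.
        for (int start = 0; start < source.Count; start += size)
        {
            // The last partition may be shorter than size.
            int length = Math.Min(size, source.Count - start);
            var chunk = new List<T>(length);

            for (int i = start; i < start + length; i++)
                chunk.Add(source[i]); // direct indexer access, no Skip/Take

            yield return chunk;
        }
    }
}
```

For the list 1 to 11 and a size of 4, this yields three lists: 1-4, 5-8 and 9-11. Because it is an iterator method, the argument checks only run once enumeration starts.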
32

Using LINQ you could cut your groups up in a single line of code like this...

var x = new List<int>() { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };

var groups = x.Select((i, index) => new
{
    i,
    index
}).GroupBy(group => group.index / 4, element => element.i);

You could then iterate over the groups like the following...

foreach (var group in groups)
{
    Console.WriteLine("Group: {0}", group.Key);

    foreach (var item in group)
    {
        Console.WriteLine("\tValue: {0}", item);
    }
}

and you'll get an output that looks like this...

Group: 0
        Value: 1
        Value: 2
        Value: 3
        Value: 4
Group: 1
        Value: 5
        Value: 6
        Value: 7
        Value: 8
Group: 2
        Value: 9
        Value: 10
        Value: 11
Scott Ivey
  • 2
    Doesn't exactly meet the requirements of the question, but +1 for thinking about it a little differently. – RichardOD Sep 08 '09 at 20:56
  • RichardOD - you're right - I updated the example so that the output is a group of ints rather than a group of anon types. – Scott Ivey Sep 08 '09 at 21:14
  • I think you just blew my mind. I'm really curious to know where you learned syntax like that (I really like it). All the LINQ docs I've seen are good -- but they don't cover grouping very well. – Dan Esparza Sep 09 '09 at 20:06
  • Lots of tinkering + reading SO questions. LINQ is definitely one of my favorite new features in 3.5 - and I've learned quite a bit about it just by hanging out here. This overload for GroupBy was something that I hadn't used before - so that was new to me as well :) – Scott Ivey Sep 09 '09 at 20:19
  • 1
    @ScottIvey very nice grouping logic and perfect for some logic I need for busting outbound UDP commands into multiple packets based on an internal List<>.Count(). good one! thanks for sharing. –  Nov 22 '13 at 13:43
11

Something like (untested air code):

IEnumerable<IList<T>> PartitionList<T>(IList<T> list, int maxCount)
{
    List<T> partialList = new List<T>(maxCount);
    foreach(T item in list)
    {
        if (partialList.Count == maxCount)
        {
           yield return partialList;
           partialList = new List<T>(maxCount);
        }
        partialList.Add(item);
    }
    if (partialList.Count > 0) yield return partialList;
}

This returns an enumeration of lists rather than a list of lists, but you can easily wrap the result in a list:

IList<IList<T>> listOfLists = new List<IList<T>>(PartitionList<T>(list, maxCount));
Joe
  • I like this solution but it might cause issues if large number is passed to maxCount (eg : `PartitionList(list, enablePartition ? 500 : int.MaxValue)` A possible improvement is to set list capacity only if source implement ICollection and clamp maxCount to number of elements inside the collection. – tigrou Apr 18 '18 at 09:25
  • @tigrou - I'm not sure I would protect a caller from the consequences of passing an excessively large number, but to be able to handle arbitrarily large partitions you would probably use enumerations rather than lists - e.g. a method `IEnumerable<IEnumerable<T>> PartitionEnumeration<T>(IEnumerable<T> enumeration, int maxCount)` which could be implemented easily without allocating a list. – Joe Apr 18 '18 at 10:54
  • If you return `IEnumerable<IEnumerable<T>>` and rely on an implementation that never allocates anything (e.g. it only yields elements from the source), you will be in trouble if the result is not enumerated sequentially (e.g. partition 4 is enumerated before partition 2, or some partitions are only partially enumerated). I think lists are safer. – tigrou Apr 18 '18 at 16:03
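
One possible reading of the capacity suggestion in the comments above, sketched under assumptions: keep returning lists (so partitions can be enumerated in any order), but preallocate only when the source count is known, clamped to that count. `ClampedPartitionSketch` and `PartitionClamped` are hypothetical names, and the `maxCount < 1` check is an addition that the original answer does not have.

```csharp
using System;
using System.Collections.Generic;

public static class ClampedPartitionSketch
{
    // Hypothetical variant of PartitionList that avoids huge up-front allocations.
    public static IEnumerable<IList<T>> PartitionClamped<T>(IEnumerable<T> source, int maxCount)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (maxCount < 1) throw new ArgumentOutOfRangeException(nameof(maxCount));

        // Preallocate only when the element count is known; otherwise let the list grow.
        int capacity = source is ICollection<T> collection
            ? Math.Min(maxCount, collection.Count)
            : 0;

        var partialList = new List<T>(capacity);
        foreach (T item in source)
        {
            if (partialList.Count == maxCount)
            {
                yield return partialList;
                partialList = new List<T>(capacity);
            }
            partialList.Add(item);
        }

        if (partialList.Count > 0) yield return partialList;
    }
}
```

With this sketch, passing `int.MaxValue` as `maxCount` for an 11-element list allocates room for 11 items rather than attempting a two-billion-element buffer.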
9

This approach avoids grouping, arithmetic, and repeated enumeration of the source.

The method avoids unnecessary calculations, comparisons and allocations. Parameter validation is included.

Here is a working demonstration on fiddle.

public static IEnumerable<IList<T>> Partition<T>(
    this IEnumerable<T> source,
    int size)
{
    if (size < 2)
    {
        throw new ArgumentOutOfRangeException(
            nameof(size),
            size,
            "Must be greater or equal to 2.");  
    }

    T[] partition;
    int count;

    using (var e = source.GetEnumerator())
    {
        if (e.MoveNext())
        {
            partition = new T[size];
            partition[0] = e.Current;
            count = 1;
        }
        else
        {
            yield break;    
        }

        while(e.MoveNext())
        {
            partition[count] = e.Current;
            count++;

            if (count == size)
            {
                yield return partition;
                count = 0;
                partition = new T[size];
            }
        }
    }

    if (count > 0)
    {
        Array.Resize(ref partition, count);
        yield return partition;
    }
}
Jodrell
  • Yours is the most elegant and the least resource-hungry of all the solutions here; I don't know why it doesn't have more upvotes. – Paleta Jan 07 '19 at 15:01
  • I like this, why do `ArgumentOutOfRangeException` for `1`? you can change that to `size < 1`, then add `if (size == 1) yield return partition; else count = 1;` in the `if (e.MoveNext()` block following assignment to `partition[0]`. – Brett Caswell Feb 09 '20 at 17:57
  • 1
    I did consider that but, if you want partitions smaller than 2, it's pretty wasteful to call the function: Just enumerate the list but, I accept it makes the function brittle, or is that informative. – Jodrell Feb 10 '20 at 21:24
  • thanks for relaying your thoughts, I presumed you pondered over it and concluded there was a better implementation for that scenario - and that it would be up to the calling function (scope of responsibility) to make the determination. – Brett Caswell Feb 11 '20 at 11:27
1
var yourList = new List<int> { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
var groupSize = 4;

// here's the actual query that does the grouping...
var query = yourList
    .Select((x, i) => new { x, i })
    .GroupBy(i => i.i / groupSize, x => x.x);

// and here's a quick test to ensure that it worked properly...
foreach (var group in query)
{
    foreach (var item in group)
    {
        Console.Write(item + ",");
    }
    Console.WriteLine();
}

If you need an actual List<List<T>> rather than an IEnumerable<IEnumerable<T>> then change the query as follows:

var query = yourList
    .Select((x, i) => new { x, i })
    .GroupBy(i => i.i / groupSize, x => x.x)
    .Select(g => g.ToList())
    .ToList();
LukeH
1

Or in .Net 2.0 you would do this:

    static void Main(string[] args)
    {
        int[] values = new int[] { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 };
        List<int[]> items = new List<int[]>(SplitArray(values, 4));
    }

    static IEnumerable<T[]> SplitArray<T>(T[] items, int size)
    {
        for (int index = 0; index < items.Length; index += size)
        {
            int remains = Math.Min(size, items.Length-index);
            T[] segment = new T[remains];
            Array.Copy(items, index, segment, 0, remains);
            yield return segment;
        }
    }
csharptest.net
1

Using ArraySegments might be a readable and short solution (converting your list to an array is required):

var list = new List<int>() { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 }; //Added 0 in front on purpose in order to enhance simplicity.
int[] array = list.ToArray();
int step = 4;
List<int[]> listSegments = new List<int[]>();

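for (int i = 0; i < array.Length; i += step)
{
    // Clamp the length so the last segment does not run past the end of the array.
    int count = Math.Min(step, array.Length - i);
    int[] segment = new ArraySegment<int>(array, i, count).ToArray();
    listSegments.Add(segment);
}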
1

I'm not sure why Jochem's answer using ArraySegment was voted down. It could be really useful as long as you are not going to need to extend the segments (cast to IList). For example, imagine that what you are trying to do is pass segments into a TPL DataFlow pipeline for concurrent processing. Passing the segments in as IList instances allows the same code to deal with arrays and lists agnostically.

Of course, that raises the question: why not just derive a ListSegment class that does not require wasting memory by calling ToArray()? The answer is that arrays can actually be processed marginally faster in some situations (slightly faster indexing). But you would have to be doing some fairly hardcore processing to notice much of a difference. More importantly, there is no good way to protect against random insert and remove operations by other code holding a reference to the list.

Calling ToArray() on a million value numeric list takes about 3 milliseconds on my workstation. That's usually not too great a price to pay when you're using it to gain the benefits of more robust thread safety in concurrent operations, without incurring the heavy cost of locking.
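
A minimal sketch of the idea described above, assuming .NET 4.5 or later (where `ArraySegment<T>` implements `IList<T>`): copy the list to an array once, then hand out segments as `IList<T>` views without copying each one. `SegmentSketch` and `Segments` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class SegmentSketch
{
    // Hypothetical helper: yields views over the array; no per-segment copy is made.
    public static IEnumerable<IList<T>> Segments<T>(T[] array, int size)
    {
        if (array == null) throw new ArgumentNullException(nameof(array));
        if (size < 1) throw new ArgumentOutOfRangeException(nameof(size));

        for (int offset = 0; offset < array.Length; offset += size)
        {
            // Clamp the final segment to the remaining elements.
            int count = Math.Min(size, array.Length - offset);
            yield return new ArraySegment<T>(array, offset, count);
        }
    }
}
```

A caller would typically write `SegmentSketch.Segments(list.ToArray(), 4)`. The segments share the one backing array, so a write through a segment is visible in that array, and the size-changing `IList<T>` members (`Add`, `Insert`, `RemoveAt`) throw `NotSupportedException`.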

Ben Stabile
1
public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> list, int size)
{
    while (list.Any()) { yield return list.Take(size); list = list.Skip(size); }
}

and for the special case of String

public static IEnumerable<string> Partition(this string str, int size)
{
    return str.Partition<char>(size).Select(AsString);
}

public static string AsString(this IEnumerable<char> charList)
{
    return new string(charList.ToArray());
}
Scroog1
0

You could use an extension method:

public static IList<HashSet<T>> Partition<T>(this IEnumerable<T> input, Func<T, object> partitionFunc)
{
    // Groups items by the key returned from partitionFunc; one HashSet per distinct key.
    Dictionary<object, HashSet<T>> partitions = new Dictionary<object, HashSet<T>>();

    foreach (T item in input ?? Enumerable.Empty<T>())
    {
        object currentKey = partitionFunc(item);

        if (!partitions.ContainsKey(currentKey))
        {
            partitions[currentKey] = new HashSet<T>();
        }

        partitions[currentKey].Add(item);
    }

    return partitions.Values.ToList();
}

Lee
0

To avoid multiple checks, unnecessary instantiations, and repetitive iterations, you could use the code:

namespace System.Collections.Generic
{
    using Linq;
    using Runtime.CompilerServices;

    public static class EnumerableExtender
    {
        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static bool IsEmpty<T>(this IEnumerable<T> enumerable) => !enumerable?.GetEnumerator()?.MoveNext() ?? true;

        public static IEnumerable<IEnumerable<T>> Partition<T>(this IEnumerable<T> source, int size)
        {
            if (source == null)
                throw new ArgumentNullException(nameof(source));
            if (size < 2)
                throw new ArgumentOutOfRangeException(nameof(size));
            IEnumerable<T> items = source;
            IEnumerable<T> partition;
            while (true)
            {
                partition = items.Take(size);
                if (partition.IsEmpty())
                    yield break;
                else
                    yield return partition;
                items = items.Skip(size);
            }
        }
    }
}