12

When I use GroupBy as part of a LINQ query to EFCore, I get the error System.InvalidOperationException: Client-side GroupBy is not supported.

This is because EF Core 3.1 attempts to evaluate queries on the server-side as much as possible, as opposed to evaluating them on the client-side, and the call cannot be translated to SQL.

So the following statement does not work, and produces the error mentioned above:

var blogs = await context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .GroupBy(t => t.BlobNumber)
    .Select(b => b)
    .ToListAsync();

Now apparently the solution is to use.AsEnumerable() or .ToList() before the call to GroupBy(), as that explicitly tells EF Core that you want to do the grouping client side. There is a discussion about this on GitHub and in the Microsoft docs.

var blogs = context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .AsEnumerable()
    .GroupBy(t => t.BlobNumber)
    .Select(b => b)
    .ToList();

However, this is not asynchronous. How can I make it asynchronous?

If I change AsEnumerable() to AsAsyncEnumerable(), I get an error. If I instead try to change AsEnumerable() to ToListAsync() then the GroupBy() command fails.

I am thinking of wrapping it in a Task.FromResult, but would this actually be asynchronous? Or is the database query still synchronous and only the subsequent grouping is asynchronous?

var blogs = await Task.FromResult(context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .AsEnumerable()
    .GroupBy(t => t.BlobNumber)
    .Select(b => b)
    .ToList());

Or if that doesn't work is there another way?

Nkosi
  • 191,971
  • 29
  • 311
  • 378
Gary
  • 1,275
  • 14
  • 28
  • Even if that wasn’t there the issue seems to remain. (The example above is just one I threw together copying and pasting from elsewhere. My actual code does not use that function.) – Gary Jan 29 '20 at 11:37
  • Ok You should explicitly mention that. I'll remove my comment. then. – Nkosi Jan 29 '20 at 11:38
  • Either way though, I figure if using AsEnumerable solves the problem so easily, then there must be another easy way to do it but asynchronously. I just cannot figure out what it is. – Gary Jan 29 '20 at 11:40
  • I've simplified the example to get rid of that function. – Gary Jan 29 '20 at 11:44
  • Shortly, `AsAsyncEnumerable()` + `System.Linq.Async` package (and `AsQueryable()` for `DbSet`s inside if any) – Ivan Stoev Jan 29 '20 at 12:09
  • @IvanStoev there's no reason, the OP isn't really grouping anything. The query is loading all detail rows and batching them, it doesn't aggregate anything. In fact, shouldn't that `Select(b=>b)` *ungroup* the groups? – Panagiotis Kanavos Jan 29 '20 at 12:10
  • @PanagiotisKanavos `Select(b => b)` does nothing and can be skipped (not essential for the issue). The problem is what to put **before** `GroupBy` and let the code after compile and execute client side. – Ivan Stoev Jan 29 '20 at 12:13
  • @IvanStoev right, it would need SelectMany to flatten. In that case, GroupBy is pointless too. The entire operation is a `ToLookup(t => t.BlobNumber)` – Panagiotis Kanavos Jan 29 '20 at 12:21
  • 1
    @PanagiotisKanavos Again, the concreted `Select` in the sample query is not essential for the question asked, which is *"How to use **GroupBy** in an **asynchronous manner** in EF Core 3.1?"* – Ivan Stoev Jan 29 '20 at 12:25
  • @Gary you can use the System.Linq.Async package and `AsAsyncEnumerable()` to get real client-side async grouping. EF isn't involved here, you aren't *grouping* data in the SQL sense. You're loading individual rows and putting them in buckets by BlobNumber. You can speed this up with parallelization after loading, or by using the async LINQ operators – Panagiotis Kanavos Jan 29 '20 at 12:55

2 Answers2

7

I think the only way you have is just to do it something like this

var blogs = await context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .ToListAsync();

var groupedBlogs = blogs.GroupBy(t => t.BlobNumber).Select(b => b).ToList();

Because GroupBy will be evaluated at client anyway

Nkosi
  • 191,971
  • 29
  • 311
  • 378
Denis Kuliev
  • 121
  • 4
3

This query isn't trying to group data in the SQL/EF Core sense. There are no aggregations involved.

It's loading all detail rows and then batching them into different buckets on the client. EF Core isn't involved in this, this is a purely client-side operation. The equivalent would be :

var blogs = await context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"))
    .ToListAsync();

var blogsByNum = blogs.ToLookup(t => t.BlobNumber);

Speeding up grouping

The batching/grouping/lookup operation is purely CPU bound, so the only way to accelerate it would be to parallelize it, ie use all CPUs to group the data eg :

var blogsByNum = blogs.AsParallel()
                      .ToLookup(t => t.BlobNumber);

ToLookup does more or less that GroupBy().ToList() does - it groups the rows into buckets based on a key

Grouping while loading

A different approach would be to load the results asynchronously and put them into buckets as they arrive. To do that, we need AsAsyncEnumerable(). ToListAsync() returns all the results at once, so it can't be used.

This approach is quite similar to what ToLookup does.


var blogs = await context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"));

var blogsByNum=new Dictionary<string,List<Blog>>();

await foreach(var blog in blogs.AsAsyncEnumerable())
{
    if(blogsByNum.TryGetValue(blog.BlobNumber,out var blogList))
    {
        blogList.Add(blog);
    }
    else
    {
        blogsByNum[blog.BlobNumber=new List<Blog>(100){blog};
    }
}

The query is executed by the call to AsAsyncEnumerable(). The results arrive asynchronously though, so now we can add them to buckets while iterating.

The capacity parameter is used in the list constructor to avoid reallocations of the list's internal buffer.

Using System.LINQ.Async

Things would be a lot easier if we had LINQ operations for IAsyncEnumerable<> itself. This extension namespace provides just that. It's developed by the ReactiveX team. It's available through NuGet and the current major version is 4.0.

With this, we could just write :

var blogs = await context.Blogs
    .Where(blog => blog.Url.Contains("dotnet"));

var blogsByNum=await blogs.AsAsyncEnumerable()   individual rows asynchronously
                          .ToLookupAsync(blog=>blog.BlobNumber);

Or

var blogsByNum=await blogs.AsAsyncEnumerable()   
                          .GroupBy(blog=>blog.BlobNumber)
                          .Select(b=>b)
                          .ToListAsync();
Panagiotis Kanavos
  • 90,087
  • 9
  • 138
  • 171
  • 1
    Why reopening to just post a duplicate answer (`AsAsyncEnumerable()` + `System.LINQ.Async` package)? The first part of this answer doesn't make sense in this context. The question is (and the whole GitHub debate) what is the pattern of forcing async client evaluation in EF Core query. For synchronous LINQ it's simple `AsEnumerable()` and the rest goes through `Enumerable` LINQ. – Ivan Stoev Jan 29 '20 at 13:13
  • Thanks for the information. I considered going with System.LINQ.Async, but then ended up going with Denis' suggestion since it required one less dependency and worked well for my simple use case. Will keep this package in mind for the future use though. – Gary Feb 01 '20 at 22:23