20

I'm updating an EF6.x project to EF Core 3.1. Decided to go back to basics and follow the example of how to set up relationships from scratch again.

According to the official Microsoft documentation, EF Core Relationship Examples, I translated the examples into a console app below:

using Microsoft.EntityFrameworkCore;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

namespace BlogPostsExample
{
    class Program
    {
        async static Task Main(string[] args)
        {
            // SQL Running in a Docker container - update as required
            var conString = "data source=localhost,14330;initial catalog=BlogsDb;persist security info=True;user id=sa;password=<Your super secure SA password>;MultipleActiveResultSets=True;App=EntityFramework;";

            var ctx = new MyContext(conString);

            await ctx.Database.EnsureCreatedAsync();

            var result = await ctx.Posts.GroupBy(p => p.Blog).ToArrayAsync();

        }
    }

    class MyContext : DbContext
    {
        private readonly string _connectionString;

        public MyContext(string connectionString)
        {
            _connectionString = connectionString;
        }
        protected override void OnConfiguring(DbContextOptionsBuilder optionsBuilder)
        {
            base.OnConfiguring(optionsBuilder);
            if (!optionsBuilder.IsConfigured)
            {
                optionsBuilder
                .UseSqlServer(_connectionString);
            }
        }
        public DbSet<Blog> Blogs { get; set; }
        public DbSet<Post> Posts { get; set; }

        protected override void OnModelCreating(ModelBuilder modelBuilder)
        {

            modelBuilder.Entity<Post>()
            .HasOne(p => p.Blog)
            .WithMany(b => b.Posts)
            .HasForeignKey(p => p.BlogId) //Tried with and without these keys defined.
            .HasPrincipalKey(b => b.BlogId);
        }

    }
    public class Blog
    {
        public int BlogId { get; set; }
        public string Url { get; set; }

        public List<Post> Posts { get; set; }
    }

    public class Post
    {
        public int PostId { get; set; }
        public string Title { get; set; }
        public string Content { get; set; }

        public int BlogId { get; set; }
        public Blog Blog { get; set; }
    }
}

There is no data in the DB. EF Core fails to convert

ctx.Posts.GroupBy(p => p.Blog)  

to a store query. This appears to me to be the simplest example of a GroupBy you could try.

When you run this code you get the following exception:

System.InvalidOperationException: 'The LINQ expression 'DbSet<Post>
    .Join(
        outer: DbSet<Blog>, 
        inner: p => EF.Property<Nullable<int>>(p, "BlogId"), 
        outerKeySelector: b => EF.Property<Nullable<int>>(b, "BlogId"), 
        innerKeySelector: (o, i) => new TransparentIdentifier<Post, Blog>(
            Outer = o, 
            Inner = i
        ))
    .GroupBy(
        source: p => p.Inner, 
        keySelector: p => p.Outer)' could not be translated. Either rewrite the query in a form that can be translated, or switch to client evaluation explicitly by inserting a call to either AsEnumerable(), AsAsyncEnumerable(), ToList(), or ToListAsync(). See https://go.microsoft.com/fwlink/?linkid=2101038 for more information.'

The only way to get this to work is to add something like AsEnumerable() before the GroupBy.

This is clearly not great from a performance point of view, it turns the group by operation into a client side operation where you really want to be doing grouping on the server side.

Have I missed something glaringly obvious? I struggle to believe that EF Core can't do the simplest group by that EF Framework has been doing since day 1. This seems like a fundamental requirement of any data driven app? (Or any app with a modicum of data!)

Update: enter image description here

Adding a property, such as the Primary Key for the Blog in question makes no difference.

Update 2:

If you follow this JetBrains article, you can do this:

var ctx = new EntertainmentDbContext(conString);
await ctx.Database.EnsureCreatedAsync();

var dataTask = ctx
                .Ratings
                .GroupBy(x => x.Source)
                .Select(x => new {Source = x.Key, Count = x.Count()})
                .OrderByDescending(x => x.Count)
                .ToListAsync();

var data = await dataTask;

But NOT this:

var ctx = new EntertainmentDbContext(conString);
await ctx.Database.EnsureCreatedAsync();

var dataTask = ctx
                .Ratings
                .GroupBy(x => x.Source)
                // .Select(x => new {Source = x.Key, Count = x.Count()})
                // .OrderByDescending(x => x.Count)
                .ToListAsync();

var data = await dataTask;

It only works with with an aggregating function, eg Count as above.

Something similar in SQL works

SELECT COUNT(R.Id), R.Source
FROM 
    [EntertainmentDb].[dbo].[Ratings] R
GROUP BY R.Source

But, removing the aggregating function, COUNT does not, you receive messages similar to:

Column 'EntertainmentDb.dbo.Ratings.Id' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.

So it looks like I am trying to ask EF Core a question that I cannot ask in TSQL

Ian Robertson
  • 1,841
  • 1
  • 21
  • 34
  • 4
    Have you tried ctx.Posts.GroupBy(p => p.Blog.BlogId)? p.Blog "references" an entity (table in de DB), whereas p.Blog.BlogId references a property (column in the DB). – Pepelui360 Jan 20 '20 at 10:57
  • I did try that on my real application, and I just tried it on this super simple example, and you get System.InvalidOperationException: 'Client side GroupBy is not supported.' – Ian Robertson Jan 20 '20 at 11:08
  • During the life of my app I started with EF Framework mapping results of stored procedures to Entities as I could easily write the query in TSQL, but put the effort into converting it all to LINQ with EF, now I feel like I might have to go back to stored procedures - assuming that still works with EF Core! – Ian Robertson Jan 20 '20 at 11:11
  • As @Pepelui360 said, the id of the entity make more sense for a groupBy than the references of the objects coming from a dbContext. – Mackhdo Jan 20 '20 at 11:12
  • 3
    See https://docs.microsoft.com/en-us/ef/core/querying/complex-query-operators#groupby for some explanation of why such `GroupBy` queries are not supported. SQL has no such query equivalent, hence the grouping has to be performed client side, and they want you to realize that and do it explicitly (with `AsEnumerable()` or similar). Here is the GitHub issue/discussion https://github.com/dotnet/efcore/issues/17068 if you want to join/vote. – Ivan Stoev Jan 20 '20 at 11:42
  • @IvanStoev - TSQL and SQL Server do support GroupBy. As I said in my question this worked in 6.x with no issues translating this to a store query. This is an issue with EF Core. One thing I have never tried is using an aggregate inside the GroupBy in LINQ - I have never needed it, but could try I guess... – Ian Robertson Jan 20 '20 at 11:47
  • I tried this var query = from p in ctx.Posts group p by p.Blog.BlogId into g select g.Key; var result1 = await query.ToArrayAsync(); this worked! The equivalent in Chained LINQ await ctx.Posts.GroupBy(p => new { p.Blog.BlogId }).ToArrayAsync(); DOESNT work!? – Ian Robertson Jan 20 '20 at 11:59
  • 2
    I think the correct equivalent chained LINQ would be ctx.Posts.GroupBy(p => new { p.Blog.BlogId }).Select(g=>g.Key).ToArrayAsync(). Since version 3 in EF Core you must use the select method. See https://docs.microsoft.com/en-us/ef/core/what-is-new/ef-core-3.0/breaking-changes#linq-queries-are-no-longer-evaluated-on-the-client – Pepelui360 Jan 20 '20 at 12:28
  • 2
    @IanRobertson SQL `GROUP BY` supports `SELECT`-ing only key and aggregates. While LINQ `GroupBy` allows selecting keys and lists of correlated items. This ha no SQL equivalent and is the reasoning EF Core 3.0/3.1 designers are not willing to support it. EF6 is different framework, so what it supports and what doesn't is irrelevant. All we can do is to join the discussion/voting and somehow convince EF Core team to change their current decision. – Ivan Stoev Jan 20 '20 at 12:45
  • 2
    I just tried running this sample with EF Core 5.0 as-is and get the same/similar exceptions – Ian Robertson Dec 04 '20 at 16:32
  • @IanRobertson the very logic of this query is wrong. SQL doesn't group by table names, it groups by specific fields. In SQL, the table isn't an entity, it's the *relation* between the values stored in the fields. `GroupBy(p => p.Blog) ` is meaningless. It's like trying to write `GROUP BY tbBlog` in SQL – Panagiotis Kanavos Dec 05 '20 at 17:44
  • @IanRobertson if EF 5 supported this - bad for the designers. That was a very, very, VERY bad idea, that used "magic" to pick the primary key for grouping, even if some *other* field would be more appropriate. What if only the blog title is included in the final results? Wouldn't you want to group by `Title` in that case? Magic and opaque library decisions are bad – Panagiotis Kanavos Dec 05 '20 at 17:47
  • 1
    @IanRobertson besides, this `ctx.Posts.GroupBy(p => p.Blog).ToArrayAsync()` is meaningless in SQL. This groups nothing, it *has* to return all `Post` rows and shape them in two levels by `Blog`. I'm pretty sure this never worked in EF 5/6 either, as it requires client-side evaluation, something that never worked on EF by design. – Panagiotis Kanavos Dec 05 '20 at 17:54
  • @PanagiotisKanavos EF6 allowed a query like this. It just groups entities by another entity. EF6 actually did client-side grouping to implement this type of grouping. It just didn't allow client-side evaluation of user code. – Gert Arnold Dec 07 '20 at 11:32
  • @GertArnold I wonder why this unfortunate feature was added. On the one hand, EF wasn't supposed to use client-side evaluation at all. Except for this one case, which obviously made people think it did something it wasn't supposed to do. PS: I was in the MVP summit where they announced EF 1.0 and dev MVPs almost tore the dev team to pieces. Next year, it was the SQL MVPs that picked up the pitchforks and joined the devs. – Panagiotis Kanavos Dec 07 '20 at 11:39
  • @PanagiotisKanavos Sure, EF has an "interesting" history. I guess there's no other option than client-side grouping if GroupBy is to be supported for other cases than only aggregates, because SQL can't return multi-level result sets. – Gert Arnold Dec 07 '20 at 12:39

2 Answers2

-1

I think this may be an inappropriate use of GroupBy since you're not actually grouping by something new or aggregating data, instead use the existing relationship and select the Blogs and include posts.

Note: Untested Code

var blogs = ctx.Blogs.Include(x => x.Posts);
// Optional filters.
var blogsWithPosts = ctx.Blogs
    .Include(x => x.Posts)
    .Where(x => x.Posts.Any())  // Only blogs with posts
    .Where(x => x.Posts.Any(y => y.Title == "A Specific Title")) // Only blogs with posts that have the title "A Specific Title"

If you need to include only a subset of a blogs posts, you can do that too.

var blogsAndMathingPosts = ctx.Blogs
    .Where(x => x.Posts.Any(y => y.Title == "A Specific Title")) // Only blogs that have at least one post with "A Specific Title"
    .Select(x => new Blog() {
        BlogId = x.BlogId,
        Url = x.Url,
        Posts = ctx.Posts.Where(y => y.BlogId == x.BlogId && y.Title == "A Specific Title").ToList()
    );
Lacutah
  • 208
  • 7
  • I don't think that you quite understand the question/problem. If you ignore the fact that this relates to EF, and just think about this as LINQ queries, then what I tried should work no problem at all. The real problem comes from translation to a store query. EF is sort of breaking LINQ convention by tying it closer to T-SQL. – Ian Robertson Jan 25 '21 at 07:33
-2

Just do exactly what exception message says! You'll need to change 'var' to explicit as well.

I had this and had same exception message as you got:

var GroupByM2S =
            dbContext.CatL1s
           .GroupBy(x => x.UserId);   

I changed to this. Tested and works fine.

IEnumerable<IGrouping<int, CatL1>> MsGrpByAsEnumerExplicit =              
            (dbContext.CatL1s).AsEnumerable()
           .GroupBy(x => x.UserId);

So basically change 'var' as I have. *The int here, IGrouping<int,...> is data type of your Grouping Key Prop/Col Then surround dbContext.EntityName with pars and then .AsEnumerable().GroupBy(...)

IEnumerable<IGrouping<dataTypeOfGpByKey, EntityName>> GrpByIEnumAsEnumExplicit =
        ( //<--Open Par dbCtx.EntityName
    .Join(
            outer: DbSet<Blog>,
            inner: p => EF.Property<Nullable<int>>(p, "BlogId"),
            outerKeySelector: b => EF.Property<Nullable<int>>(b, "BlogId"),
            innerKeySelector: (o, i) => new TransparentIdentifier<Post, Blog>(
                Outer = o,
                Inner = i
            )).AsEnumerable() //<-- Put here
                    .GroupBy(
                    source: p => p.Inner,
                    keySelector: p => p.Outer)
                    ...

Give that a shot for anyone with same exception message.

FlazzG
  • 25
  • 3
  • You can still use `var` here. The only difference is using `AsEnumerable()`, in other words: switching to client-side evaluation as the exception message suggests. But the question is more on the "why" of the change in `GroupBy` support and therefore, can't be answered. – Gert Arnold Mar 20 '21 at 19:57
  • It would not compile in my code using 'var'. Besides, why not be explicit when your know the type. I agree it is not the answer to more of the why, however it is a valid working answer to the exception message. – FlazzG Mar 20 '21 at 22:04
  • The problem I wanted to solve was to use LINQ and (very important "and") perform server side grouping. With/without the AsEnumberable, the LINQ on the face of it, should have the same behaviour. To me "AsEnumerable" doesn't imply - "lets do this grouping client side". For performance reasons developers would want to run grouping operations server side. But I would have liked to achieve it using expressive LINQ. – Ian Robertson Mar 21 '21 at 07:35
  • Of course you can use `var` in these isolated statements. As for "why not be explicit", you seem to have missed the reason for introducing the `var` keyword. – Gert Arnold Mar 22 '21 at 07:48