
I recently updated an app with LINQ to SQL and SQL Server CE 3.5 to Entity Framework 4.1 Code First and SQL Server CE 4.0, and it's now running noticeably slower. I did some before vs. after stopwatch testing, and most major operations of my app appear to be running about 40% slower on average.

I'm using all default strategies and configurations for EF Code First except for disabling cascading deletes.

When I originally posted this question, I was focused on one query that seemed to be taking particularly long, but I've since realized that it was only particularly slow on first run (see the comment thread below).

What I now think I'm seeing is that most queries are running slower--not dramatically slower, but slow enough to add up quickly, since most operations the app performs involve several queries.

This app has a very small database. The SQL CE (.sdf) file is only 458 KB, and the largest table has less than 250 records.

Here's an example POCO class:

public class Target
{
    public int Id { get; set; }
    public int TrialDefinitionId { get; set; }
    public int Number { get; set; }
    public int X { get; set; }
    public int Y { get; set; }
    public string Phase { get; set; }
    public virtual TrialDefinition TrialDefinition { get; set; }
}

All my classes follow this basic pattern (simple types plus virtual properties to obtain objects linked by foreign keys). I have one class that uses an ICollection to obtain the "many" side of a one-to-many relationship.

Final note: I'm using a repository pattern as a mediator, and each usage of a repository is placed in a using block. For "get" operations, this results in entities becoming detached once I've obtained the data I need from the database.
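To make this concrete, here is a simplified sketch of that pattern (the context and repository names are illustrative, not my actual code):

```csharp
using System;
using System.Collections.Generic;
using System.Data.Entity;
using System.Linq;

// "MyAppContext" is an illustrative DbContext subclass.
public class MyAppContext : DbContext
{
    public DbSet<Target> Targets { get; set; }
}

// Each repository owns its own context and disposes it when done.
public class TargetRepository : IDisposable
{
    private readonly MyAppContext _context = new MyAppContext();

    public List<Target> GetTargetsByPhase(string phase)
    {
        return _context.Targets.Where(t => t.Phase == phase).ToList();
    }

    public void Dispose() { _context.Dispose(); }
}

// Typical "get" usage: once the using block ends, the context is
// disposed and the returned entities are detached.
// using (var repository = new TargetRepository())
// {
//     var targets = repository.GetTargetsByPhase("Sample");
// }
```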

Does anyone have any specific strategies for improving the performance of my EF Code First app? Please keep in mind that I haven't yet had a chance to read up on EF in much detail. I'm mostly just trying to migrate as quickly and painlessly as possible from LINQ to SQL to EF. The most useful answer for me would be one that recommends specific changes to strategies, configurations, or other settings.

devuxer
  • What makes you think 0.25 seconds is unreasonable? Do you have other performance numbers that would make this one stand out? It looks like the time is completely spent on transferring the data from the DB into a .NET list object. – BrokenGlass Apr 04 '11 at 02:11
  • Is the time taken only on the first access or have you run this code multiple times in a row? I'm trying to determine if most of the time is spent establishing the initial connection and getting the db engine running. Once this is done, does it then start executing faster? – NotMe Apr 04 '11 at 02:12
  • Does it take that long every time? I've found that there's a significant overhead associated with getting the first query "spooled up". I have no idea what internal structures are being initialized on the first call, but they seem significant. After that, it's generally smooth sailing. – Ralph Shillington Apr 04 '11 at 02:13
  • @Chris, it's only that slow on the first run. Thereafter it takes about .06. But it's not the first use of the DataContext, and other queries of even larger tables don't seem to be taking nearly as long. – devuxer Apr 04 '11 at 02:17
  • @Ralph, see my answer to Chris. – devuxer Apr 04 '11 at 02:18
  • 0.25 seconds is only unreasonable if there's a requirement on the performance of the overall app. If that's the case, you need to measure the perf of the entire app before you go barking at the DB. – Esteban Araya Apr 04 '11 at 02:19
  • @Esteban, I started investigating this because my whole app seems to be running much slower after converting my project from LINQ to SQL to Entity Framework. – devuxer Apr 04 '11 at 02:24
  • @DanM: I wouldn't worry about it then. – NotMe Apr 04 '11 at 02:25
  • @Chris, how do you do that with SQL Server CE 4.0? – devuxer Apr 04 '11 at 02:25
  • @DanM: http://stackoverflow.com/questions/225121/profiler-for-sql-ce – NotMe Apr 04 '11 at 02:27
  • Have you tried to load entities as Detached? – Alexander Efimov Apr 04 '11 at 04:01
  • @Alexander, I'm not sure if I'm *loading* the entities as detached. Basically, I'm using a repository pattern, and any time I use a repository, I place it inside a `using` block. When all query operations are complete, the `DbContext` gets disposed, and I'm left with detached data. Is this what you meant? If not, can you please tell me how to load as detached? Thanks. – devuxer Apr 04 '11 at 05:39
  • +1, cannot see why this should have gotten a downvote – Shiraz Bhaiji Apr 10 '11 at 13:16
  • @Shiraz, thanks for the +1...I wasn't sure why I got downvoted either. – devuxer Apr 10 '11 at 17:11

4 Answers


Final note: I'm using a repository pattern as a mediator, and each usage of a repository is placed in a using block. For "get" operations, this results in entities becoming detached once I've obtained the data I need from the database.

Well, this is not required:

  1. Entity Framework's default architecture already implements the repository pattern.
  2. Keeping the ObjectContext alive does not mean you are keeping your connection to the database alive.
  3. A new connection is grabbed from the connection pool, and the operation performed, only when you are loading from or saving changes to the database.

Of course, the using block will slow things down, because each using block does the following:

  1. Initialize the context (requires loading metadata from resources)
  2. Validate a few things
  3. Open a connection to the DB
  4. Perform your tasks
  5. Clean up and close the DB connection

The first two steps are sure to take a lot of time, and you will have multiple objects of the same type living in your app, because each new context creates a new copy of the same object for every query.

Entity Framework already implements an identity map, which means it keeps only one copy of an object per primary key alive throughout the lifetime of the context. That not only saves memory but also performs faster.

I would advise against using a using block for every query or small step; instead, keep your ObjectContext alive for the lifetime of your application. You do not need to implement caching or a repository at all.
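A rough sketch of that advice, assuming a DbContext subclass named MyAppContext (the name and the static holder class are illustrative):

```csharp
using System.Linq;

// One shared context for the application's lifetime (a single-user
// desktop scenario); "MyAppContext" is an assumed DbContext subclass.
public static class Db
{
    public static readonly MyAppContext Context = new MyAppContext();
}

// Queries reuse the shared context, so metadata loading happens once
// and the identity map can return already-materialized objects:
// var targets = Db.Context.Targets.Where(t => t.Phase == "Sample").ToList();
```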

Akash Kava
  • Thank you, Akash. This will take some refactoring, but I definitely want to try it. Sounds like it might solve my speed issue *and* make my code simpler and easier to maintain. – devuxer Apr 12 '11 at 17:17
  • One question, though...why does `DbContext` implement `IDisposable` if you are not supposed to dispose it after each use? – devuxer Apr 12 '11 at 18:53
  • Whether it should be disposed or not is completely situational. For example, in a stateless web service architecture, you should dispose of it after getting results. In your case, you are going to have only one instance per user per machine, so it's not bad to keep objects cached and let the object context live longer. But on a server with web services, it can lead to memory starvation if everyone keeps their context alive longer. – Akash Kava Apr 12 '11 at 18:58
  • Ahh, gotcha. Makes perfect sense. – devuxer Apr 12 '11 at 19:35
  • Okay, the refactoring is complete. You win :) I think it may be even a bit faster than the original. Thanks for your help! – devuxer Apr 12 '11 at 20:17
  • Combining multiple small steps is a fine recommendation, but "lifetime of application" is almost never the correct answer for ObjectContext lifetimes. See http://stackoverflow.com/questions/3653009/entity-framework-and-connection-pooling – Ian Mercer Apr 13 '11 at 07:06
  • @Hightechrider, thanks for your comment, but can you explain why the performance of my app was so much worse when I wrapped all my DbContext operations in `using` blocks? Since switching to a singleton DbContext (as Akash suggested), it runs significantly faster. Also, note that this is a single-user desktop app (a graphical editor). All user actions are saved immediately, and there are no dialogs with Cancel buttons. It's not clear to me how I would architect it so it divides neatly into "multiple small steps". – devuxer Apr 13 '11 at 07:23
  • Suppose your database disconnected while the application was running. How will you recover from that if you have a single object context for the application lifetime? Batching them up is fine, but keeping one around for the lifetime of the application is rarely the correct solution. – Ian Mercer Apr 18 '11 at 16:24
  • So what is wrong if the database was disconnected? The context will pull a new database connection from the connection pool and reconnect; your connection to the database is already closed by the context every time a load or save-changes operation is done. – Akash Kava Apr 18 '11 at 16:27
  • @Akash, "loading metadata each time the context initializes" is not true. It is actually cached: "The Entity Framework automatically caches metadata and other information it needs in the app domain, and ADO.NET pools database connections, so re-creating the context each time is a quick operation." – ashraf Aug 11 '11 at 22:57

Read here and here about the internal workings of Entity Framework. They relate to EFv4 and the ObjectContext API, but EFv4.1 with the DbContext API is just a wrapper around EFv4.

If you feel that your query is slow, try to execute it twice on the same context and then twice on different instances of the context. The first test checks whether the problem is object materialization, because objects are materialized only for the first query; the second checks whether there is a problem with context initialization (this should not happen if you are using standard context creation with a connection string).
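For example, using Stopwatch (the context type name "MyAppContext" is assumed):

```csharp
using System;
using System.Diagnostics;
using System.Linq;

var sw = Stopwatch.StartNew();

// Same query twice on one context: if only the first run is slow,
// the cost is object materialization.
using (var context = new MyAppContext())
{
    var first = context.Targets.ToList();
    Console.WriteLine("Same context, 1st run: {0} ms", sw.ElapsedMilliseconds);

    sw.Restart();
    var second = context.Targets.ToList();
    Console.WriteLine("Same context, 2nd run: {0} ms", sw.ElapsedMilliseconds);
}

// Same query on a fresh context: if this is slow every time, suspect
// context initialization rather than the query itself.
sw.Restart();
using (var context = new MyAppContext())
{
    var third = context.Targets.ToList();
    Console.WriteLine("New context: {0} ms", sw.ElapsedMilliseconds);
}
```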

It would also be interesting to compare execution with a compiled query, but I have a feeling that compiled queries are not part of the DbContext API.
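For reference, a compiled query in the EFv4 ObjectContext API looks roughly like this (the ObjectContext subclass name "MyObjectContext" is assumed):

```csharp
using System;
using System.Data.Objects;
using System.Linq;

// CompiledQuery.Compile requires an ObjectContext-derived first
// parameter, which is why it doesn't plug directly into DbContext.
static readonly Func<MyObjectContext, string, IQueryable<Target>> TargetsByPhase =
    CompiledQuery.Compile((MyObjectContext ctx, string phase) =>
        ctx.Targets.Where(t => t.Phase == phase));

// Usage: var targets = TargetsByPhase(context, "Sample").ToList();
```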

Ladislav Mrnka

When you changed to a Code First approach, did it change the structure of the database?

My guess is yes, and that this is what is causing the change in performance.

I also noticed one thing in your class. You have:

public int TrialDefinitionId { get; set; }

and:

public virtual TrialDefinition TrialDefinition { get; set; }

Are both of these required?

Shiraz Bhaiji
  • My database structure remained exactly the same, actually. As for why I have both `TrialDefinitionId` and `TrialDefinition`, I believe it's common practice. It allows you to access all the fields for the record you are selecting as well as all the fields for any records that are linked via foreign keys. This is exactly how it worked in LINQ to SQL, and I don't think it causes a performance hit, because the data is not actually selected from the database unless used. See http://weblogs.asp.net/scottgu/archive/2010/07/16/code-first-development-with-entity-framework-4.aspx – devuxer Apr 10 '11 at 17:09
  • Do you also have the same indices as previously? – Shiraz Bhaiji Apr 10 '11 at 17:36
  • I never manually created any indices in the original database, so I doubt the Code First-generated database would have different indices, but I'm really not sure. – devuxer Apr 10 '11 at 17:50