2

I'm using Entity 6 with PostgreSQL database (with Npgsql connector). Everything works fine, except for poor performance of this setup. When I try to insert not-so-large amount of objects to database (about 20k records), it takes much more time than it should. As this is my first time using Entity Framework, I was rather confused why inserting 20k records to database on my local machine would take more than 1 minute.

In order to optimize inserts I followed every tip I found. I tried to set AutoDetectChangesEnabled to false, call SaveChanges() every 100 or 1000 records, re-creating database context object and use DbContextTransaction objects (by calling dbContext.Database.BeginTransaction() and commiting transaction at the end of the operation or every 100/1000 records). Nothing improved inserts performance even a little bit.

By loging SQL queries generated by Entity I was finally able to discover that no matter what I do, every object is inserted separatedly and every insert takes 2-4 ms. Without re-creating DB context objects and without transactions, there is just one commit after over 20k inserts. When I use transactions and commit every few records, there are more commits and new transaction creations (same when I re-create DB context object, just with connection being re-established as well). If I use transactions and commit them every a few records, I should notice a performance boost, no? But in the end there is no difference in performance, no matter if I use multiple transactions or not. I know transactions won't improve performance drastically, but they should help at least a little bit. Instead, every insert still takes at least 2ms to execute on my local DB.

Database on local machine is one thing, but performing creation of 20k objects on remote database takes much, much, MUCH longer than one minute - logs indicate that single insert can take even 30ms (!), with transactions being commited and created again every 100 or 1000 records. On the other hand, if I execute a single insert manually (taking it straight from the log), it takes less than 1ms to execute. It seems like Entity takes its sweet time inserting every single object to database, even though it uses transactions to wrap larger amount of inserts together. I don't really get it...

What can I do to speed it up for real?

PookyFan
  • 610
  • 1
  • 6
  • 20
  • 1
    Someone may be able to elaborate further on this, but EF isn't designed for bulk inserts. – IronAces Apr 18 '17 at 14:26
  • EF can't help you out of the box. There are 3rd-party libraries that implement bulk insert on top of EF. – Gert Arnold Apr 18 '17 at 14:41
  • How my ms ping between the client and DB server? Also, are you using connection pooling software? – Joe Love Apr 18 '17 at 15:06
  • @JoeLove ping is 16-32ms. I'm not sure what do you mean by "connection pooling software", I just use EF6 as it is. – PookyFan Apr 18 '17 at 15:09
  • @PookyFan It's not a huge deal in this instance, but pgpool or pgbouncer-- it makes creating a db connection a lot quicker and less expensive on the server. Likely not an issue in this case, however, it CAN affect the round trip time (ping times between all 2-3 nodes), which in multiple inserts is likely the primary factor in speed. – Joe Love Apr 18 '17 at 15:18
  • 1
    What's happening is that your software is running multiple insert statements instead of a single insert with multiple records in it. The difference is that each SQL statement requires an entire "round trip" (minimum 2x ping time) per record. You need to modify the underlying SQL to use the bulk insert syntax. This may be done via plugin or manual manipulation of the SQL. Don't know enough about E 6 to offer a specific fix. Barring that, lowering ping times will make a huge difference (Ie put computers on same switch or box) – Joe Love Apr 18 '17 at 15:22
  • So I guess I really do need third party library for bulk inserts. Any recommendations? Can't find anything that supports PostgreSQL and is free of charge. – PookyFan Apr 18 '17 at 16:09
  • 1
    http://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework – Hans Passant Apr 18 '17 at 17:09

1 Answers1

2

In case anyone's interested, I found a solution to my problem. Entity Framework 6 is unable to provide fast bulk inserts without additional third-party libraries (as mentioned in comments to my question), which are either expensive or not supporting other databases than SQL Server. Entity Framework Core, on the other hand, is another story. It supports fast bulk insertions and can replace EF 6 in project with just a bunch of changes in code: https://docs.microsoft.com/pl-pl/ef/core/index

PookyFan
  • 610
  • 1
  • 6
  • 20