734

I'm looking for the fastest way of inserting into Entity Framework.

I'm asking this because of the scenario where you have an active TransactionScope and the insertion is huge (4,000+ records). It can potentially take more than 10 minutes (the default timeout of transactions), and this will lead to an incomplete transaction.

Selim Yildiz
  • 4,224
  • 6
  • 14
  • 25
Bongo Sharp
  • 8,720
  • 8
  • 22
  • 35
  • 1
    How are you currently doing it? – Dustin Laine May 09 '11 at 17:16
  • Creating the TransactionScope, instantiating the DBContext, Opening the connection, and in a for-each statement doing the insertions and SavingChanges (for each record) , NOTE: TransactionScope and DBContext are in using statements, and i'm closing the connection in a finally block – Bongo Sharp May 09 '11 at 17:31
  • Another answer for reference: http://stackoverflow.com/questions/5798646/entity-framework-algorithm-for-combining-data – Ladislav Mrnka May 09 '11 at 21:35
  • Setting AutoDetectChangesEnabled to false could negatively affect applications that depend heavily on third party controls. Something to have in mind before rushing and breaking other parts of the application. – usefulBee Dec 03 '15 at 18:36
  • Awesome solutions can be seen here for one of the basic web tasks - persisting an array of rows with data.... Amazing framework... – Alexander Pravdin Feb 05 '19 at 14:13
  • 3
    The fastest way of inserting _into a SQL database_ does not involve EF. AFAIK Its BCP then TVP+Merge/insert. – StingyJack Mar 31 '19 at 04:02
  • 2
    For those who will read comments: Most applicable, modern [answer](https://stackoverflow.com/a/54822968/59081) is here. – Tanveer Badar Nov 15 '19 at 05:44
  • Take a look here https://www.ben-morris.com/optimising-bulk-inserts-with-entity-framework-6/ as well – DmitryBoyko Dec 06 '19 at 03:26

32 Answers

1048

To your remark in the comments to your question:

"...SavingChanges (for each record)..."

That's the worst thing you can do! Calling SaveChanges() for each record slows bulk inserts down dramatically. I would try a few simple tests which will very likely improve the performance:

  • Call SaveChanges() once after ALL records.
  • Call SaveChanges() after, for example, 100 records.
  • Call SaveChanges() after, for example, 100 records, then dispose the context and create a new one.
  • Disable change detection

For bulk inserts I am working and experimenting with a pattern like this:

using (TransactionScope scope = new TransactionScope())
{
    MyDbContext context = null;
    try
    {
        context = new MyDbContext();
        context.Configuration.AutoDetectChangesEnabled = false;

        int count = 0;            
        foreach (var entityToInsert in someCollectionOfEntitiesToInsert)
        {
            ++count;
            context = AddToContext(context, entityToInsert, count, 100, true);
        }

        context.SaveChanges();
    }
    finally
    {
        if (context != null)
            context.Dispose();
    }

    scope.Complete();
}

private MyDbContext AddToContext(MyDbContext context,
    Entity entity, int count, int commitCount, bool recreateContext)
{
    context.Set<Entity>().Add(entity);

    if (count % commitCount == 0)
    {
        context.SaveChanges();
        if (recreateContext)
        {
            context.Dispose();
            context = new MyDbContext();
            context.Configuration.AutoDetectChangesEnabled = false;
        }
    }

    return context;
}

I have a test program which inserts 560,000 entities (9 scalar properties, no navigation properties) into the DB. With this code it finishes in less than 3 minutes.

For performance it is important to call SaveChanges() after "many" records ("many" being around 100 or 1000). It also improves performance to dispose the context after SaveChanges and create a new one. This clears the context of all entities; SaveChanges doesn't do that, the entities are still attached to the context in state Unchanged. It is the growing number of attached entities in the context that slows down insertion step by step. So it is helpful to clear the context after some time.

Here are a few measurements for my 560,000 entities:

  • commitCount = 1, recreateContext = false: many hours (That's your current procedure)
  • commitCount = 100, recreateContext = false: more than 20 minutes
  • commitCount = 1000, recreateContext = false: 242 sec
  • commitCount = 10000, recreateContext = false: 202 sec
  • commitCount = 100000, recreateContext = false: 199 sec
  • commitCount = 1000000, recreateContext = false: out of memory exception
  • commitCount = 1, recreateContext = true: more than 10 minutes
  • commitCount = 10, recreateContext = true: 241 sec
  • commitCount = 100, recreateContext = true: 164 sec
  • commitCount = 1000, recreateContext = true: 191 sec

The behaviour in the first test above is that performance is very non-linear and degrades severely over time. ("Many hours" is an estimate; I never finished this test and stopped at 50,000 entities after 20 minutes.) This non-linear behaviour is not as significant in the other tests.

Shimmy Weitzhandler
  • 92,920
  • 119
  • 388
  • 596
Slauma
  • 167,754
  • 56
  • 385
  • 407
  • I was not aware of the penalty in performance of calling the SaveChanges after each insertion, i'll try with the options you provided, Thanks for your comment, i'll let you know my results. – Bongo Sharp May 09 '11 at 21:48
  • I left a loadtest running with this approach, i'll let you know the results of it in the morning, but so far i did noticed a HUGE improvement by doing the SaveChanges every 100 records – Bongo Sharp May 10 '11 at 06:43
  • 93
    @Bongo Sharp: Don't forget to set `AutoDetectChangesEnabled = false;` on the DbContext. It also has a big additional performance effect: http://stackoverflow.com/questions/5943394/why-is-inserting-entities-in-ef-4-1-so-slow-compared-to-objectcontext/5943699#5943699 – Slauma May 10 '11 at 09:03
  • 7
    Yeah, the problem is that I'm using Entity Framework 4, and AutoDetectChangesEnabled is part of 4.1. Nevertheless, I did the performance test and had AMAZING RESULTS: it went from 00:12:00 to 00:00:22. SavingChanges on each entity was causing the overload... THANKS so much for your answer! This is what I was looking for – Bongo Sharp May 10 '11 at 16:24
  • 11
    Thank you for the context.Configuration.AutoDetectChangesEnabled = false; tip, it makes a **huge** difference. – douglaz May 08 '12 at 23:14
  • Hi I m trying to use this code. But it gives me error saying AutoDetectChangesEnabled cannot be found. – Bat_Programmer Jul 05 '12 at 21:26
  • 1
    @dahacker89: Are you using the correct version EF >= 4.1 and `DbContext`, NOT `ObjectContext`? – Slauma Jul 05 '12 at 21:48
  • Thanks for correcting me. Yes I am using ObjectContext. Just converted my model to DBContext. I have another problem now. This line: using (TransactionScope scope = new TransactionScope()) is giving me error saying that TransactionScope used in a using statement must be implicitly convertible to System.IDisposable. Any idea how to solve this??? – Bat_Programmer Jul 05 '12 at 22:04
  • @dahacker89: Check the correct namespace. You must use the `TransactionScope` class in `System.Transactions` which is disposable: http://msdn.microsoft.com/en-us/library/system.transactions.transactionscope.aspx – Slauma Jul 06 '12 at 10:05
  • @Slauma I did include that namespace. It does recognize but keeps giving me that error. And also it says scope.Complete() cannot be found or something. So I took TransactionScope and ran my code. But I still want to test it out though :( – Bat_Programmer Jul 06 '12 at 23:56
  • 3
    @dahacker89: I suggest that you create a separate question for your problem with perhaps more details. I'm not able to figure out here what is wrong. – Slauma Jul 07 '12 at 11:13
  • I set the `IsolationLevel` for my transaction like so: `using (TransactionScope trans = new TransactionScope(TransactionScopeOption.Required, new TransactionOptions { IsolationLevel = IsolationLevel.ReadUncommitted }))` - I'm not sure if it's necessary, but I'm really hoping to avoid any deadlocks or long wait times while the data inserts. – jocull Oct 03 '12 at 05:57
  • Sorry... that should be `TransactionScopeOption.Suppress` - see also: http://stackoverflow.com/questions/2921920/transactionscope-prematurely-completed – jocull Oct 03 '12 at 06:56
  • Just quickly, fantastic information here. Thanks Sluama! What's the use of the TransactionScope though? I've never used one before when dealing with EF and insert/update/deletes. – Trent Apr 02 '13 at 07:38
  • @FizzBuzz: It wraps multiple `SaveChanges` calls into a single database transaction. In most cases it's not needed. – Slauma Apr 02 '13 at 10:38
  • About context recreating: if you create objects in new context and set their references to objects that where created and saved in previous context - reference objects would be added to new context with EntityState.Added and on SaveChanges context would try to save objects second time regardless of primarykeyid>0 – blazkovicz Apr 16 '13 at 07:19
  • 1
    Strewth- simply turning off the AutoDetectChangesEnabled made a MASSIVE difference to my app, which adds 200,000 records to a collection before hitting one call to save changes! I previously stopped the app after 30 mins, now it whips through them all in no time at all! – Shawson Apr 20 '13 at 20:05
  • Wow, what a nice job. It improved my adding with lightningspeed. – Erik Bergstedt Jul 23 '13 at 13:03
  • Can you, please, look through my question: http://stackoverflow.com/questions/18645557/entity-framework-5-poor-performance?noredirect=1#comment27454411_18645557. Thanks! – Vjacheslav Ravdin Sep 05 '13 at 21:47
  • the type Entity in AddToContext() is it some sort of base entity type ? What I am looking for is generic AddToContext() where I can pass different types of entities and use gettype() or something in DBSet. I am using EF6 Db first model.Either ways its a great answer – Neel Feb 11 '14 at 15:53
  • @Neel: It should be possible to use `Entity` directly as a generic parameter of a generic version `AddToContext(...) where Entity : class`. – Slauma Feb 12 '14 at 00:59
  • @Slauma yes, that's how I got it done, generics and some help from Reflection. – Neel Feb 12 '14 at 01:40
  • @Slauma would this code and results be similar if nested/relational entities were involved? Does the 'AutoDetectChangesEnabled' affect the relationships? – Learner Jun 25 '14 at 11:56
  • 1
    @Cristi: `AutoDetectChangesEnabled` affects related entities as well, yes, and probably also with better performance. I don't have concrete numbers though. However, you must be careful with disabling change detection and test the code more thoroughly if relationships are involved: http://stackoverflow.com/questions/23636920/what-are-the-bugs-that-can-be-caused-in-ef-by-disabling-automatic-change-detecti – Slauma Jun 25 '14 at 13:17
  • Are the insertion of all the records put in one huge transaction in the example? Since the using `TransactionScope` is at the outermost? – ca9163d9 Aug 29 '14 at 14:06
  • @dc7a9163d9: Yes, it's one transaction. – Slauma Aug 31 '14 at 11:42
  • @slauma so it may cause problem if the code is used for inserting millions of rows – ca9163d9 Aug 31 '14 at 15:30
  • I wish I had learned about this stuff like two years ago when we were starting on the project I have and not now when it's extremely painful to account for it. – Casey Apr 09 '15 at 14:36
  • This code works pretty well for `AddOrUpdate` as well. It's not great, but it's better than nothing. Long term I think I would want to change it to a MERGE. – Jess Jun 29 '15 at 19:30
  • 1
    Unfortunately, the answer does not specify the negative implications to setting AutoDetectChangesEnabled to false. Setting it to false can break apps, and that is why it is a bad practice; otherwise, it would have been set to false by default. Possible break down could happen if the application uses third party controls. – usefulBee Dec 03 '15 at 18:30
  • Since Bulk insert is an extension method of DbContext and not DbSet Can it update multiple sets at the same time ? – eran otzap Nov 24 '16 at 08:09
  • Has anyone noticed a very recent improvement in performance in the scenario of inserting data via E.F. into Azure SQL? I noticed better performance recently without me having changed anything on the Azure SQL specs. – donquijote Mar 06 '17 at 19:48
  • Will this be edited for the latest versions of .net and ef? I'm curious to find out if the newest version of ef can handle 1000 rows faster – DarthVegan Aug 10 '17 at 20:14
  • I think the underlying connection is opened and closed every time you re-create the context, right? Would it not be better to create a connection manually and pass it to the context? This way, the connection would not be opened and closed per context instantiation. – Niklas Peter Feb 17 '18 at 10:27
  • Does anyone know if it is possible to apply this approach in the Entity Framework Core? – Bruno Henri Mar 02 '18 at 20:52
  • Man this made my code so much faster I have 600,000 items to put into the database and this seems to do it in about 20 minutes where as before it was taking hours. – Anglefroghammer Mar 28 '18 at 08:52
  • Can you explain why recreating the context helps so much? Also, how did you came up with this idea?? – Maverick Meerkat Jun 15 '18 at 08:01
  • In my real world scenario, only saving after every 1000 records, rebuilding the context, and disabling AutoDetectChanges leads to a solid 10x speed improvement. FYI to others, AutoDetectChanges becomes more and more costly the more items that are cached in the context, and thus it degrades to become worse and worse performing the more items you insert. – Greg Jul 03 '18 at 14:11
  • What if error occurs in one record of 100 , does it discards all 100?? – Parshuram Kalvikatte Sep 28 '18 at 02:36
  • I've used this method for an array of ~40K Entities with 90 Fields, saving and renewing after 1000 iterations. Takes about a minute for the batch to complete. Nice work. Still not fast enough, but it may be me – jDave1984 Dec 15 '18 at 21:23
  • great summary, I was batching, killing the context and setting the configuration, but after using Set increased my speed by even 3 times more! – WtFudgE Jan 05 '19 at 16:49
  • I am getting `Network access for Distributed Transaction Manager (MSDTC) has been disabled. Please enable DTC for network access in the security configuration for MSDTC using the Component Services Administrative tool.` exception. – StNickolas Mar 25 '19 at 08:03
  • it would be usefull adding timeout spec to the TransactionScope object extending where occurs: using (TransactionScope scope = new TransactionScope( TransactionScopeOption.Required, new System.TimeSpan(0, 30, 0))) I also having time long lasting issue with this solution... as soon as the attempts are finished I'll let you know how long it took – Ciro Corvino Aug 25 '20 at 21:22
  • @Slauma recreating context causes a big issue. It duplicates the first commit block at the end of the transaction. – Ciro Corvino Aug 27 '20 at 23:19
  • I was able to solve the problem mentioned in the previous comment and finally do a bulk insert test without vpn. I confirm that with 200K data records consisting of 43 fields and each record of about 500 characters (in Sql Server nvarchars corresponding at 1000B per record) the system took 20 minutes (10K rec/min). – Ciro Corvino Aug 28 '20 at 10:44
  • (it continues previuous...) That's quite far from the time reported in the OP, but comparing the entitis sizes as well I think it's a good speed for EF. – Ciro Corvino Aug 28 '20 at 10:45
181

This combination increases speed well enough.

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;
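
A minimal sketch of how the two flags might be applied around a bulk add (MyDbContext, MyEntity and entitiesToInsert are placeholder names, not from this answer); restoring the defaults in a finally block keeps the rest of the application unaffected:

// Requires EF 6 for AddRange; with earlier versions, add the entities one by one.
using (var context = new MyDbContext())
{
    bool autoDetect = context.Configuration.AutoDetectChangesEnabled;
    bool validate = context.Configuration.ValidateOnSaveEnabled;
    try
    {
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;

        context.Set<MyEntity>().AddRange(entitiesToInsert);
        context.SaveChanges();
    }
    finally
    {
        // Restore the original settings so other code paths keep change detection and validation.
        context.Configuration.AutoDetectChangesEnabled = autoDetect;
        context.Configuration.ValidateOnSaveEnabled = validate;
    }
}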
arkhivania
  • 2,116
  • 1
  • 13
  • 8
  • This also made a different of at least an order of magnitude for me, when inserting around 40,000 records in a table. – Richard Ev Nov 28 '12 at 15:56
  • I had to add around 150,000 entities and it took forever. I've tried many approaches, but this one is the simplest and the best! Thanks. – graumanoz Mar 17 '13 at 21:40
  • Saving 23000 records, and this is the easiest way to make the performance acceptable. – deerchao Jul 31 '13 at 10:05
  • 52
    Don't blindly disable ValidateOnSaveEnabled you may be depending on that behavior, and not realize it until it's too late. Then again you may be performing validation elsewhere in code and having EF validate yet again is completely unnecessary. – Jeremy Cook Sep 05 '13 at 16:28
  • 1
    In my test saving 20.000 rows went down from 101 seconds to 88 seconds. Not a lot and what are the implications. – AH. Jan 24 '14 at 13:22
  • 28
    @JeremyCook I think what you're trying to get at is this answer would be much better if it explained the possible implications of changing these properties from their default values (aside from performance improvement). I agree. – pseudocoder Apr 29 '14 at 04:17
  • 2
    This worked for me, although if you're updating records in the context you will need to call DetectChanges() explicitly – hillstuk Sep 01 '14 at 13:13
  • 2
    These can be disabled and then re-enabled with a try-finally block: https://msdn.microsoft.com/en-us/data/jj556205.aspx – yellavon Apr 11 '15 at 17:42
  • Saving about 350,000 records to a new, empty database, this made zero difference. So, you just need to test your own scenario. – CAD bloke Apr 28 '15 at 13:03
94

The fastest way would be to use a bulk insert extension, which I developed.

Note: this is a commercial product, not free of charge.

It uses SqlBulkCopy and a custom data reader to get maximum performance. As a result it is over 20 times faster than using a regular insert or AddRange (EntityFramework.BulkInsert vs EF AddRange).

Usage is extremely simple:

context.BulkInsert(hugeAmountOfEntities);
Jean-François Fabre
  • 126,787
  • 22
  • 103
  • 165
maxlego
  • 4,664
  • 3
  • 28
  • 38
  • My Context does not have a .BulkInsert why would this be/ Using EF 6.1 – Zapnologica Jul 22 '14 at 11:10
  • 1
    A cool wrapper/tool, but the standing drawback of BulkCopy is that it doesn't support upserts directly - and I really don't want to add a staging table to my staging tables. – user2864740 Feb 05 '15 at 21:07
  • 11
    Fast but only does the top layer of a hierarchy. – CAD bloke Apr 29 '15 at 04:02
  • 1
    I tried to use this solution, but the order of columns mapped in bulk update is different to actual, so if I have columns in db as ABCD and corresponding object has data for objects ABCD, the values get mis mapped to something like BCAD... do you know what that might be ? – Muds Jun 04 '15 at 16:11
  • 6
    @Muds I also found a couple of bugs with this library and the MappingAPI library it uses. I had better success with the EFUtilities library from [this other answer](http://stackoverflow.com/a/27564352/1730559). It also has other utilities including for batch updates. – kjbartel Nov 09 '15 at 07:15
  • This is amazing, I went from minutes to a few seconds! I was using AddRange() in EF 6, even with change detection disabled, it was still minutes to do just 1200 records with 110 columns. Great work, saved me writing my own. – Ben Humphrey May 06 '16 at 18:00
  • Do you need to call SaveChanges after running BulkInsert, or does BulkInsert operate directly on the database? For example, I add a "Message" record and call SaveChanges, because the message ID will be used as a foreign key when BulkInsert is called to add user inbox entries that all reference the message. So I do: Add Message, call SaveChanges to commit the new message, call BulkInsert to add inbox entires. And I'm wondering if I need another call to SaveChanges after BulkInsert or if BulkInsert commits directly to the database such that SaveChanges doesn't need called afterwards. – Triynko Jul 12 '16 at 17:56
  • 1
    @Triynko. No need to call SaveChanges. Usage instructions are here https://efbulkinsert.codeplex.com/ – maxlego Jul 18 '16 at 20:39
  • I have used this but it raised an error "The given key was not present in the dictionary" my case is list of entities with relation but I am trying to insert without the relation object – N0rA Aug 15 '16 at 14:19
  • @muds there is a problem in bulk update, The order of the properties in your entity class should be the same as the order of the columns in your database table. – Shobhit Walia Sep 09 '16 at 18:36
  • No way, this improved my codes to insert 400k+ rows (with 5 columns and no index restrictions) from 60+ mins (using the technique from the accepted answer) to 10.2 seconds. You just saved my life mate! – Ken Timothy Oct 14 '16 at 13:47
  • 78
    It is not free. – Amir Saniyan Jun 04 '17 at 14:41
  • 85
    Ads are getting smarter... this is paid product and very expensive for a freelance. Be warned! – JulioQc Jun 13 '17 at 17:30
  • 5
    sadly this is no longer FOSS, at some point the owner moved it to another domain and started charging for the bulk insert version. It's no longer a suitable option :-( – matao Jun 27 '17 at 01:12
  • 45
    USD600 for 1-year support and upgrades? Are you out of your mind? – Camilo Terevinto Sep 18 '17 at 10:37
  • 9
    im not the owner of the product any longer – maxlego Sep 25 '17 at 12:57
  • 2
    There are several forks of BulkInsert that can be found on GitHub from before the codeplex source was bought by ZZZ Projects. [This one](https://github.com/ghost1face/EntityFramework.BulkInsert) can even be found on NuGet. – Rudey Feb 08 '18 at 15:35
  • 2
    Hugely over priced for this functionality. – Paul Johnson Jun 12 '18 at 11:38
  • 3
    It is free, and I released an updated version from the original source code under the original license before ZZZ took over and made it pay-for-play. I got the source code from the original project on the original site (no longer exists, but I saved screenshots and all the licensing info). WayBackMaching FTW! I fixed a serious bug in the MappingAPI with how it handles TPH types, so it can now handle shared properties in a complex TPH scheme. I also added support for async operations (i.e. WriteToServerAsync) and added the BulkInsertAsync methods. I also added support for TPC types. :) – Triynko Aug 03 '18 at 15:30
  • 1
    I also updated the way it manages connections. It was creating it's own SqlConnection (from the DbContext's Database.Connection.ConnectionString), but I changed it so it can reusing the existing connection. The problem with allowing it to open a second connection of its own, is if you're running it in a transaction, it will escalate to a distributed transaction, and the bulk insert takes a whole second to insert a few rows, defeating the whole purpose. So there's that. – Triynko Aug 03 '18 at 15:33
  • 1
    @Triynko, does it has a Git? – xdtTransform Jan 08 '19 at 15:31
  • 4
    If you're using EF Core, this exists. https://github.com/borisdj/EFCore.BulkExtensions – DharmaTurtle Aug 10 '19 at 11:45
  • This one is much better, free and open source: https://entityframework-plus.net/ – jjxtra Dec 19 '20 at 00:19
86

You should look at using the System.Data.SqlClient.SqlBulkCopy for this. Here's the documentation, and of course there are plenty of tutorials online.

Sorry, I know you were looking for a simple answer to get EF to do what you want, but bulk operations are not really what ORMs are meant for.
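
A rough, hedged sketch of the basic pattern (the Customer entity, column names and connection string are invented for illustration): copy the entities into a DataTable and hand it to SqlBulkCopy:

// Requires System.Data and System.Data.SqlClient.
var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Email", typeof(string));

foreach (var customer in customersToInsert)
    table.Rows.Add(customer.Name, customer.Email);

using (var connection = new SqlConnection(connectionString))
{
    connection.Open();
    using (var bulkCopy = new SqlBulkCopy(connection))
    {
        bulkCopy.DestinationTableName = "dbo.Customers";
        bulkCopy.ColumnMappings.Add("Name", "Name");
        bulkCopy.ColumnMappings.Add("Email", "Email");
        bulkCopy.WriteToServer(table);
    }
}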

Adam Rackis
  • 79,454
  • 49
  • 255
  • 377
  • 1
    I have run into the SqlBulkCopy a couple of times while researching this, but it seems to be more oriented to table-to-table inserts, saddly i was not expecting easy solutions, but rather performance tips, like for example managing the State of the connection manually, insted of letting EF do it for you – Bongo Sharp May 09 '11 at 17:27
  • 7
    I've used SqlBulkCopy to insert large amounts of data right from my application. You basically have to create a DataTable, fill it up, then pass *that* to BulkCopy. There are a few gotchas as you're setting up your DataTable (most of which I've forgotten, sadly), but it should work just fine – Adam Rackis May 09 '11 at 17:36
  • 2
    I did the proof of concept, and as promissed, it works really fast, but one of the reasons why i'm using EF is becuase the insertion of relational data is easier, Eg if i'm insert an entity that already contains relational data, it will also insert it, have you ever got into this scenario? Thanks! – Bongo Sharp May 09 '11 at 23:47
  • 2
    Unfortunately inserting a web of objects into a DBMS is not really something BulkCopy will do. That's the benefit of an ORM like EF, the cost being that it won't scale to do hundreds of similar object graphs efficiently. – Adam Rackis May 10 '11 at 01:02
  • 2
    SqlBulkCopy is definitely the way to go if you need raw speed or if you will be re-running this insert. I've inserted several million records with it before and it is extremely fast. That said, unless you will need to re-run this insert, it might be easier to just use EF. – Neil Apr 26 '12 at 16:09
54

I agree with Adam Rackis. SqlBulkCopy is the fastest way of transferring bulk records from one data source to another. I used this to copy 20K records and it took less than 3 seconds. Have a look at the example below.

public static void InsertIntoMembers(DataTable dataTable)
{           
    using (var connection = new SqlConnection(@"data source=;persist security info=True;user id=;password=;initial catalog=;MultipleActiveResultSets=True;App=EntityFramework"))
    {
        SqlTransaction transaction = null;
        connection.Open();
        try
        {
            transaction = connection.BeginTransaction();
            using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction))
            {
                sqlBulkCopy.DestinationTableName = "Members";
                sqlBulkCopy.ColumnMappings.Add("Firstname", "Firstname");
                sqlBulkCopy.ColumnMappings.Add("Lastname", "Lastname");
                sqlBulkCopy.ColumnMappings.Add("DOB", "DOB");
                sqlBulkCopy.ColumnMappings.Add("Gender", "Gender");
                sqlBulkCopy.ColumnMappings.Add("Email", "Email");

                sqlBulkCopy.ColumnMappings.Add("Address1", "Address1");
                sqlBulkCopy.ColumnMappings.Add("Address2", "Address2");
                sqlBulkCopy.ColumnMappings.Add("Address3", "Address3");
                sqlBulkCopy.ColumnMappings.Add("Address4", "Address4");
                sqlBulkCopy.ColumnMappings.Add("Postcode", "Postcode");

                sqlBulkCopy.ColumnMappings.Add("MobileNumber", "MobileNumber");
                sqlBulkCopy.ColumnMappings.Add("TelephoneNumber", "TelephoneNumber");

                sqlBulkCopy.ColumnMappings.Add("Deleted", "Deleted");

                sqlBulkCopy.WriteToServer(dataTable);
            }
            transaction.Commit();
        }
        catch (Exception)
        {
            transaction?.Rollback();
            throw; // don't swallow the error; rolling back silently would hide failed inserts
        }

    }
}
akjoshi
  • 14,589
  • 13
  • 94
  • 116
Irfons
  • 571
  • 4
  • 3
  • 1
    I tried many of the solutions provided in this post and SqlBulkCopy was by far the fastest. Pure EF took 15min, but with a mix of the solution and SqlBulkCopy I was able to get down to 1.5 min! This was with 2 million records! Without any DB index optimization. – jonas Dec 02 '15 at 09:07
  • List is easier than DataTable. There's an `AsDataReader()` extension method, explained in this answer: http://stackoverflow.com/a/36817205/1507899 – RJB Apr 24 '16 at 01:58
  • But its only for top Entity not relational one – Zahid Mustafa Jun 04 '17 at 05:58
  • 1
    @ZahidMustafa: yeah. It's doing BulkInsert, not Bulk-Analysis-And-Relation-Tracing-On-Object-Graphs.. if you want to cover relations, you have to analyze and determine insertion order and then bulk-insert individual levels and maybe update some keys as needed, and you will get speedy custom tailored solution. Or, you can rely on EF to do that, no work on your side, but slower at runtime. – quetzalcoatl Jun 30 '17 at 16:15
43

As it was never mentioned here, I want to recommend EFCore.BulkExtensions:

context.BulkInsert(entitiesList);                 context.BulkInsertAsync(entitiesList);
context.BulkUpdate(entitiesList);                 context.BulkUpdateAsync(entitiesList);
context.BulkDelete(entitiesList);                 context.BulkDeleteAsync(entitiesList);
context.BulkInsertOrUpdate(entitiesList);         context.BulkInsertOrUpdateAsync(entitiesList);         // Upsert
context.BulkInsertOrUpdateOrDelete(entitiesList); context.BulkInsertOrUpdateOrDeleteAsync(entitiesList); // Sync
context.BulkRead(entitiesList);                   context.BulkReadAsync(entitiesList);
Manfred Wippel
  • 1,106
  • 10
  • 13
25

I would recommend this article on how to do bulk inserts using EF.

Entity Framework and slow bulk INSERTs

He explores these areas and compares performance (a rough sketch combining techniques 3 through 6 follows the list):

  1. Default EF (57 minutes to complete adding 30,000 records)
  2. Replacing with ADO.NET code (25 seconds for those same 30,000)
  3. Context bloat - keep the active context graph small by using a new context for each unit of work (the same 30,000 inserts take 33 seconds)
  4. Large lists - turn off AutoDetectChangesEnabled (brings the time down to about 20 seconds)
  5. Batching (down to 16 seconds)
  6. DbSet.AddRange() (performance is in the 12-second range)
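
A rough sketch of techniques 3 through 6 combined (MyDbContext, MyEntity and itemsToInsert are placeholders, not from the article):

// Requires System.Linq and EF 6 (for AddRange); a batch size of 1000 is just an example.
const int batchSize = 1000;
foreach (var batch in itemsToInsert
    .Select((item, index) => new { item, index })
    .GroupBy(x => x.index / batchSize, x => x.item))
{
    using (var context = new MyDbContext())                      // fresh context per unit of work
    {
        context.Configuration.AutoDetectChangesEnabled = false;  // skip change detection for large lists
        context.Set<MyEntity>().AddRange(batch);                 // AddRange instead of per-entity Add
        context.SaveChanges();                                   // one SaveChanges per batch
    }
}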
ShaTin
  • 2,459
  • 16
  • 8
21

I've investigated Slauma's answer (which is awesome, thanks for the idea man), and I've reduced the batch size until I hit optimal speed. Looking at Slauma's results:

  • commitCount = 1, recreateContext = true: more than 10 minutes
  • commitCount = 10, recreateContext = true: 241 sec
  • commitCount = 100, recreateContext = true: 164 sec
  • commitCount = 1000, recreateContext = true: 191 sec

You can see that speed increases when moving from 1 to 10, and from 10 to 100, but from 100 to 1000 insertion speed falls again.

So I've focused on what happens when you reduce the batch size to a value somewhere between 10 and 100. Here are my results (I'm using different row contents, so my times are of a different scale):

Quantity | Batch size | Interval
1000     | 1          | 3
10000    | 1          | 34
100000   | 1          | 368

1000     | 5          | 1
10000    | 5          | 12
100000   | 5          | 133

1000     | 10         | 1
10000    | 10         | 11
100000   | 10         | 101

1000     | 20         | 1
10000    | 20         | 9
100000   | 20         | 92

1000     | 27         | 0
10000    | 27         | 9
100000   | 27         | 92

1000     | 30         | 0
10000    | 30         | 9
100000   | 30         | 92

1000     | 35         | 1
10000    | 35         | 9
100000   | 35         | 94

1000     | 50         | 1
10000    | 50         | 10
100000   | 50         | 106

1000     | 100        | 1
10000    | 100        | 14
100000   | 100        | 141

Based on my results, the actual optimum is around a batch size of 30; it takes less time than either 10 or 100. The problem is, I have no idea why 30 is optimal, nor have I found any logical explanation for it.
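
If you want to find your own optimum rather than trusting any particular number, a simple timing harness along these lines can sweep candidate batch sizes (a hypothetical sketch: MyDbContext, testEntities and the reuse of AddToContext from the accepted answer are assumptions, and the target table should be reset between runs):

foreach (int batchSize in new[] { 5, 10, 20, 30, 50, 100 })
{
    var stopwatch = System.Diagnostics.Stopwatch.StartNew();

    var context = new MyDbContext();
    context.Configuration.AutoDetectChangesEnabled = false;
    int count = 0;
    foreach (var entity in testEntities)
    {
        ++count;
        context = AddToContext(context, entity, count, batchSize, true);
    }
    context.SaveChanges();
    context.Dispose();

    stopwatch.Stop();
    Console.WriteLine("Batch size {0}: {1:F1} s", batchSize, stopwatch.Elapsed.TotalSeconds);
}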

Admir Tuzović
  • 10,517
  • 7
  • 31
  • 69
  • 2
    I found the same with Postrges and pure SQL (it's depends on SQL not on EF) that 30 is optimal. – Kamil Gareev Jan 13 '16 at 09:52
  • My experience is that optimum differs for different connection speed and size of row. For fast connection and small rows optimum can be even >200 rows. – jing Nov 25 '19 at 11:15
20

As other people have said, SqlBulkCopy is the way to do it if you want really good insert performance.

It's a bit cumbersome to implement, but there are libraries that can help you with it. There are a few out there, but this time I will shamelessly plug my own library: https://github.com/MikaelEliasson/EntityFramework.Utilities#batch-insert-entities

The only code you would need is:

 using (var db = new YourDbContext())
 {
     EFBatchOperation.For(db, db.BlogPosts).InsertAll(list);
 }

So how much faster is it? Very hard to say, because it depends on so many factors: computer performance, network, object size, etc. The performance tests I've made suggest 25k entities can be inserted in around 10 s the standard way on localhost IF you optimize your EF configuration as mentioned in the other answers. With EFUtilities that takes about 300 ms. Even more interesting is that I have saved around 3 million entities in under 15 seconds using this method, averaging around 200k entities per second.

The one problem, of course, is if you need to insert related data. This can be done efficiently into SQL Server using the method above, but it requires you to have an ID generation strategy that lets you generate IDs in the app code for the parent so you can set the foreign keys. This can be done using GUIDs or something like HiLo ID generation.
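
A hedged illustration of the GUID approach using the EFBatchOperation call from above (the BlogPost/Comment shape, the Id and BlogPostId properties, and db.Comments are assumptions for illustration):

// Assign client-generated keys so foreign keys can be set without a database round trip.
foreach (var post in posts)
{
    post.Id = Guid.NewGuid();
    foreach (var comment in post.Comments)
    {
        comment.Id = Guid.NewGuid();
        comment.BlogPostId = post.Id;   // FK set in app code
    }
}

using (var db = new YourDbContext())
{
    // Parents first, then children; each level is one bulk insert.
    EFBatchOperation.For(db, db.BlogPosts).InsertAll(posts);
    EFBatchOperation.For(db, db.Comments).InsertAll(posts.SelectMany(p => p.Comments).ToList());
}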

Mikael Eliasson
  • 4,887
  • 20
  • 27
  • Works well. The syntax is a bit verbose though. Think it would be better if `EFBatchOperation` had a constructor which you pass in the `DbContext` to rather than passing to every static method. Generic versions of `InsertAll` and `UpdateAll` which automatically find the collection, similar to `DbContext.Set`, would be good too. – kjbartel Nov 10 '15 at 00:44
  • 1
    Just a quick comment to say thanks! This code allowed me to save 170k records in 1.5 seconds! Completely blows any other method I've tried out of the water. – Tom Glenn Dec 05 '16 at 15:39
  • @Mikael One issue is dealing with identity fields. Do you have a way to enable identity insert yet? – Joe Phillips Apr 17 '17 at 21:57
  • 3
    In contrast to EntityFramework.BulkInsert, this library remained free. +1 – Rudey Oct 12 '19 at 08:18
  • Is it applicable for EF Core? – Bronek Mar 04 '21 at 10:47
15

Disposing the context creates problems if the entities you Add() rely on other preloaded entities (e.g. navigation properties) in the context.

I use a similar concept of keeping my context small to achieve the same performance.

But instead of disposing the context and recreating it, I simply detach the entities that have already been saved by SaveChanges():

public void AddAndSave<TEntity>(List<TEntity> entities) where TEntity : class
{
    const int CommitCount = 1000; //set your own best performance number here
    int currentCount = 0;

    while (currentCount < entities.Count())
    {
        //make sure it doesn't commit more than the entities you have
        int commitCount = CommitCount;
        if ((entities.Count - currentCount) < commitCount)
            commitCount = entities.Count - currentCount;

        //e.g. add entities [ i = 0 to 999, 1000 to 1999, ..., n to n+999 ] to context
        for (int i = currentCount; i < (currentCount + commitCount); i++)
            _context.Entry(entities[i]).State = System.Data.EntityState.Added;
            //same as calling _context.Set<TEntity>().Add(entities[i]);

        //commit entities[n to n+999] to database
        _context.SaveChanges();

        //detach all entities in the context that were committed to database
        //so they won't overload the context
        for (int i = currentCount; i < (currentCount + commitCount); i++)
            _context.Entry(entities[i]).State = System.Data.EntityState.Detached;

        currentCount += commitCount;
    }
}

Wrap it with try-catch and TransactionScope() if you need to; I'm not showing them here to keep the code clean.

RobJan
  • 1,191
  • 1
  • 11
  • 18
Stephen Ho
  • 151
  • 2
  • 4
  • 1
    That slowed down the insert (AddRange) using Entity Framework 6.0. Inserting 20.000 rows went up from about 101 seconds to 118 seconds. – AH. Jan 24 '14 at 13:12
  • 1
    @Stephen Ho: I am also trying to avoid disposing my context. I can understand this is slower than recreating the context, but I want to know if you found this faster enough than not recreating the context but with a commitCount set. – Learner Jul 09 '14 at 09:02
  • @Learner: I think it was faster than recreate the context. But I don't really remember now cos I switched to use SqlBulkCopy at last. – Stephen Ho Jul 16 '14 at 09:41
  • I ended up having to use this technique because, for some weird reason, there was some left over tracking occurring on the second pass through the while loop, even though I had everything wrapped in a using statement and even called Dispose() on the DbContext. When I would add to the context (on the 2nd pass) the context set count would jump to 6 instead of just one. The other items that got arbitrarily added had already been inserted in the first pass through the while loop so the call to SaveChanges would fail on the second pass (for obvious reasons). – Hallmanac Aug 24 '14 at 13:16
12

I know this is a very old question, but one answer here said they developed an extension method to use bulk insert with EF, and when I checked, I discovered that the library costs $599 today (for one developer). Maybe that makes sense for the entire library, but for just the bulk insert it's too much.

Here is a very simple extension method I made. I use it in tandem with database-first (not tested with code-first, but I think it works the same). Replace YourEntities with the name of your context:

public partial class YourEntities : DbContext
{
    public async Task BulkInsertAllAsync<T>(IEnumerable<T> entities)
    {
        using (var conn = new SqlConnection(Database.Connection.ConnectionString))
        {
            await conn.OpenAsync();

            Type t = typeof(T);

            var bulkCopy = new SqlBulkCopy(conn)
            {
                DestinationTableName = GetTableName(t)
            };

            var table = new DataTable();

            var properties = t.GetProperties().Where(p => p.PropertyType.IsValueType || p.PropertyType == typeof(string));

            foreach (var property in properties)
            {
                Type propertyType = property.PropertyType;
                if (propertyType.IsGenericType &&
                    propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                {
                    propertyType = Nullable.GetUnderlyingType(propertyType);
                }

                table.Columns.Add(new DataColumn(property.Name, propertyType));
            }

            foreach (var entity in entities)
            {
                table.Rows.Add(
                    properties.Select(property => property.GetValue(entity, null) ?? DBNull.Value).ToArray());
            }

            bulkCopy.BulkCopyTimeout = 0;
            await bulkCopy.WriteToServerAsync(table);
        }
    }

    public void BulkInsertAll<T>(IEnumerable<T> entities)
    {
        using (var conn = new SqlConnection(Database.Connection.ConnectionString))
        {
            conn.Open();

            Type t = typeof(T);

            var bulkCopy = new SqlBulkCopy(conn)
            {
                DestinationTableName = GetTableName(t)
            };

            var table = new DataTable();

            var properties = t.GetProperties().Where(p => p.PropertyType.IsValueType || p.PropertyType == typeof(string));

            foreach (var property in properties)
            {
                Type propertyType = property.PropertyType;
                if (propertyType.IsGenericType &&
                    propertyType.GetGenericTypeDefinition() == typeof(Nullable<>))
                {
                    propertyType = Nullable.GetUnderlyingType(propertyType);
                }

                table.Columns.Add(new DataColumn(property.Name, propertyType));
            }

            foreach (var entity in entities)
            {
                table.Rows.Add(
                    properties.Select(property => property.GetValue(entity, null) ?? DBNull.Value).ToArray());
            }

            bulkCopy.BulkCopyTimeout = 0;
            bulkCopy.WriteToServer(table);
        }
    }

    public string GetTableName(Type type)
    {
        var metadata = ((IObjectContextAdapter)this).ObjectContext.MetadataWorkspace;
        var objectItemCollection = ((ObjectItemCollection)metadata.GetItemCollection(DataSpace.OSpace));

        var entityType = metadata
                .GetItems<EntityType>(DataSpace.OSpace)
                .Single(e => objectItemCollection.GetClrType(e) == type);

        var entitySet = metadata
            .GetItems<EntityContainer>(DataSpace.CSpace)
            .Single()
            .EntitySets
            .Single(s => s.ElementType.Name == entityType.Name);

        var mapping = metadata.GetItems<EntityContainerMapping>(DataSpace.CSSpace)
                .Single()
                .EntitySetMappings
                .Single(s => s.EntitySet == entitySet);

        var table = mapping
            .EntityTypeMappings.Single()
            .Fragments.Single()
            .StoreEntitySet;

        return (string)table.MetadataProperties["Table"].Value ?? table.Name;
    }
}

You can use it against any collection that implements IEnumerable, like this:

await context.BulkInsertAllAsync(items);
Guilherme
  • 4,518
  • 4
  • 33
  • 52
  • please complete your example code. where is bulkCopy – Seabizkit Jun 01 '18 at 14:06
  • 2
    It is already here: `await bulkCopy.WriteToServerAsync(table);` – Guilherme Sep 05 '18 at 13:43
  • Maybe i wasn't clear, in your write up, you suggest you made an extension...which i took to mean that no 3rd part lib was needed, when in fact in both methods use SqlBulkCopy lib. This entirely relies on SqlBulkCopy, when why i asked where does bulkCopy come from, its an extension lib which you wrote an extension lib on top of. Would of just made more sense to say here is how i used SqlBulkCopy lib. – Seabizkit Sep 12 '18 at 14:29
  • should use conn.OpenAsync in async version – Robert Jan 14 '19 at 08:41
  • @guiherme Am I correct that the SqlBulkCopy in your code is really the SqlClient.SqlBulkCopy class built in to .net? – SOHO Developer Jun 18 '20 at 00:29
  • @SOHODeveloper Yes, you are correct. This code uses Entity to gather information about the table and column names, and then generate a "default" bulk insert that can be used with SqlClient.SqlBulkCopy. – Guilherme Jun 18 '20 at 16:40
11

[2019 Update] EF Core 3.1

Following what has been said above, disabling AutoDetectChangesEnabled in EF Core worked perfectly: the insertion time was divided by 100 (from many minutes to a few seconds, for 10k records with cross-table relationships).

The updated code is:

context.ChangeTracker.AutoDetectChangesEnabled = false;
foreach (IRecord record in records) {
    //Add records to your database        
}
context.ChangeTracker.DetectChanges();
context.SaveChanges();
context.ChangeTracker.AutoDetectChangesEnabled = true; //do not forget to re-enable
julianstark999
  • 2,792
  • 1
  • 24
  • 34
XavierAM
  • 1,067
  • 9
  • 18
5

I have made a generic extension of @Slauma's example above:

public static class DataExtensions
{
    public static DbContext AddToContext<T>(this DbContext context, object entity, int count, int commitCount, bool recreateContext, Func<DbContext> contextCreator)
    {
        context.Set(typeof(T)).Add((T)entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                context.Dispose();
                context = contextCreator.Invoke();
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }
        return context;
    }
}

Usage:

public void AddEntities(List<YourEntity> entities)
{
    using (var transactionScope = new TransactionScope())
    {
        DbContext context = new YourContext();
        int count = 0;
        foreach (var entity in entities)
        {
            ++count;
            context = context.AddToContext<TenancyNote>(entity, count, 100, true,
                () => new YourContext());
        }
        context.SaveChanges();
        transactionScope.Complete();
    }
}
Sgedda
  • 909
  • 1
  • 13
  • 20
5

For one of the fastest ways to save a list, apply the following code:

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

AutoDetectChangesEnabled = false

Add, AddRange & SaveChanges: Doesn't detect changes.

ValidateOnSaveEnabled = false

Doesn't validate entities when SaveChanges is called.

You must add the NuGet package:

Install-Package Z.EntityFramework.Extensions

Now you can use the following code

var context = new MyContext();

context.Configuration.AutoDetectChangesEnabled = false;
context.Configuration.ValidateOnSaveEnabled = false;

context.BulkInsert(list);
context.BulkSaveChanges();
Reza Jenabi
  • 2,706
  • 18
  • 27
5

Try to use a stored procedure that receives an XML document containing the data you want to insert.
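
A minimal sketch of that approach (the Member entity, its properties, and the dbo.InsertMembersFromXml procedure are made up for illustration): build one XML document from the entities, pass it as a single parameter, and let the stored procedure shred it with nodes() into an INSERT ... SELECT.

// Requires System.Xml.Linq, System.Linq and System.Data.SqlClient.
// dbo.InsertMembersFromXml is a hypothetical procedure with a single XML parameter.
var xml = new XElement("Members",
    members.Select(m => new XElement("Member",
        new XElement("Firstname", m.Firstname),
        new XElement("Lastname", m.Lastname)))).ToString();

context.Database.ExecuteSqlCommand(
    "EXEC dbo.InsertMembersFromXml @xml",
    new SqlParameter("@xml", xml));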

Maxim
  • 6,930
  • 1
  • 28
  • 42
4

I'm looking for the fastest way of inserting into Entity Framework

There are some third-party libraries supporting Bulk Insert available:

  • Z.EntityFramework.Extensions (Recommended)
  • EFUtilities
  • EntityFramework.BulkInsert

See: Entity Framework Bulk Insert library

Be careful when choosing a bulk insert library. Only Entity Framework Extensions supports all kinds of associations and inheritance, and it's the only one still supported.


Disclaimer: I'm the owner of Entity Framework Extensions

This library allows you to perform all bulk operations you need for your scenarios:

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

Example

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(customers);
context.BulkInsert(customers);
context.BulkUpdate(customers);

// Customize Primary Key
context.BulkMerge(customers, operation => {
   operation.ColumnPrimaryKeyExpression = 
        customer => customer.Code;
});
Jonathan Magnan
  • 9,283
  • 2
  • 28
  • 53
  • 20
    this is a great extension but **not free**. – Okan Kocyigit Apr 01 '17 at 08:54
  • 2
    This answer is pretty good and [EntityFramework.BulkInsert](https://www.nuget.org/packages/EntityFramework.BulkInsert-ef6-ext) performs a bulk insertion of 15K rows in 1.5 seconds, works pretty nice for an internal process like a Windows Service. – Pastor Cortes Jun 27 '17 at 23:57
  • 5
    Yeah, 600$ for bulk insert. Totaly worth it. – eocron Jun 20 '18 at 09:16
  • 1
    @eocron Yeat it's worth it if you use it comercially. I don't see any problem with $600 for something that i don't have to spend hours on building it myself which will cost me a lot more than $600. Yes it costs money but looking at my hourly rate it is money well spend! – Jordy van Eijk Jul 13 '18 at 08:59
4

SqlBulkCopy is super quick

This is my implementation:

// at some point in my calling code, I will call:
var myDataTable = CreateMyDataTable();
myDataTable.Rows.Add(Guid.NewGuid(), tableHeaderId, theName, theValue); // e.g. - need this call for each row to insert

var efConnectionString = ConfigurationManager.ConnectionStrings["MyWebConfigEfConnection"].ConnectionString;
var efConnectionStringBuilder = new EntityConnectionStringBuilder(efConnectionString);
var connectionString = efConnectionStringBuilder.ProviderConnectionString;
BulkInsert(connectionString, myDataTable);

private DataTable CreateMyDataTable()
{
    var myDataTable = new DataTable { TableName = "MyTable" };
    // this table has an identity column - don't need to specify that
    myDataTable.Columns.Add("MyTableRecordGuid", typeof(Guid));
    myDataTable.Columns.Add("MyTableHeaderId", typeof(int));
    myDataTable.Columns.Add("ColumnName", typeof(string));
    myDataTable.Columns.Add("ColumnValue", typeof(string));
    return myDataTable;
}

private void BulkInsert(string connectionString, DataTable dataTable)
{
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        SqlTransaction transaction = null;
        try
        {
            transaction = connection.BeginTransaction();

            using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction))
            {
                sqlBulkCopy.DestinationTableName = dataTable.TableName;
                foreach (DataColumn column in dataTable.Columns) {
                    sqlBulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }

                sqlBulkCopy.WriteToServer(dataTable);
            }
            transaction.Commit();
        }
        catch (Exception)
        {
            transaction?.Rollback();
            throw;
        }
    }
}
Philip Johnson
  • 957
  • 9
  • 20
3

Use SqlBulkCopy:

void BulkInsert(GpsReceiverTrack[] gpsReceiverTracks)
{
    if (gpsReceiverTracks == null)
    {
        throw new ArgumentNullException(nameof(gpsReceiverTracks));
    }

    DataTable dataTable = new DataTable("GpsReceiverTracks");
    dataTable.Columns.Add("ID", typeof(int));
    dataTable.Columns.Add("DownloadedTrackID", typeof(int));
    dataTable.Columns.Add("Time", typeof(TimeSpan));
    dataTable.Columns.Add("Latitude", typeof(double));
    dataTable.Columns.Add("Longitude", typeof(double));
    dataTable.Columns.Add("Altitude", typeof(double));

    for (int i = 0; i < gpsReceiverTracks.Length; i++)
    {
        dataTable.Rows.Add
        (
            new object[]
            {
                    gpsReceiverTracks[i].ID,
                    gpsReceiverTracks[i].DownloadedTrackID,
                    gpsReceiverTracks[i].Time,
                    gpsReceiverTracks[i].Latitude,
                    gpsReceiverTracks[i].Longitude,
                    gpsReceiverTracks[i].Altitude
            }
        );
    }

    string connectionString = (new TeamTrackerEntities()).Database.Connection.ConnectionString;
    using (var connection = new SqlConnection(connectionString))
    {
        connection.Open();
        using (var transaction = connection.BeginTransaction())
        {
            using (var sqlBulkCopy = new SqlBulkCopy(connection, SqlBulkCopyOptions.TableLock, transaction))
            {
                sqlBulkCopy.DestinationTableName = dataTable.TableName;
                foreach (DataColumn column in dataTable.Columns)
                {
                    sqlBulkCopy.ColumnMappings.Add(column.ColumnName, column.ColumnName);
                }

                sqlBulkCopy.WriteToServer(dataTable);
            }
            transaction.Commit();
        }
    }

    return;
}
Amir Saniyan
  • 10,990
  • 18
  • 77
  • 124
2

Here is a performance comparison between using Entity Framework and using SqlBulkCopy class on a realistic example: How to Bulk Insert Complex Objects into SQL Server Database

As others have already emphasized, ORMs are not meant to be used for bulk operations. They offer flexibility, separation of concerns and other benefits, but bulk operations (except bulk reading) are not among them.

Zoran Horvat
  • 9,447
  • 3
  • 26
  • 39
2

Another option is to use SqlBulkTools, available from NuGet. It's very easy to use and has some powerful features.

Example:

var bulk = new BulkOperations();
var books = GetBooks();

using (TransactionScope trans = new TransactionScope())
{
    using (SqlConnection conn = new SqlConnection(ConfigurationManager
    .ConnectionStrings["SqlBulkToolsTest"].ConnectionString))
    {
        bulk.Setup<Book>()
            .ForCollection(books)
            .WithTable("Books") 
            .AddAllColumns()
            .BulkInsert()
            .Commit(conn);
    }

    trans.Complete();
}

See the documentation for more examples and advanced usage. Disclaimer: I am the author of this library and any views expressed are my own opinion.

Greg R Taylor
  • 2,695
  • 1
  • 21
  • 16
2

[NEW SOLUTION FOR POSTGRESQL] Hey, I know it's quite an old post, but I have recently run into a similar problem, except we were using PostgreSQL. I wanted to use an effective bulk insert, which turned out to be pretty difficult. I haven't found any proper free library for this DB. I have only found this helper: https://bytefish.de/blog/postgresql_bulk_insert/ which is also on NuGet. I have written a small mapper which auto-maps properties the same way Entity Framework does:

public static PostgreSQLCopyHelper<T> CreateHelper<T>(string schemaName, string tableName)
        {
            var helper = new PostgreSQLCopyHelper<T>(schemaName, "\"" + tableName + "\""); // use the schemaName parameter rather than hard-coding "dbo"
            var properties = typeof(T).GetProperties();
            foreach(var prop in properties)
            {
                var type = prop.PropertyType;
                if (Attribute.IsDefined(prop, typeof(KeyAttribute)) || Attribute.IsDefined(prop, typeof(ForeignKeyAttribute)))
                    continue;
                switch (type)
                {
                    case Type intType when intType == typeof(int) || intType == typeof(int?):
                        {
                            helper = helper.MapInteger("\"" + prop.Name + "\"",  x => (int?)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type stringType when stringType == typeof(string):
                        {
                            helper = helper.MapText("\"" + prop.Name + "\"", x => (string)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type dateType when dateType == typeof(DateTime) || dateType == typeof(DateTime?):
                        {
                            helper = helper.MapTimeStamp("\"" + prop.Name + "\"", x => (DateTime?)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type decimalType when decimalType == typeof(decimal) || decimalType == typeof(decimal?):
                        {
                            helper = helper.MapMoney("\"" + prop.Name + "\"", x => (decimal?)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type doubleType when doubleType == typeof(double) || doubleType == typeof(double?):
                        {
                            helper = helper.MapDouble("\"" + prop.Name + "\"", x => (double?)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type floatType when floatType == typeof(float) || floatType == typeof(float?):
                        {
                            helper = helper.MapReal("\"" + prop.Name + "\"", x => (float?)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                    case Type guidType when guidType == typeof(Guid):
                        {
                            helper = helper.MapUUID("\"" + prop.Name + "\"", x => (Guid)typeof(T).GetProperty(prop.Name).GetValue(x, null));
                            break;
                        }
                }
            }
            return helper;
        }

I use it the following way (I had an entity named Undertaking):

var undertakingHelper = BulkMapper.CreateHelper<Model.Undertaking>("dbo", nameof(Model.Undertaking));
undertakingHelper.SaveAll(transaction.UnderlyingTransaction.Connection as Npgsql.NpgsqlConnection, undertakingsToAdd);

I showed an example with a transaction, but it can also be done with a normal connection retrieved from the context. undertakingsToAdd is an enumerable of normal entity records that I want to bulk insert into the DB.

This solution, which I arrived at after a few hours of research and trying, is, as you could expect, much faster and finally easy to use and free! I really advise you to use this solution, not only for the reasons mentioned above, but also because it's the only one with which I had no problems with PostgreSQL itself; many other solutions only work flawlessly with SqlServer, for example.

Michał Pilarek
  • 141
  • 1
  • 10
2

Yes, SqlBulkCopy is indeed the fastest tool for this type of task. I wanted to find a "least effort" generic way for me in .NET Core, so I ended up using the great library from Marc Gravell called FastMember and writing one tiny extension method for the Entity Framework DB context. It works lightning fast:

using System.Collections.Generic;
using System.Linq;
using FastMember;
using Microsoft.Data.SqlClient;
using Microsoft.EntityFrameworkCore;

namespace Services.Extensions
{
    public static class DbContextExtensions
    {
        public static void BulkCopyToServer<T>(this DbContext db, IEnumerable<T> collection)
        {
            var messageEntityType = db.Model.FindEntityType(typeof(T));

            var tableName = messageEntityType.GetSchema() + "." + messageEntityType.GetTableName();
            var tableColumnMappings = messageEntityType.GetProperties()
                .ToDictionary(p => p.PropertyInfo.Name, p => p.GetColumnName());

            using (var connection = new SqlConnection(db.Database.GetDbConnection().ConnectionString))
            using (var bulkCopy = new SqlBulkCopy(connection))
            {
                foreach (var (field, column) in tableColumnMappings)
                {
                    bulkCopy.ColumnMappings.Add(field, column);
                }

                using (var reader = ObjectReader.Create(collection, tableColumnMappings.Keys.ToArray()))
                {
                    bulkCopy.DestinationTableName = tableName;
                    connection.Open();
                    bulkCopy.WriteToServer(reader);
                    connection.Close();
                }
            }
        }
    }
}
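
For completeness, a minimal usage sketch of the extension above; MyDbContext and Order are placeholder names I am assuming, and the entity must of course be mapped in the model so that FindEntityType can resolve it:

using (var db = new MyDbContext())
{
    // Build the rows in memory first (placeholder entity used purely for illustration).
    var orders = Enumerable.Range(1, 10000)
        .Select(i => new Order { Number = i, CreatedUtc = DateTime.UtcNow })
        .ToList();

    // One streamed bulk copy instead of one INSERT per entity via SaveChanges().
    db.BulkCopyToServer(orders);
}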
Michal Hosala
  • 5,145
  • 1
  • 18
  • 43
1

Have you ever tried to insert through a background worker or task?

In my case, I'm inserting 7760 records, distributed across 182 different tables with foreign key relationships (via navigation properties).

Without the task, it took two and a half minutes. Within a Task (Task.Factory.StartNew(...)), it took 15 seconds.

I'm only calling SaveChanges() after adding all the entities to the context (to ensure data integrity).
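
As the comments below note, DbContext is not thread-safe, so a cautious variant of this idea is to do only the expensive work (building the entity objects) in parallel tasks and keep every touch of the context on the calling thread. A rough sketch under that assumption; MyDbContext, MyEntities, MapToEntity and sourceRows are placeholder names, not from the original answer:

// Build entities in parallel (CPU-bound mapping only, no DbContext access here).
var buildTasks = sourceRows
    .Select(row => Task.Run(() => MapToEntity(row)))
    .ToArray();
Task.WaitAll(buildTasks);
var entities = buildTasks.Select(t => t.Result).ToList();

// Add and save on a single thread, with one SaveChanges at the end.
using (var context = new MyDbContext())
{
    context.Configuration.AutoDetectChangesEnabled = false;
    context.MyEntities.AddRange(entities);
    context.SaveChanges();
}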

Rafael A. M. S.
  • 619
  • 1
  • 6
  • 26
  • 2
    I am pretty sure that the context isn't thread safe. Do you have tests to ensure that all the entities were saved? – Danny Varod Jun 29 '13 at 09:58
  • I know the entire Entity Framework isn't thread safe at all, but I'm just adding the objects to the context and saving at the end... It's working perfectly here. – Rafael A. M. S. Jul 22 '13 at 12:53
  • So, You are calling DbContext.SaveChanges() in main thread, but adding entities to context is performed in background thread, right? – Prokurors Mar 31 '14 at 13:18
  • 1
    Yes, add data inside the threads; wait for all to finish; and Save Changes in main thread – Rafael A. M. S. Apr 01 '14 at 17:31
  • Although I think this way is dangerous and prone to mistakes, I find it very interesting. – Learner Jul 09 '14 at 10:03
1

None of the solutions written here help, because when you call SaveChanges(), insert statements are sent to the database one by one. That is how Entity Framework works.

And if your round trip to the database and back is 50 ms, for instance, then the time needed for the insert is the number of records x 50 ms: for the 4000+ records in the question that is already 4000 x 50 ms = 200 seconds of round trips alone.

You have to use BulkInsert; here is the link: https://efbulkinsert.codeplex.com/

I got insert time reduced from 5-6 minutes to 10-12 seconds by using it.
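
For reference, a minimal sketch of how such a bulk-insert extension is typically called; this assumes the EntityFramework.BulkInsert package, which exposes a BulkInsert extension method on DbContext - verify the exact namespace and overloads against the package you install:

using EntityFramework.BulkInsert.Extensions;   // assumed namespace of the extension

using (var context = new MyDbContext())
{
    // One bulk operation instead of one INSERT statement per entity.
    context.BulkInsert(entitiesToInsert);
}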

Aleksa
  • 2,608
  • 3
  • 26
  • 39
1

To my knowledge, there is no BulkInsert in Entity Framework to improve the performance of huge inserts.

In this scenario you can go with SqlBulkCopy in ADO.NET to solve your problem.
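
A minimal ADO.NET sketch of that idea; the table name, column names, connection string and entitiesToInsert collection are placeholders:

using System.Data;
using System.Data.SqlClient;

var table = new DataTable();
table.Columns.Add("Name", typeof(string));
table.Columns.Add("Address", typeof(string));
foreach (var e in entitiesToInsert)
    table.Rows.Add(e.Name, e.Address);

using (var bulkCopy = new SqlBulkCopy(connectionString))
{
    bulkCopy.DestinationTableName = "dbo.SomeTable";
    bulkCopy.BatchSize = 5000;          // tune to the workload
    bulkCopy.WriteToServer(table);      // single bulk load instead of row-by-row INSERTs
}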

anishMarokey
  • 10,805
  • 2
  • 31
  • 45
0

You may use the Bulk Insert package library. Bulk Insert version 1.0.0 is used in projects that have Entity Framework >= 6.0.0.

More description can be found here: Bulkoperation source code

Nadeem
  • 31
  • 1
  • 4
0

The secret is to insert into an identical blank staging table. Inserts there are lightning quick. Then run a single insert from that table into your main large table. Then truncate the staging table, ready for the next batch.

i.e.

-- Insert the batch into some_staging_table using Entity Framework.

-- Single insert into main table (this could be a tiny stored proc call)
insert into some_main_already_large_table (columns...)
   select (columns...) from some_staging_table;

truncate table some_staging_table;
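
A rough sketch of the whole round trip, assuming EF6; the staging DbSet, the batch collection and the column lists are placeholders carried over from the snippet above:

using (var context = new MyDbContext())
{
    context.Configuration.AutoDetectChangesEnabled = false;

    // 1. Insert the batch into the empty staging table via Entity Framework.
    context.SomeStagingTable.AddRange(batch);
    context.SaveChanges();

    // 2. Move the rows into the large main table and clear the staging table
    //    in one server-side call (this could also be a tiny stored procedure).
    context.Database.ExecuteSqlCommand(
        @"insert into some_main_already_large_table (columns...)
             select (columns...) from some_staging_table;
          truncate table some_staging_table;");
}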
Simon Hughes
  • 3,425
  • 3
  • 20
  • 44
  • Using EF, add all your records to an empty staging table. Then use SQL to insert into the main (large and slow) table in a *single* SQL instruction. Then empty your staging table. It's a very fast way of inserting a lot of data into an already large table. – Simon Hughes Jul 18 '12 at 10:41
  • 13
    When you say using EF, add the records to the staging table, did you actually try this with EF? Since EF issues a separate call to the database with each insert, I suspect you are going to see the same perf hit that the OP is trying to avoid. How does the staging table avoid this issue? – Jim Wooley Jul 17 '13 at 14:25
0

TL;DR I know this is an old post, but I have implemented a solution starting from one of those proposed here, extending it and solving some of its problems; I have also read the other solutions presented, and compared with these, my proposal seems much better suited to the requirements formulated in the original question.

In this solution I extend Slauma's approach, which I would say is perfect for the case proposed in the original question: using Entity Framework and TransactionScope for an expensive write operation on the db.

In Slauma's solution - which incidentally was a draft and was only used to get an idea of the speed of EF with a strategy to implement bulk inserts - there were problems due to:

  1. the timeout of the transaction (by default 1 minute, extendable via code to a maximum of 10 minutes);
  2. the duplication of the first block of data, with a width equal to the commit size used, at the end of the transaction (this problem is quite weird and is circumvented by means of a workaround).

I also extended the case study presented by Slauma by reporting an example that includes the insertion of several dependent entities in the same context.

The performance I was able to verify was 10K records/min, inserting into the db a block of 200K records of approximately 1 KB each. The speed was constant, there was no degradation in performance, and the test took about 20 minutes to run successfully.

The solution in detail

The method that presides over the bulk-insert operation, placed in an example repository class:

abstract class SomeRepository { 

    protected MyDbContext myDbContextRef;

    public void ImportData<TChild, TFather>(List<TChild> entities, TFather entityFather)
            where TChild : class, IEntityChild
            where TFather : class, IEntityFather
    {

        using (var scope = MyDbContext.CreateTransactionScope())
        {

            MyDbContext context = null;
            try
            {
                context = new MyDbContext(myDbContextRef.ConnectionString);

                context.Configuration.AutoDetectChangesEnabled = false;

                entityFather.BulkInsertResult = false;
                var fileEntity = context.Set<TFather>().Add(entityFather);
                context.SaveChanges();

                int count = 0;

                //avoids an issue with recreating context: EF duplicates the first commit block of data at the end of transaction!!
                context = MyDbContext.AddToContext<TChild>(context, null, 0, 1, true);

                foreach (var entityToInsert in entities)
                {
                    ++count;
                    entityToInsert.EntityFatherRefId = fileEntity.Id;
                    context = MyDbContext.AddToContext<TChild>(context, entityToInsert, count, 100, true);
                }

                entityFather.BulkInsertResult = true;
                context.Set<TFather>().Add(fileEntity);
                context.Entry<TFather>(fileEntity).State = EntityState.Modified;

                context.SaveChanges();
            }
            finally
            {
                if (context != null)
                    context.Dispose();
            }

            scope.Complete();
        }

    }

}

Interfaces used for example purposes only:

public interface IEntityChild {

    //some properties ...

    int EntityFatherRefId { get; set; }

}

public interface IEntityFather {

    int Id { get; set; }
    bool BulkInsertResult { get; set; }
}

The db context where I implemented the various elements of the solution as static methods:

public class MyDbContext : DbContext
{

    public string ConnectionString { get; set; }


    public MyDbContext(string nameOrConnectionString)
    : base(nameOrConnectionString)
    {
        Database.SetInitializer<MyDbContext>(null);
        ConnectionString = Database.Connection.ConnectionString;
    }


    /// <summary>
    /// Creates a TransactionScope raising timeout transaction to 30 minutes
    /// </summary>
    /// <param name="_isolationLevel"></param>
    /// <param name="timeout"></param>
    /// <remarks>
    /// It is possible to set isolation-level and timeout to different values. Pay close attention managing these 2 transactions working parameters.
    /// <para>Default TransactionScope values for isolation-level and timeout are the following:</para>
    /// <para>Default isolation-level is "Serializable"</para>
    /// <para>Default timeout ranges between 1 minute (default value if not specified a timeout) to max 10 minute (if not changed by code or updating max-timeout machine.config value)</para>
    /// </remarks>
    public static TransactionScope CreateTransactionScope(IsolationLevel _isolationLevel = IsolationLevel.Serializable, TimeSpan? timeout = null)
    {
        SetTransactionManagerField("_cachedMaxTimeout", true);
        SetTransactionManagerField("_maximumTimeout", timeout ?? TimeSpan.FromMinutes(30));

        var transactionOptions = new TransactionOptions();
        transactionOptions.IsolationLevel = _isolationLevel;
        transactionOptions.Timeout = TransactionManager.MaximumTimeout;
        return new TransactionScope(TransactionScopeOption.Required, transactionOptions);
    }

    private static void SetTransactionManagerField(string fieldName, object value)
    {
        typeof(TransactionManager).GetField(fieldName, BindingFlags.NonPublic | BindingFlags.Static).SetValue(null, value);
    }


    /// <summary>
    /// Adds a generic entity to a given context allowing commit on large block of data and improving performance to support db bulk-insert operations based on Entity Framework
    /// </summary>
    /// <typeparam name="T"></typeparam>
    /// <param name="context"></param>
    /// <param name="entity"></param>
    /// <param name="count"></param>
    /// <param name="commitCount">defines the block of data size</param>
    /// <param name="recreateContext"></param>
    /// <returns></returns>
    public static MyDbContext AddToContext<T>(MyDbContext context, T entity, int count, int commitCount, bool recreateContext) where T : class
    {
        if (entity != null)
            context.Set<T>().Add(entity);

        if (count % commitCount == 0)
        {
            context.SaveChanges();
            if (recreateContext)
            {
                var contextConnectionString = context.ConnectionString;
                context.Dispose();
                context = new MyDbContext(contextConnectionString);
                context.Configuration.AutoDetectChangesEnabled = false;
            }
        }

        return context;
    }
}
Ciro Corvino
  • 1,698
  • 5
  • 15
  • 29
-1

Configuration.LazyLoadingEnabled = false; Configuration.ProxyCreationEnabled = false;

These also have a big effect on speed, in addition to AutoDetectChangesEnabled = false; and I advise using a table prefix/schema other than dbo. Generally I use something like nop, sop, tbl, etc.
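
A small sketch of where those flags usually live in an EF6 context; the context and connection-string names are placeholders:

public class MyDbContext : DbContext
{
    public MyDbContext() : base("name=MyConnection")
    {
        // Turn off features that add per-entity overhead during large inserts.
        Configuration.LazyLoadingEnabled = false;
        Configuration.ProxyCreationEnabled = false;
        Configuration.AutoDetectChangesEnabled = false;
    }
}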

SelcukBah
  • 39
  • 5
-2

But for more than 4000 inserts I recommend using a stored procedure. I measured the elapsed time: I inserted 11,788 rows in 20 seconds.

Here is the code:

public void InsertDataBase(MyEntity entity)
{
    repository.Database.ExecuteSqlCommand(
        "sp_mystored @param1, @param2",
        new SqlParameter("@param1", entity.property1),
        new SqlParameter("@param2", entity.property2));
}
Marinpietri
  • 92
  • 3
  • 8
-2

Use a stored procedure that takes the input data in the form of XML to insert the data.

From your C# code, pass the insert data as XML.

E.g. in C#, the syntax would be like this:

object id_application = db.ExecuteScalar("procSaveApplication", xml);
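
A hedged sketch of what the C# side of such a call could look like with EF6; the XML shape, the applications collection and the context name are placeholders, and the db.ExecuteScalar helper in the line above depends on whichever data-access library is being used:

using System.Data;
using System.Data.SqlClient;
using System.Linq;
using System.Xml.Linq;

// Serialize the rows to insert into one XML fragment that the stored procedure
// can shred server-side (e.g. with @xml.nodes(...)). Names are illustrative only.
string xml = new XElement("applications",
    applications.Select(a =>
        new XElement("application",
            new XAttribute("name", a.Name),
            new XAttribute("address", a.Address)))).ToString();

using (var context = new MyDbContext())
{
    var param = new SqlParameter("@xml", SqlDbType.Xml) { Value = xml };

    // One round trip for the whole batch instead of one INSERT per record.
    context.Database.ExecuteSqlCommand("EXEC procSaveApplication @xml", param);
}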
arun tiwari
  • 121
  • 1
  • 3
  • 16
-6

Use this technique to increase the speed of inserting records with Entity Framework. Here I use a simple stored procedure to insert the records, and to execute this stored procedure I use the .FromSql() method of Entity Framework, which executes raw SQL.

The stored procedure code:

CREATE PROCEDURE TestProc
@FirstParam VARCHAR(50),
@SecondParam VARCHAR(50)

AS
  Insert into SomeTable(Name, Address) values(@FirstParam, @SecondParam) 
GO

Next, loop through all 4000 records and add the Entity Framework code which executes the stored procedure once every 100th iteration.

For this I build up a string query for the procedure, appending each record's EXEC statement to it.

Then I check whether the loop counter is a multiple of 100, and in that case execute the batch using .FromSql().

So for 4000 records I only have to execute the procedure 4000/100 = 40 times.

Check the below code:

string execQuery = "";
var context = new MyContext();
for (int i = 0; i < 4000; i++)
{
    // Append one EXEC statement per record; note the quoting and the ';' separator.
    execQuery += "EXEC TestProc @FirstParam = 'First" + i + "', @SecondParam = 'Second" + i + "'; ";

    // Flush the batch after every 100 appended statements.
    if ((i + 1) % 100 == 0)
    {
        // FromSql is deferred, so the batch is enumerated here to force execution.
        // For an INSERT-only batch, context.Database.ExecuteSqlCommand(execQuery)
        // avoids requiring the SQL to return Student-shaped rows.
        context.Student.FromSql(execQuery).ToList();
        execQuery = "";
    }
}
Community
  • 1
  • 1
Mallory H
  • 17
  • 3
  • This might be efficient but equivalent to NOT using entity framework. The OP question was how to maximize efficiency in the context of Entity Framework – kall2sollies Sep 13 '19 at 13:11
  • I have historically used this approach and it proved to be very efficient. Question to the community, are there any drawbacks to using this approach? – Timothy Macharia Jun 15 '20 at 18:42
  • Could be useful in a pinch. (e.g. One of our Production sites suddently ramped up hits recently, causing a scramble to increase throughput, both short term and long term). – Zeek2 Oct 02 '20 at 08:55