1

I have written a method to bulk update/delete/insert rows into a database table using Entity Framework. I have pasted the code below. The DBTable has 23 columns and the DBTableRow is a class which has properties that map to each column of the DBTable. The input parameters to the method are IEnumerables, on which some comparison is done using a custom equality comparer to arrive at the list of rows that need to be added, deleted or modified. Typically, the enumerable can get to a size of 50000-60000.

The problem I am facing is with the method's slowness. For a net of 200 rows (across all operations - add, delete and update), it takes 30 minutes. For a net of 2000 rows, it has taken close to 6 hours and hasn't finished yet.

Could the experts point out the performance bottlenecks in the code? Thanks a lot in advance...

private void InsertIntoDB(DbContext dbContext, IEnumerable<DBTableRow> fromLatestB, IEnumerable<DBTableRow> olderB,
                                             IEnumerable<DBTableRow> toBeAddedB, IEnumerable<DBTableRow> toBeDeletedB,
                                             IEnumerable<DBTableRow> toBeModifiedB, IQueryable<int> listMultiple)
{
    dbContext.Configuration.AutoDetectChangesEnabled = false;
    dbContext.Configuration.ValidateOnSaveEnabled = false;

    int TypeId = 30;    

    if (toBeAddedB != null && toBeAddedB.Any())
        toBeAddedB.ToList().ForEach(s => dbContext.DBTable.Add(s));
    if (toBeDeletedB != null && toBeDeletedB.Any())
    {
        toBeDeletedB.ToList().ForEach(s =>
        {
            if (s.Type == TypeId)
            {
                var rlRows = dbContext.DBTable.Where(x => x.Type == TypeId && x.Url.Equals(s.Url, StringComparison.OrdinalIgnoreCase));
                if (rlRows != null && rlRows.Any())
                {
                    rlRows.ToList().ForEach(y =>
                    {
                        if (dbContext.Entry(y).State == EntityState.Detached)
                            dbContext.DBTable.Attach(y);

                            dbContext.DBTable.Remove(y);
                    });
                }
            }
            else
            {
                dbContext.DBTable.Attach(s);
                dbContext.DBTable.Remove(s);
            }
        });
    }
    if (toBeModifiedB != null && toBeModifiedB.Any())
    {
        var eqComp = new CustomEqualityComparer(listMultiple);
        var toBeModifiedNew = fromLatestB.Intersect(olderB, new CustomEqualityComparer(true, listMultiple));
        toBeModifiedB.ToList().ForEach(x =>
        {
            var rowNew = ReturnRowFromModifiedNewList();

            if (rowNew != null)
            {
                x.Type = rowNew.Type;
                x.Url = rowNew.Url;
                x.Data = rowNew.Data;
                x.LastModified = DateTime.UtcNow;

                dbContext.Entry(x).State = EntityState.Modified;
            }
        });
    }

    dbContext.SaveChanges();

    dbContext.Configuration.AutoDetectChangesEnabled = true;
    dbContext.Configuration.ValidateOnSaveEnabled = true;
}
mantadt
  • 91
  • 1
  • 13
  • To insert large amount of data you need to use Bulk operations as described here http://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework – Sam Zakhezin Mar 10 '17 at 13:44

2 Answers2

3

Any

The Any method look great since you check if the enumerable contains entities but are normally very bad on enumerable since you may enumerate more than once.

By example, in the delete part, two database round trip is required.

  • Once for the Any method
  • Once for the ToList method

Example:

if (toBeDeletedB != null && toBeDeletedB.Any())
{
    toBeDeletedB.ToList().ForEach(s =>

So perform the ToList before calling the Any method

if (toBeDeletedB != null)
{
    var toBeDeletedBList = toBeDeletedB.ToList();

    toBeDeletedBList.ForEach(s => ...

The same mistake can happen everywhere you are using the Any method.

toBeAdded

Everything seems perfect here.

Because you set AutoDetectChangesEnabled to false, Add && AddRange will provide around the same performance.

toBeDeleted

For every entity you delete, you make a database round-trip (twice since you use Any)

This line is a performance issue:

var rlRows = dbContext.DBTable.Where(x => x.Type == TypeId && x.Url.Equals(s.Url, StringComparison.OrdinalIgnoreCase));

You should instead:

  • Take all items with the Type == TypeId from toBeDeletedB
  • Use contains methods or similar to reduce database round-trip required. The contains method may not work depending on collations

Example

var toBeDeletedBList = toBeDeletedB.ToList();

var listA = toBeDeletedBList.Where(x => x.Type == TypeId);
var listB = toBeDeletedBList.Where(x => x.Type != TypeId);

var rlRows = dbContext.DBTable.Where(x => x.Type == typeId && listA.Contains(s.Url);

listB.ForEach(s => {
    dbContext.DBTable.Attach(s);
    dbContext.DBTable.Remove(s);
});

toBeModified

I'm not sure exactly what the CustomEqualityComparer method do, but again, you may have an issue with performing multiple queries on the listMultiple IQueryable.

SaveChanges

For every entity you need to insert, update, or delete, a database round-trip is performed.

So you if need to perform operation on 50000 rows, 50000 database round-trip is performed with is INSANE

Disclaimer: I'm the owner of Entity Framework Extensions

This library allows you to perform bulk operations and improve performance.

By example, the BulkSaveChanges is exactly like SaveChanges but way faster by dramatically reducing the database round-trip required.

  • Bulk SaveChanges
  • Bulk Insert
  • Bulk Delete
  • Bulk Update
  • Bulk Merge

Example

// Easy to use
context.BulkSaveChanges();

// Easy to customize
context.BulkSaveChanges(bulk => bulk.BatchSize = 100);

// Perform Bulk Operations
context.BulkDelete(endItems);
context.BulkInsert(endItems);
context.BulkUpdate(endItems);

// Customize Primary Key
context.BulkMerge(endItems, operation => {
   operation.ColumnPrimaryKeyExpression = 
        endItem => endItem.Code;
});
Jonathan Magnan
  • 9,283
  • 2
  • 28
  • 53
  • 1
    Thank you @Jonathan for the detailed solution. Much appreciated. I will take a look and try to implement the review comments suggested. – mantadt Mar 15 '17 at 13:17
  • Thanks @jonathan-magnan for the pointer about multiple enumerations. The function **ReturnRowFromModifiedNewList()** in the **toBeModified** section had the same enumeration getting executed for every iteration of the **toBeModifiedB** list object(this list could get as big as 2000-10000). A serious blunder!! I am accepting this answer. – mantadt Mar 15 '17 at 15:10
1

Have you looked at using the AddRange() & RemoveRange() methods? New in EF6 I believe.

John McArthur
  • 732
  • 1
  • 11
  • 24
  • Thank you @john. I will have a look at these methods. – mantadt Mar 15 '17 at 13:10
  • AddRange and RemoveRange will still do multiple calls to the database (one per record), entity framework by default cannot handle bulk inserts or deletes – pau Fer Jul 14 '20 at 07:30