
In one of my application methods I'm dealing with a lot of entries that I have to save at the same time:

foreach (CategoryType category in apiCall.CategoryList)
{
    EBAY_CATEGORY itemToAdd = new EBAY_CATEGORY
        {
            CATEGORY_ID = int.Parse(category.CategoryID),
            CATEGORY_LEVEL = category.CategoryLevel,
            NAME = category.CategoryName,
            LEAF = category.LeafCategory,
            EXPIRED = category.Expired,
            VIRTUAL = category.Virtual
        };

    // TryParse: int.Parse never returns null and throws on non-numeric input
    int parentId;
    if (int.TryParse(category.CategoryParentID[0], out parentId))
    {
        itemToAdd.PARENTID = parentId;
    }

    db.EBAY_CATEGORY.Add(itemToAdd);

    // Redundant: Add() already marks the entity as Added
    db.Entry(itemToAdd).State = EntityState.Added;

    // One round trip to the database per entity, which is what makes the loop slow
    db.SaveChanges();
}

(Yup, I'm currently dealing with eBay.) By the time this loop starts, I have already downloaded up to about 20,000 items into a collection of complex objects.

So, as you can see, I use a foreach to make sure each entry is saved to my database, but with this many operations it also takes a lot of time. What would be the best way to deal with this collection of items?

EDIT: So, following some very useful suggestions below, here's my new code!

using (TransactionScope scope = new TransactionScope())
{
    MyContext db = null;

    try
    {
        db = new MyContext();

        db.Database.Connection.Open();

        db.Configuration.AutoDetectChangesEnabled = false;
        db.Configuration.ValidateOnSaveEnabled = false;

        int count = 0;

        ApiContext context = GeteBayApiContext();

        GetCategoriesCall apiCall = new GetCategoriesCall(context)
        {
            EnableCompression = true,
            ViewAllNodes = true
        };

        apiCall.DetailLevelList.Add(DetailLevelCodeType.ReturnAll);

        apiCall.GetCategories();

        foreach (CategoryType category in apiCall.CategoryList)
        {
            EBAY_CATEGORY itemToAdd = new EBAY_CATEGORY
            {
                CATEGORY_ID = int.Parse(category.CategoryID),
                CATEGORY_LEVEL = category.CategoryLevel,
                NAME = category.CategoryName,
                LEAF = category.LeafCategory,
                EXPIRED = category.Expired,
                VIRTUAL = category.Virtual
            };

            // As above: TryParse instead of a null check that can never fail
            int parentId;
            if (int.TryParse(category.CategoryParentID[0], out parentId))
            {
                itemToAdd.PARENTID = parentId;
            }
            count++;
            db = AddToContext(db, itemToAdd, count, 2000, true);
        }

        db.SaveChanges();
    }
    finally 
    {
        if (db != null)
        {
            db.Dispose();
        }
    }

    scope.Complete();
}

And the AddToContext method:

private static MyContext AddToContext(MyContext context, EBAY_CATEGORY itemToAdd,
    int count, int commitCount, bool recreateContext)
{
    context.Set<EBAY_CATEGORY>().Add(itemToAdd);

    // Flush to the database every commitCount entities
    if (count % commitCount == 0)
    {
        context.SaveChanges();

        // Recreate the context so the change tracker does not keep growing
        if (recreateContext)
        {
            context.Dispose();
            context = new MyContext();
            context.Configuration.AutoDetectChangesEnabled = false;
            context.Configuration.ValidateOnSaveEnabled = false;
        }
    }

    return context;
}

It works fine and is a lot faster, but each time the SaveChanges method is called after a batch of data has been inserted, I get the following error:

The transaction associated with the current connection has completed but has not been disposed.  The transaction must be disposed before the connection can be used to execute SQL statements.

This occurs after a batch of 2,000 entries, or sometimes after a few batches of 100 entries. I don't understand what's wrong.

hsim

2 Answers


Do not call db.SaveChanges() inside the loop; wrap the inserts in a database transaction instead.

You may get an OutOfMemoryException if you accumulate too many uncommitted changes, as shown here: Fastest Way of Inserting in Entity Framework.

So you may have to use a modulo check to call db.SaveChanges() every x entries. Whether you need that at all depends on how large your data is.
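
A rough sketch of that pattern, assuming the MyContext and EBAY_CATEGORY types from the question (the helper name and commitCount value are illustrative, not the asker's exact code):

using System.Collections.Generic;
using System.Transactions;

// One transaction around the whole run, flushing every commitCount entities
static void SaveInBatches(IEnumerable<EBAY_CATEGORY> items, int commitCount)
{
    using (TransactionScope scope = new TransactionScope())
    using (MyContext db = new MyContext())
    {
        // Skip per-Add change detection and validation during bulk work
        db.Configuration.AutoDetectChangesEnabled = false;
        db.Configuration.ValidateOnSaveEnabled = false;

        int count = 0;
        foreach (EBAY_CATEGORY item in items)
        {
            db.EBAY_CATEGORY.Add(item);

            // Flush every commitCount entities instead of once per entity
            if (++count % commitCount == 0)
            {
                db.SaveChanges();
            }
        }

        db.SaveChanges();  // flush the final partial batch
        scope.Complete();  // commit everything as a single transaction
    }
}

Keep in mind that a long run inside a single TransactionScope can hit the scope's default timeout; see the comments below.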

BlueM
  • That would speed up things that much? – hsim Mar 17 '14 at 17:51
  • For sure. It will allow the framework to batch insert data instead of firing one command after another to the database for each table entry. – BlueM Mar 17 '14 at 17:53
  • So you are suggesting that I wrap everything in a single transaction and call `db.SaveChanges()` after the scope has completed, or right before? – hsim Mar 17 '14 at 17:54
  • A perfect answer can be found at http://stackoverflow.com/questions/5940225/fastest-way-of-inserting-in-entity-framework – BlueM Mar 17 '14 at 17:55
  • Thanks for the information, that's a nice way of dealing with huge collections! – hsim Mar 17 '14 at 18:05
  • It does speed things up a lot, but I have a problem. I'll update the code above with the trouble. The following error: `The transaction associated with the current connection has completed but has not been disposed. The transaction must be disposed before the connection can be used to execute SQL statements.` shows up after a few SaveChanges calls. – hsim Mar 17 '14 at 18:22
  • Done up there, have a look. I have updated the code, there was a typo in it. – hsim Mar 17 '14 at 18:26
  • 1
    Did you try to increase the transaction timeout? `using (TransactionScope scope = new TransactionScope(TransactionScopeOption.Required, new System.TimeSpan(0, 15, 0)))` – BlueM Mar 17 '14 at 18:29
  • Nope, I can try this. – hsim Mar 17 '14 at 18:30
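
In context, that overload only changes how the scope is constructed. A minimal sketch, assuming the rest of the body stays as in the question's updated code:

using System;
using System.Transactions;

// Sketch: give the ambient transaction an explicit 15-minute timeout. The
// default TransactionScope timeout (about one minute) is what the
// "transaction ... has completed but has not been disposed" error points
// to on long batch runs.
using (TransactionScope scope = new TransactionScope(
           TransactionScopeOption.Required, new TimeSpan(0, 15, 0)))
{
    // ... same body as in the question: create MyContext, loop with
    // AddToContext, final SaveChanges ...
    scope.Complete();
}

Note that values above TransactionManager.MaximumTimeout (10 minutes by default) are silently capped unless that limit is raised in machine.config.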

You can do database updates in batches to increase the speed. I'm not sure which database you are using here, but most allow fairly large batches.
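
For example, with SQL Server one option is SqlBulkCopy, which bypasses the EF change tracker entirely. A sketch, assuming the table and column names match the EBAY_CATEGORY entity from the question and that connectionString points at the target database:

using System.Data;
using System.Data.SqlClient;

// Sketch: stream rows into the EBAY_CATEGORY table in large batches.
// Column types are assumptions based on the entity in the question.
static void BulkInsertCategories(string connectionString, DataTable rows)
{
    using (SqlBulkCopy bulkCopy = new SqlBulkCopy(connectionString))
    {
        bulkCopy.DestinationTableName = "EBAY_CATEGORY";
        bulkCopy.BatchSize = 5000;    // rows per round trip
        bulkCopy.BulkCopyTimeout = 0; // disable the command timeout for large loads
        bulkCopy.WriteToServer(rows);
    }
}

The DataTable would be filled from apiCall.CategoryList, one column per mapped property (CATEGORY_ID, CATEGORY_LEVEL, NAME, and so on).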

Ben Echols