I have a background task (started by IHostedService
) that runs every 7 days, this task copies data into the same table only changing the PKs to generate dummy data (it's used for demos).
The issue is when it tries to save the new data for the second table (approx 2k new records) Entity Framework never completes the SaveChangesAsync
this process is blocked and it starts consuming the whole RAM available.
Note: The first table sometimes copies more than 20k new records.
This is my current code, I'm using Entity Framework Core 5.0.4 with .NET Core 3.1:
using (var context = _context.CreateNewInstance())
{
var existingStudents = context.Students.Where(s => s.UniversityId == _destUniversity.Id);
var sourceStudents = context.Students.Where(s => s.UniversityId == _sourceUniversity.Id)
.Select(s => new Student()
{
//...properties
});
var newStudents = sourceStudents.Where(s => !existingStudents.Any(es => es.DiffKey == s.DiffKey)).ToArray();
if (newStudents.Length == 0)
{
return;
}
await context.Students.AddRangeAsync(newStudents);
await context.SaveChangesAsync(_cancellationToken.Token);
}
I've tried all this:
- Disabling "Auto Detect Changes"
- Batch insert: this only works with the first batch.
- Use
IEnumerable
instead ofIQueryable
- A single context for all tables in the task
- New context instance for each table
- Compared the configuration of the databases/tables and replicated it on local
I have 2 servers (1 local, 1 remote) with 3 databases (1 local, 2 remote), the code works on two of the three databases (1 local and 1 remote).
I used this code to get the spid
of the task on SQL:
int psid;
using (var command = contextTmp.Database.GetDbConnection().CreateCommand())
{
command.CommandText = "select @@spid as id";
contextTmp.Database.OpenConnection();
using (var result = command.ExecuteReader())
{
result.Read();
psid = result.GetInt16("id");
}
contextTmp.Database.CloseConnection();
}
With the spid
from the previous code I've monitored the requests on the SQL Server side to check its status:
select session_id,
status,
command,
blocking_session_id,
wait_type,
wait_time,
last_wait_type,
wait_resource
from sys.dm_exec_requests
where session_id = @id
Result:
session_id | status | command | blocking_session_id | wait_type | wait_time | last_wait_type | wait_resource |
---|---|---|---|---|---|---|---|
84 | suspended | SELECT | 0 | ASYNC_NETWORK_IO | 35 | ASYNC_NETWORK_IO |
I've read blogs about this issue but I don't get it yet, I think I missing something. I ran out of ideas.
Other sources:
- DBA stackexchange
- Understanding how SQL Server executes a query
- SQL Server ASYNC Network IO Wait Type
- Database Connection Hazards with Entity Framework
UPDATE(1):
Before we've updated from .NET Core 2.2 to 3.1 all functions worked well. We started to see strange behaviors after the upgrade.
Any suggestions?