I have an application that (among other features) stores PDF documents in a byte-array column in a database. Entity Framework is used for all data access and is handled through a repository class. The EF container is stored in the repository class and persists as long as the repository object does.
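The repository is shaped roughly like this (a sketch; DocumentRepository and MyDataContainer are illustrative names, not the real ones):

public class DocumentRepository
{
    // The EF container lives for the lifetime of the repository instance.
    private readonly MyDataContainer dataContainer = new MyDataContainer();

    // AddDocument (shown below) and other data-access methods go here.
}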
I store each document into the database one at a time. I know this is less efficient than inserting in batches, but I have to do additional processing after each document is inserted.
What I cannot figure out is why this application uses so much memory, which slows it down considerably. I push in about 5,000 PDFs at a time. It runs very quickly for the first 500 or so and then slows to a crawl; at that point the memory usage of this console application is up to around 1.5 GB.
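For context, the ingest loop is essentially this (a sketch; pdfFolder, DocumentRepository, and ProcessDocument are hypothetical names standing in for my actual code):

string pdfFolder = @"C:\pdfs";   // hypothetical source folder
var repository = new DocumentRepository();   // holds the EF container

foreach (string path in Directory.GetFiles(pdfFolder, "*.pdf"))
{
    var document = new Document
    {
        Name = Path.GetFileNameWithoutExtension(path),
        Filename = Path.GetFileName(path),
        Data = File.ReadAllBytes(path)
    };

    if (repository.AddDocument(document))
    {
        ProcessDocument(document);   // hypothetical post-insert processing
    }
}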
Here is the repository method. SaveChanges() just calls the container's SaveChanges method and returns true or false depending on the result.
public bool AddDocument(Document document)
{
    dataContainer.Documents.Add(document);
    return SaveChanges();
}
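For completeness, SaveChanges() looks roughly like this (a sketch; the actual exception handling may differ):

private bool SaveChanges()
{
    try
    {
        dataContainer.SaveChanges();
        return true;
    }
    catch (Exception)
    {
        // The real code logs the failure before returning.
        return false;
    }
}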
The Document class is...
public partial class Document
{
    public Document()
    {
        this.Name = "";
        this.Filename = "";
    }

    public int Id { get; set; }
    public string Name { get; set; }
    public string Filename { get; set; }
    public byte[] Data { get; set; }
}
I have used the ANTS Memory Profiler and found that memory skyrockets during the .Add(document) call. I suspect some lazy loading is populating the Documents collection behind the scenes.
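One check I plan to add is counting how many Document entities the context is actually holding onto after a few hundred inserts (a diagnostic sketch, assuming dataContainer is a DbContext; an ObjectContext would use ObjectStateManager.GetObjectStateEntries instead):

// e.g. inside AddDocument, every 100 inserts
// (Count() requires a using System.Linq; directive):
int tracked = dataContainer.ChangeTracker.Entries<Document>().Count();
Console.WriteLine("Tracked Document entities: " + tracked);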
How can I keep the application's memory usage from growing out of control, in the hope that this will also bring the insert speed back up?