Multi Threading Entity Framework and MARS
Recently I’ve had some persistent exceptions coming from some backend worker processes running on Azure. These worker role jobs are multi threaded processes that constantly run as long as the service has been started. Entity Framework, via a repository and unit of work pattern (thank you Ninject) is the backing persistent data store.
This issue was encountered when we started adding more and more backend jobs, when in turn added, threads to the Resgrid system. Resgrid is a cloud based logistics and management system for first responder organization, like volunteer and career fire departments, EMS, public safety, search and rescue, HAZMAT and others. I started Resgrid in late 2012 with a partner, staxmanade.
When I first started using Entity Framework on the first errors encountered was:
System.InvalidOperationException: There is already an open DataReader associated with this Command which must be closed first.
Some very quick searching provided the answer, just add MultipleActiveResultSets=true to your connection and BAM problem solved. For the MVC Site, WebAPI, Tests and pretty much everything else, this works great. But I didn’t know what MultipleActiveResultSets, or MARS, was or what it’s drawbacks were.
MARS is not a feature of Entity Framework, but was added to SQL Server with SQL Server 2005, it just happens to help a lot with Entity Framework and lazy loading scenarios. MARS allows you to have multiple pending requests, say two reads, on a single SQL Server Connection. This saves the overhead of having to open another connection to the SQL server from the connection pool. Because of this MARS can improve performance when you have multiple concurrent operations trying to hit the database, the existing open connection is reused.
So what are the drawbacks of MARS? Well the primary one you will find on the Internet is that there is additional overhead creating a MARS connection as opposed to a standard ADO.Net SQL connection. So if your just going to execute a single query and close the connection the MARS connection overhead will have a performance impact.
Some errors we were seeing in our logs were:
An error was reported while committing a database transaction but it could not be determined whether the transaction succeeded or failed on the database server. See the inner exception and http://go.microsoft.com/fwlink/?LinkId=313468 for more information. ---> System.Data.Entity.Core.EntityException: The underlying provider failed on Commit. ---> System.Data.SqlClient.SqlException: The transaction operation cannot be performed because there are pending requests working on this transaction.
An error occurred while updating the entries. See the inner exception for details. ---> System.Data.SqlClient.SqlException: The server failed to resume the transaction. Desc:418000009d4. The transaction active in this session has been committed or aborted by another session.
The underlying provider failed on Open. ---> System.InvalidOperationException: The connection was not closed. The connection's current state is connecting
ExecuteNonQuery requires the command to have a transaction when the connection assigned to the command is in a pending local transaction. The Transaction property of the command has not been initialized.
An error occurred while executing the command definition. See the inner exception for details. ---> System.Data.SqlClient.SqlException: The server failed to resume the transaction.
Depending on the volume of the system we could get hundreds of errors a day, totally unacceptable. But almost impossible to track down. Every type of exception lead us down another path of possible solutions, with some people suggesting that when encountered you need to re-architect your data access strategy. Not all of those errors may be directly related to MARS, but they are included because they happen at the same time.
The issue seems to be that because MARS is enabled, the multiple threads were getting the same connection, even though a new DataContext was being created (Transnet or ThreadScoped). Because the same underlying connection was being shared, depending on the lifecycle of the other threads they would pile on the connection before it was ready, as it was being torn down, or while another transaction was in progress.
Disabling MARS for the workers seems to have fixed the issue, no more exceptions being generated. But now we still have our Lazy loading issue that MARS solved for us. Unfortunately the solution for this is to not lazy load or delay execution via IQueryable. Instead invoke your result set as soon as you can to try and avoid the active DataReader exception.