Q&A
Entity Framework Performance Monitoring and Tuning with Jim Wooley
Object Relational Mapping (ORM) technologies like Entity Framework can drastically simplify the programming model against databases, but when done carelessly, performance can suffer.
To keep that from happening, data developers use techniques such as monitoring, profiling database interactions and fine-tuning entity queries.
However, all the manuals, guides, books, videos and other resources available to help with those techniques are no match for real-world experience earned on the front lines. Jim Wooley, a solution architect at Slalom Consulting, is using such experience to share his expertise on getting the most out of Entity Framework at a five-day Visual Studio Live! developer conference next month in Nashville. The title of his presentation is "Entity Framework Performance Monitoring and Tuning."
"In this session, we'll explore some real-life strategies that have solved performance issues in production applications that I have encountered at a variety of customers," Wooley said. "We'll discuss the pros and cons of working without stored procs and cover strategies for monitoring and tuning your entity queries."
We caught up with Wooley to learn more in a short Q&A.
VisualStudioMagazine: How can Object Relational Mapping technologies like Entity Framework simplify the programming model against databases?
Wooley: ORM tools like Entity Framework (EF) reduce the need for developers to have to worry about managing the plumbing going between their application and the database and rather lets them focus on adding business value in a type-safe manner (through LINQ). Developers don't need to worry about managing database connections, updating stored procedures for each operation, or learning subtle differences between the various SQL dialects (TSQL, PLSQL, pgSQL, etc.). As long as the need is to fetch an object or object graph, make changes to that structure and update it, OR/M tools can easily handle the 80-90 percent use cases easily without causing performance concerns.
Additionally, as the platform continues to grow and the framework team discovers performance optimizations (like object serialization), developers can take advantage of performance improvements by just upgrading to the newest version and not need to change any of their application code. For example, one subtle case of performance hits in versions of EF prior to EF Core 6.0 happened when projecting child collections of a parent where EF added an "order by" clause to make object hydration faster. However, if the foreign key wasn't included in the covering index, including it in the "order by" clause could cause the execution plan to pick a sub-optimal index. EF Core 6 removed this limitation and as a result upgrading the dependency and re-compiling the application may now automagically see a performance improvement.
"Every version of Entity Framework Core has included performance enhancements over the previous one to the point where performance differences between EF, Dapper, and hand-rolled ADO code are negligible."
Jim Wooley, Solution Architect, Slalom Consulting
Every version of Entity Framework Core has included performance enhancements over the previous one to the point where performance differences between EF, Dapper, and hand-rolled ADO code are negligible.
What are some common careless mistakes that developers or database pros make when using EF that affect performance?
The number one biggest issue that I see in customer code is that they don't pay attention. While ORM tools reduce the need to manually manage the plumbing, developers still need to be watching for leaks that can cause major issues down the road.
For example, I once had a customer that was running into a performance issue that turned out to be caused by lazy loading children records four levels down in a nested loops. To make matters worse, the code was only logging the results (to the database) causing not only unnecessary reads, but also writes to the same database without adding other business value for the application. In general, be aware of trigger actions that cause hits to the database including foreach, First, Any, Sum, Count that are not part of a LINQ projection as they will frequently cause the database to be hit. Even returning a LINQ query as an IEnumberable result and trying to add additional filtering on that causes any part of the request after the IEnumerable to be run in memory rather than on the database. It can also result in the query being issued multiple times if you foreach over the IEnumerable multiple times.
Even when paying attention, you do have to know a bit about database tuning and indexing and the impact of your queries on indexes. For example, if you are using the .Include() method to fetch child records associated with the parent, that will include all of the child columns as part of the request and thus will ignore covering indexes that your DBA so carefully created and thus perform sub-optimally.
More subtle issues can arise when developers try to apply date calculations in Where clauses or mismatch ANSI strings with varchar data types. In both cases the indexing that you thought you were using may be unavailable and your stuck with a slow table scan as a result.
In all these cases, paying attention, identifying the poorly performing queries, and profiling them to determine what changes you need to make is the first step toward ensuring a well performing application. Oh, yeah, that and not using EF as an ETL tool to bulk insert or update records.
How do performance monitoring and tuning work together to improve things?
Having a test suite that you can use to monitor performance is the first step towards paying attention. With that you should be able to identify queries or sections of code that are performing poorly that need to be tuned. Tuning without monitoring often results in pre-mature optimizations. Even with EF in .Net Framework, while the queries generated by EF may have looked quite nasty, the execution plans generated by the database may have generated as good or in some cases better queries than you could yourself.
Only after identifying problematic queries can you look to tune, or in some cases re-think the task to achieve the same resulting goal. Performance tuning of applications is very much an iterative art and there's rarely a one-size-fits-all approach that will solve it. You have to try multiple strategies to figure out what works best for the case at hand but also recognize that shipping the application is a feature. Once you've found a solution that's good enough for your SLA, it may be time to move on to the next issue or add more business value.
Profiling database interactions through EF: What's involved and how does it help?
A number of free and commercial profiling tools exist depending on your data platform and environment. These range from SQL Profiler (or the corresponding extension to Azure Data Studio), Intellitrace, and Application Insights, to third-party plugins like EFProf, OrmProfiler and MiniProfiler. EF also includes hooks to add logging functions that can be pushed to your logging platform of choice.
In some cases, the tooling support can be added without modifying your application code, while others require small hooks be placed into the application to wire up the interception. Additionally, some of the later tools add support for more easily finding the line of code causing the request. This can be invaluable in trying to find our where a pesky poorly performing query originated from. The earlier tools will only trace the SQL statements issued against the database. In those cases, it's up to the developer to try to determine where the request was issued in code.
What are a couple of pros and cons of working without stored procedures?
Many proponents of stored procedures tout the ability that they bring in terms of query compilation and caching, and parameterization to avoid SQL injection security considerations. I've seen many customers who have built dynamic queries inside of their stored procedures that completely eliminate those benefits. EF on the other hand generates queries that can be cached and are guaranteed to be parameterized (unless you use FromSqlRaw). As a result, EF usage can be just as performant and secure (from an injection perspective) as stored procedures.
Additionally, by not forcing a dependence on stored procedures, you gain the benefit of being able to swap database platforms by changing out the provider rather than having to re-write all of your queries. There's less code to have to maintain as well. For example, consider when adding a single column to a table. With EF, you only add a single property to an entity model. Stored procedures would require similar application changes, but also require table schema change management and multiple stored procedure changes (for the CRUD operations). In general, the less amount of code that you write, the easier the system is to maintain.
Sometimes using stored procedures is necessary. EF and in particular LINQ are abstractions over common data functionality. Some platforms offer specific features that LINQ may not support (query hints, hierarchy, geospatial, bulk copy, and so on) and may require custom code. Newer versions of EF have closed the gap on some of the provider-specific structures, but some gaps remain. If you follow the 80/20 rule in using EF where possible and drop down to stored procedures or other platform specific functions where necessary, you can often land with an approach that works well for the application scenarios.
In your consulting experience, what are some common pain points organizations or teams experience in getting started with EF?
From a consulting perspective, the issue most often isn't a technology one. It's a people problem. For new applications, adopting new technologies is often easier. For existing applications that have built up over years, often it's a case of trusting that a general-purpose framework can be as good (or better) than one that has been home grown. Some of these applications were purpose built to pass around ADO structures like DataSets and DataTables all the way to the UI. Trying to make an ROI justification for modernizing the data layer can often be difficult in these circumstances. Often I find advocating for using EF in new modules or service layers to be an easier lift than trying to retrofit an existing application.
The other typical pain point I see also comes down to trust between the engineering teams and support teams (often developers vs. database administrators). The support teams are the ones called at 3 a.m. when the system is crashing due to a poorly performing query that is hogging the database server's CPU. The argument here is that if the query was abstracted into a stored procedure, they would have control over modifying it rather than having to get the developers to make code changes. This could be mitigated by having performance/integration test suites with the appropriate coverage and reviewing of the generated queries through the development cycles in lower environments. This approach requires communication across the teams to establish better trust over time.
About the Author
David Ramel is an editor and writer at Converge 360.