Practical .NET

Speeding Up Queries: How to Think About Query Plans

Often, designing databases and speeding up SQL Server queries are treated as abstract problems: a "find this problem, apply this solution" approach to problem solving. In practice, picking the right solution requires a deep understanding of how your application and its users actually interact with your data. A generic solution that's applied blindly can often just create new and more exciting problems. A good example of this is looking for solutions to manage query plans.

Planning Queries
It's worth recognizing that many SQL statements aren't, in fact, planned in any meaningful sense. One of the first things the SQL Server optimizer does when it sees a query is determine whether that query is "trivial." If a query is deemed trivial, the optimizer skips most of the planning process.
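
As a rough illustration (the table names here are hypothetical), a single-table lookup on a unique key has only one sensible plan and will typically be treated as trivial, while a query with a join and multiple predicates gives the optimizer real choices to weigh:

  -- Likely "trivial": one table, unique-key lookup, only one sensible plan
  SELECT Name
  FROM Customers
  WHERE CustomerId = 42;

  -- Not trivial: the join and the predicate give the optimizer choices
  -- (join order, join algorithm, which index to use)
  SELECT c.Name, o.Total
  FROM Customers AS c
  JOIN Orders AS o ON o.CustomerId = c.CustomerId
  WHERE o.Total > 1000;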

However, if a query is planned, the optimizer recognizes that it would be silly to plan it again. This is a good thing because, in our applications, the same queries do keep coming around again and again. So, once a plan is created, SQL Server flags it with a couple of identifiers that tie the plan to that query (or one very much like it) and puts that plan in a cache. When that query (or one very much like it) shows up again, then the query optimizer can skip the planning process, as it does with trivial plans.
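
If you want to see this happening, SQL Server exposes the cache through dynamic management views. Here's a minimal sketch that lists cached plans and how often each one has been reused:

  SELECT cp.usecounts,  -- how many times the cached plan has been reused
         cp.objtype,    -- e.g. 'Proc', 'Prepared', 'Adhoc'
         st.text        -- the SQL statement the plan belongs to
  FROM sys.dm_exec_cached_plans AS cp
  CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) AS st
  ORDER BY cp.usecounts DESC;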

By the way, this is worth remembering when someone tells you that a major benefit of using stored procedures is that they don't require any planning. In fact, SQL Server plans queries (whether they're in stored procedures or not) only if they're (a) worth planning and (b) running for the first time. Over time, as plans build up in the cache, only a small number of queries will require planning (ideally, less than 15 percent).

This is obviously a good thing ... but at least two things can go wrong.

Unfortunate Parameters
For example, imagine an index on the Gender column of a table listing all valid drivers in Saudi Arabia. It's only recently that women have been allowed to drive in Saudi Arabia, so the number of rows for women is going to be very small. An index on the Gender column is, obviously, a great index to use to find all the women drivers -- that index will reduce the number of rows to process to some very small number. Equally obviously, an index on the Gender column would be a terrible index to use to find male drivers because it would return virtually the whole table -- SQL Server would be better off just scanning the whole table.
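
Here's a minimal sketch of that scenario (the table and index names are hypothetical):

  -- Hypothetical table: almost every row is 'M', very few are 'F'
  CREATE TABLE Drivers (
      DriverId INT IDENTITY PRIMARY KEY,
      FullName NVARCHAR(100) NOT NULL,
      Gender   CHAR(1) NOT NULL
  );
  CREATE INDEX IX_Drivers_Gender ON Drivers (Gender);

  -- A great use of the index: touches only a handful of rows
  SELECT DriverId, FullName FROM Drivers WHERE Gender = 'F';

  -- A terrible use of the index: returns virtually the whole table,
  -- so a scan would be cheaper
  SELECT DriverId, FullName FROM Drivers WHERE Gender = 'M';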

But let's say that the first query used against the table is one to find all the women drivers. A plan is generated and added to the cache ... and will be used the next time the query is executed. If this query is usually used to find women drivers, this isn't a problem. If, however, the query is used later to find the male drivers, then only the value used against the Gender column will be changed. As a result, the plan for the women drivers will probably be used again ... and execution time is going to suffer.
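
Wrapped in a parameterized stored procedure (continuing the hypothetical example), the problem looks like this: the plan compiled for the first value is silently reused for the second:

  CREATE PROCEDURE GetDriversByGender @Gender CHAR(1)
  AS
  SELECT DriverId, FullName
  FROM Drivers
  WHERE Gender = @Gender;
  GO

  EXEC GetDriversByGender 'F';  -- plan is compiled and cached for the rare value
  EXEC GetDriversByGender 'M';  -- reuses the 'F' plan ... and runs slowly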

Much, therefore, depends on whether the query that generates the plan is a "typical" example of that query, using "typical" parameter values. After all, if you don't search for male drivers very often, then it doesn't matter if that query runs slowly, provided the "typical" query for female drivers runs quickly.

Understanding Your Data
There are three ways to address this problem, but picking the right one depends on how your application uses the data.

First, do nothing. If the most common version of the query is also the version the optimizer is most likely to see first, then the plan that's generated will give the best results for the "typical" query. Queries using "atypical" parameters will run slowly, but life's just like that.

Second, use Optimize For to specify the parameter value to be used when planning the query. Optimize For lets you pick a value that represents the "typical" query so that the plan will be optimized for the most common runs. This makes sense if the "atypical" values are used frequently enough that they may be the version used to generate the plan (but aren't used frequently enough to be allowed to control the plan).
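
Applied to the hypothetical procedure from earlier, Optimize For looks like this: whatever value the procedure is actually called with, the plan is built as if @Gender were 'F':

  ALTER PROCEDURE GetDriversByGender @Gender CHAR(1)
  AS
  SELECT DriverId, FullName
  FROM Drivers
  WHERE Gender = @Gender
  OPTION (OPTIMIZE FOR (@Gender = 'F'));  -- always plan for the "typical" value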

Third, use With Recompile with your query. This causes the query to be re-planned each time it runs. The assumption here is that the cost of re-planning is less than the cost of running with an "atypical" parameter. This makes sense if queries with "atypical" parameters crop up frequently enough that you can't afford to let them run slowly.
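
Sketched two ways: With Recompile at the procedure level re-plans the whole procedure on every call, while Option (Recompile) at the statement level re-plans just that statement using the actual parameter value:

  -- Procedure-level: the whole procedure is re-planned on every call
  ALTER PROCEDURE GetDriversByGender @Gender CHAR(1)
  WITH RECOMPILE
  AS
  SELECT DriverId, FullName
  FROM Drivers
  WHERE Gender = @Gender;
  GO

  -- Statement-level alternative: re-plan just this statement each time,
  -- using the actual value of the variable
  DECLARE @Gender CHAR(1) = 'M';
  SELECT DriverId, FullName
  FROM Drivers
  WHERE Gender = @Gender
  OPTION (RECOMPILE);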

My point here is that there isn't an abstract "right" answer. It's only by understanding both which parameter values aren't "typical" and how "typical" each query is that you can make the right decisions. In other words, knowing your users is every bit as important as knowing your technology.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.
