In-Depth

Five Tips to Improve LINQ to SQL Performance

LINQ to SQL is a powerful technology that can do as much harm as good if it is mis-used. Here is how to get more out of your LINQ to SQL efforts.

Remember the old adage, "The greatest strength of the C language is also its greatest weakness?" It sounds like a paradox, but is absolutely true. The low-level nature of C would let you bend the rules of space and time to write a program, but if you asked to shoot yourself in the foot, it would also gladly hand you a gun with a smile and step out of the way.

LINQ to SQL is much the same way. It's a powerful tool but amazingly simple to set up and use. Like C, however, its greatest strength -- abstracting out the complexities of SQL -- can also be its greatest weakness. It's very easy to write database access code with LINQ to SQL. It's just as easy to write bad database access code with LINQ to SQL that doesn't perform well. In this article, I'll offer five tips to help you avoid accidentally asking LINQ to SQL to shoot you in the foot.

Tip #1: Ditch the Extra Baggage with ObjectTrackingEnabled
I'm a hiker, so I like hiking analogies.

In Colorado we have 54 mountains that exceed 14,000 feet in elevation. We call them "14ers."

When I'm preparing for a 14er hike, I pack a lot more gear than I would for a short hike within a few minutes of my house. The upside of all that extra packing is I'm prepared for anything that comes up. The downside, however, is that I'm carrying extra weight around on my hike, which slows me down a little and takes its toll on my back and shoulders. On the other hand, if I'm taking a short hike, I'll probably just grab the dog and go. No heavy pack to carry.

LINQ to SQL, by default, behaves like it's going on a 14er hike. It carries every provision it thinks it'll ever need, and your application carries that extra weight. If you know you're only going for a short hike, why not tell it to leave the pack at home?

That's where the ObjectTrackingEnabled property of the DataContext comes in. By default this property is true, which causes LINQ to SQL to keep track of every change you make to your data in case you want to save those changes later. If you know you're only reading data and won't be changing it, I suggest setting this property to false. Then all of that change-tracking overhead goes away.

The performance boost does come at a small price, however. Deferred loading will no longer work. You'll have to use LoadWith or manually extract related data yourself using projections.

Tip #2: Slim Down Your Queries with Projections
Let's say I have a Customer table with 20 fields, but I'm only interested in three: FirstName, LastName, and Email. What do you suppose LINQ to SQL will do if I execute this code?

var customer =
	(from cust in dataContext.Customers
	 where cust.ID == customerID
	 select cust).Single();
var customerLite = new {
	customer.FirstName,
	customer.LastName,
	customer.Email
};

The answer is it'll happily issue a 'SELECT' statement asking for all 20 fields in the Customer table so it can populate every field of the Customer entity it gives me back (even though I only care about three fields). That seems a bit wasteful, don't you agree?

Now suppose I re-wrote the code to look like this:

var customerLite =
	(from cust in dataContext.Customers
	 where cust.ID == customerID
	 select new {
		customer.FirstName,
		customer.LastName,
		customer.Email
	 }).Single();

This time LINQ to SQL will issue a 'SELECT' statement asking only for the three fields I care about because I gave it enough information before it executed the query to know specifically what I wanted.

The technique I used in the second example is called projecting (or shaping) the data. Using an anonymous type, I tell LINQ to SQL exactly what I'm interested in, and that's all it asks for. For a single customer record this likely won't make a big difference, but if I'm retrieving a list of thousands or millions of customer records, the performance gain will be significant.

One real-world scenario where I've found this technique helpful is in returning data from a WCF service. I often don't want to return entire entities or an entire object graph from my service calls, so I project the data from LINQ to SQL into Data Transfer Objects (DTOs), which I send across the wire instead. These DTOs contain only the data I need and improve the performance of my service by cutting down on the payload I'm sending across the wire.

The only downside I've found with using projections is losing the ability to use the LINQ-generated entity classes (like the Customer class for customer records). This means I lose some entity-dependent features like deferred loading of child entities and state tracking. Since I tend to use projections mostly when I'm pulling lots of records for display or analysis purposes (and not for updating data), I'm generally okay with that tradeoff.

Tip #3: Optimize Your Optimistic Concurrency Checking
It's time to invoke the hiking analogy again. Concurrency checking is another case where LINQ to SQL will automatically prepare for a 14er hike unless you tell it you're just doing a 15-minute nature walk.

Unless we tell it not to, LINQ to SQL will help us ensure that multiple users working with the same record don't overwrite each other's changes. It does this by enabling optimistic concurrency checking on all database updates by default and alerting us if it detects we're trying to update a record that has changed since we last retrieved it.

To see how LINQ to SQL does this, take a look at the entities LINQ to SQL generates for you. You'll see that each property on an entity has an UpdateCheck attribute, which has possible values of Always, Never, and WhenChanged.

By default, UpdateCheck is set to Always, which means LINQ to SQL will check the value you're saving for that property against what's in the database. (The checking comes in the form of extra qualifiers added to the 'WHERE' clause of your update query.) If they're different, LINQ to SQL assumes the record has changed since you last retrieved it and throws a concurrency exception.

Unless you need optimistic concurrency checking in your application, I recommend setting UpdateCheck to Never for the properties on your entities. This will greatly speed up your queries by eliminating all the extra checking that is done during updates. For many applications, this "last write wins" type of approach is perfectly acceptable.

If, however, you do need optimistic concurrency checking in your application, then I suggest adding a column of type timestamp to each of your tables in SQL Server that you want checked. I often do this by adding a column called "RowVersion" to my tables. Then when LINQ to SQL generates an entity class from that table, it will automatically set the UpdateCheck property to Never for all properties in the entity and use the timestamp column in the table for concurrency checking instead.

Tip #4: Keep the Number of Parameters Down
LINQ to SQL always uses parameterized queries to prevent SQL injection attacks. For example, suppose I have a Product table I wish to query for a single product.

My LINQ query might look like this:

	var query =
		from p in dataContext.Products
		where p.ProductID = 1
		select p;

And LINQ to SQL, in turn, will generate this:

SELECT [t0].[ProductID], [t0].[ProductName], [t0].[CategoryID]
FROM [Products] AS [t0]
WHERE [t0].[ProductID] = @p0

It added a variable called "@p0" which will be assigned a value of 1 when the query is run.

Now obviously this is a contrived example, and a simple query like this with a single parameter will run pretty quickly. However, I've noticed through real-world testing that the greater the number of parameters, the slower LINQ to SQL performs. This problem is especially likely to manifest itself if your LINQ query uses the Contains() method, which LINQ to SQL translates to a 'WHERE IN' clause that takes the form of 'WHERE IN (@p0, @p1, @p2, ... ).'

If you have a query that's performing slowly, check the number of parameters being generated by LINQ to SQL. Try to limit the number of parameters to around a dozen or so at most. There's no magic number, but I've seen noticeable delays in my applications as soon as I start getting a little over a dozen parameters or so in my queries.

Tip #5: Debug and Optimize Your Queries
If you're like me, the main reason you do a search on the Internet for "optimize LINQ to SQL" is because you've got a slow-performing query you want to speed up, not because you're just bored and are eager to read about the intricacies of LINQ to SQL. You may also have a user or tester on your back asking you to get it fixed because it's noticeably affecting the responsiveness of your application. If so, I can sympathize. I've been there, and I know how you feel.

As helpful as the first four tips in this article are at improving performance, there are times when the query itself is the problem, and the best way to speed it up is to re-write it or apply other standard SQL optimization techniques (like adding indexes). Of course, to know which approach to take, you need to analyze your query and find out where the "slow points" are.

If you've optimized SQL queries in SQL Server before, you've probably used SQL Server Management Studio to view the execution plan for a query and then used the plan (combined with suggestions by SQL Server) to optimize your query. The process when using LINQ to SQL is basically the same except for two extra steps. First, you must get the SQL code LINQ to SQL is generating for your query so you can analyze it. Second, if re-writing the query is required, you must coax LINQ to SQL into generating the SQL syntax you're after.

For the first step -- getting the SQL code that LINQ to SQL is generating -- you have two options. The first is to hook into LINQ to SQL's log writer (there are numerous articles on the Internet explaining how to do this) and look at the SQL code it writes to the log. The second option -- and the one I generally prefer -- is to use a free tool called LINQPad available at www.linqpad.net.

LINQPad can be thought of as a "scratch pad" for writing and testing LINQ to SQL queries (and for that matter, any type of LINQ query -- LINQ to Objects, LINQ to XML, etc.). In the simplest use case for LINQ to SQL, you create a database connection for your database and start writing LINQ to SQL expressions against it in the editor. LINQPad will automatically create the DataContext and entity classes for you behind the scenes, so none of that work is required. However, you can configure a database connection to use a custom DataContext (like one from your application) if you wish.

LINQPad also comes with tons of readily executable sample queries and some helpful utility methods -- like the Dump() method -- that are immensely helpful when debugging and optimizing queries.

As a simple example, I could write the following LINQ to SQL code in LINQPad:

var ordersByCountry =
	from o in Orders
	group o by o.ShipCountry;

var query =
	from o in ordersByCountry.Dump()
	select new
	{
		Country = o.Key,
		Orders = o
	};

query.Dump();

The Dump() method outputs the contents of whatever it's given to the Results window in LINQPad and then returns the same object, allowing me to view the output of intermediate steps in my expressions.

Also, any time you execute a LINQ to SQL query in LINQPad, the SQL code generated by LINQ is written to the SQL window and is ready to be copied and pasted into SQL Server for analysis, parameter declarations and all. Then you can analyze the query in SQL Server like you normally would and figure out what optimizations are required.

Once you know what needs to be done, then comes the next step: convincing LINQ to SQL to see things your way and generate the code you want. This process could go pretty quickly, or it might be more like convincing your wife it's a good idea for you to go to the pub with your buddies on Friday night rather than spending time at home. The good news is that regardless of how it goes, you can test variations of your LINQ code in LINQPad and look at the SQL that's generated until you're getting the exact results you want. LINQPad even includes the query execution time in the status bar so you can evaluate how your queries perform.

I can't stress enough what a wonderful tool LINQPad is for working with LINQ to SQL. Do yourself a favor and start using it to analyze and improve your queries.

Conclusion
Let's face it. LINQ to SQL is a powerful tool, and there's way too much information about it for me to squeeze into one article. However, I hope the tips I presented here are helpful and provide a good starting point when you're trying to optimize your queries.

Oh, and drop me a line if you're in Colorado and are planning to hike a 14er or head to the pub. I might like to tag along.

About the Author

Bart McDonough is the Principal Consultant at Incline Technical Group, a Microsoft consulting firm specializing in custom software development and training. He's been developing software for over ten years and works across the whole Microsoft technology stack, specializing in SharePoint, Silverlight and ASP.NET MVC solutions. He attends and speaks at .NET and SharePoint User Groups and was active in beta testing Silverlight 3 and RIA Services. He spends his free time with his wife and three kids and takes full advantage of the hiking, skiing, and world-class brewpubs available to him in Colorado. He can be contacted via www.inclinetechnical.com and doseofdotnet.wordpress.com.

comments powered by Disqus

Featured

Subscribe on YouTube