Practical .NET

Reduce Overhead When Retrieving Objects with Entity Framework

If you're only retrieving an object so that you can read its data, you can reduce your costs by turning off Entity Framework tracking. Which is exactly what you want to do in ASP.NET MVC, as long as you're careful.

I do love Entity Framework. But, because Entity Framework is a whacking big wrapper around ADO.NET, I’m also interested in reducing the overhead around using it. One option for reducing Entity Framework overhead is the AsNoTracking option, which turns off the Entity Framework change-tracking functionality for the objects you retrieve. The performance gains can be considerable, especially as the number of objects you’re retrieving increases (though you should be aware of the caveats at the end of this column).

The side effect of using AsNoTracking is that if you make changes to your entity objects and then call SaveChanges, no updates are issued to the database. Therefore, turning off tracking makes the most sense in any scenario where you retrieve data, send it to a client for processing and then, at some later point, get the data back for updating. Which is, you’ll notice, the way the typical ASP.NET MVC application works.

ASP.NET MVC Processing without Tracking
As I’ve discussed in earlier columns, a natural design pattern for ASP.NET MVC developers is the Command Query Responsibility Separation pattern (CQRS). In this pattern, you use one plan to retrieve data (often unstructured) and another plan to perform updates (usually organized around specific entities).

Therefore, turning off tracking makes a lot of sense in the Query portion of your ASP.NET MVC application. It’s only when the data comes back from the browser (or any other client) that you actually do your updating (the Command portion) and may need to have tracking turned on. Even then, as I’ve suggested in columns on improving the performance your data updates, you may not need to retrieve any objects in order to update, add or delete rows in your database. That’s a big deal because, if you’re not retrieving objects, tracking is irrelevant.

To retrieve data without tracking, you just call the AsNoTracking method from one of the collections from which you’re retrieving data. This code uses it when retrieving all the Customers in Canada:

Dim db As New OrdersEntities()
Dim cust As Customer 
cust = From c in db.Customers.AsNoTracking().Include("Orders")
Where c.Country = "Canada"
Select c

As you can see in this query, I’ve called the AsNoTracking method from my Customers collection. I could have just as easily called it from the Include method -- it doesn’t make any difference.

In the method-based syntax, the code would look like this:

cust = db.Customers.AsNoTracking.Include("SalesOrders").Where(Function(c) c.Country = "Canada")

Updates Without Tracking
When it comes to the update portion of your application, there’s nothing stopping you, of course, from turning off tracking when retrieving objects. However, the benefits of turning tracking off really increase with the number of objects retrieved and, during updates, you’re typically retrieving single objects. There might not be much to be gained here.

If you do want to turn off tracking during updates, you need to change your code … but not by much. For example, if you turn off tracking when retrieving objects for update, you’ll need to add a line of code to set the changed object’s State property before calling SaveChanges. Typical update code would look like this:

Dim cust = From c In db.Customers
           Where c.CustId = 5
           Select c
'...update any properties... .
db.Entry(cust).State = Entity.EntityState.Modified
db.SaveChanges()

If you want to Delete an object retrieved with tracking turned off, you’ll need to use similar code to set the object’s State property to EntityState.Deleted, like this:

Dim cust = From c In db.Customers
           Where c.CustId = 5
           Select c
db.Entry(cust).State = Entity.EntityState.Deleted
db.SaveChanges()

However, you can’t really call that line of code "additional" because it just replaces the code where you’d call some collection’s Remove method.

For inserts, all you need to do is set the object’s State property to EntityState.Added … unless your object has a relationship with some other entity object. With tracking turned off, it’s your responsibility to maintain those relationships by setting the relevant properties.

For example, if I create a SalesOrder, I’ll want to attach that SalesOrder to the SalesOrders collection of a Customer. With tracking turned off, I must set the CustomerId property on the SalesOrder object to the appropriate Customer object. Code to add a SalesOrder and attach it to the customer with a CustId of 5 would look like this:

Dim soNew As New SalesOrder
soNew.CustomerId = 5
//...update any other required properties...
db.Entry(soNew).State = Entity.EntityState.Added
db.SalesOrders.Add(soNew)
db.SaveChanges()

Of course, this means that you don’t need to retrieve the relevant Customer object at all, so this is really the code to use if you want to avoid retrieving objects during updates.

Again, it’s hard to call this "additional" code. My alternative would be, first, to add the code to retrieve the Customer object (not only more code, but an additional trip to the database). Then, instead of the two lines that set the State property and add my SalesOrder to the SalesOrders collection of the DbContext object, I’d have a single line that adds the SalesOrder to the Customer’s SalesOrders collection. "Net-net" (as people say) it’s probably the same amount of code.

Caveats
But this does lead to my first caveat. If I fail to set the properties that tie the objects together, I will get an error when I call SaveChanges … but only if the related column in the database is set to NOT NULL (that is, the relationship is required). This is a good thing because it lets me know that I’ve done something wrong. If, unfortunately, the related column accepts NULLs (that is, if the relationship is optional), I won’t get an error and I may not discover my problem for some time.

A more important caveat is that with tracking turned off, there is a potential for much, much worse performance. This is possible if you’re retrieving child objects shared by multiple parent objects.

For example, while I have many SalesOrders, I probably have very few shipping options and, as a result, each ShippingOption object is shared among many SalesOrders. With tracking turned on, if I retrieve two SalesOrder objects that share a single ShippingOption object, I’ll retrieve only a single ShippingOption object; with tracking turned off, I’ll retrieve two identical copies of the same ShippingOption. In the first scenario, the two SalesOrders share a single ShippingOption; in the second scenario, each SalesOrders gets its own copy of the ShippingOption.

From a performance point of view, in an online transactional system when retrieving objects, I probably don’t care: The overhead in retrieving the multiple copies of the same ShippingObject is probably negligible for the relatively few number of objects I retrieve. Basically, I’m betting that the time I save by turning off tracking will be more than the time I lose by retrieving some extra objects (assuming they exist). Fortunately, this is easy to test for: I just add AsNoTracking to my query and run my application. If things get worse, I can just back out the call to AsNoTracking.

On the other hand, if I’m making changes to those shared objects with tracking turned off, then I do care very much about this effect. Imagine, for example, that I retrieve two SalesOrders from the same customer. With AsNoTracking turned off, I have two copies of the same Customer object. If I make changes to each of the Customer objects and call SaveChanges, then the changes in one copy of the Customer object can easily overwrite the changes I made to the other copy. The good news here is that, if you’ve added the code to handle two different users retrieving the same Customer, you’ll get an error.

However, it might be safer to just leave tracking on when retrieving data for updates and gather the benefits of using AsNoTracking only with those queries retrieving data for display. Certainly, when updating, my first choice is to avoid retrieving objects at all, which avoids worrying about tracking altogether. When I do have to retrieve data to support an update, I leave tracking on. My life is complicated enough.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.

comments powered by Disqus

Featured

Subscribe on YouTube