Practical .NET

Best Practices for Lazy Loading in Entity Framework

Sometimes you want child objects retrieved with the parent object, and sometimes you don't. What you NEVER want is to retrieve child objects accidentally. Here's Peter's advice on how to get the best performance when loading child objects.

When writing an entity class to use in Entity Framework, you can include navigation properties that replicate the primary-foreign key relationships in your database (also known as parent/child or master/detail relationships). You can write those relationships one of two ways. With the virtual/Overridable keyword:

Public Class Customer
  Public Overridable Property Orders As ICollection(Of SalesOrder)
End Class

or without:

Public Class Customer
  Public Property Orders As ICollection(Of SalesOrder)
End Class

The difference is that the first version enables lazy loading . . . which is probably what you don't want. The second version, which disables lazy loading, is, in my opinion, preferable.

Lazy Loading vs. Eager Loading
Entity Framework defaults to lazy loading and allows you to override it when necessary. That is, I think, a good thing -- but not enough of a good thing to qualify as the "best." Fundamentally, lazy loading means that the child objects at the end of a navigation property aren't retrieved unless you explicitly work with the navigation property in your code. If you don't need those child objects, then this is the behavior that you want because it reduces the amount of data retrieved from the database when you retrieve the parent object.

As an example, this code doesn't retrieve the Orders for a customer until the code touches the Orders property inside the loop, potentially generating a new trip to the database server for each customer whose orders are processed, which is a bad thing:

Dim custs = From cust In db.Customers
            Select cust
For Each c In custs
  If Not c.Valid Then
    ' Touching Orders here triggers a separate query for this customer
    For Each o In c.Orders
      ' ...process order
    Next
  End If
Next

If you want the child objects for every parent object then you don't want lazy loading -- you want to retrieve child objects when you retrieve the parent object. That's easy to do: Just use the Include method in your LINQ query to turn on "eager loading." Eager loading causes Entity Framework to retrieve the parents and child objects with one trip to the database server. This reduces the number of trips (at the cost of retrieving more data, of course). This code retrieves the Orders with the Customer objects by using the Include method:

' The lambda form of Include needs Imports System.Data.Entity
Dim custs = From cust In db.Customers.Include(Function(c) c.Orders)
            Select cust
For Each c In custs
   For Each o In c.Orders
      ' ...process order
   Next
Next

Because this code processes the Orders collection for every Customer object, you'll eventually need all the data, and retrieving it in one trip with eager loading gives you the best performance.

When Lazy Loading Makes Sense
To be fair, even if you do need the child objects, there are scenarios when you can get better performance without using the Include method. These scenarios occur when you need child objects only for some of the parent objects you're retrieving. In that case, lazy loading can make sense. Yes, lazy loading will generate a new trip to the database server to retrieve the child objects you want … and that's too bad. However, lazy loading will avoid retrieving child objects you don't need when you retrieve the parents.

This code, which retrieves Orders only for some Customers, could be a scenario where lazy loading will give you the best performance -- because of that, I've omitted the Include method:

Dim custs = From cust In db.Customers
            Select cust
For Each c In custs
  If Not c.Valid Then
    ' Only customers that aren't Valid cost an extra trip for their Orders
    For Each o In c.Orders
      ' ...process order
    Next
  End If
Next

Obviously, there's a balance point here where the percentage of parent objects whose children you need is high enough that eager loading (more data) will give you better performance than lazy loading (more trips). The only guaranteed way to determine that point is through load testing (and, I'd suggest, if you can't measure the difference with a wrist watch then there is no difference). My experience has been that it takes a surprisingly small percentage of parents-with-required-children to make eager loading the best choice.
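
As a rough sketch of that balance, the arithmetic can be modeled as below. It assumes one query for the parents plus one query per parent whose children are touched, and it ignores row size, network latency and caching. The module and function names are hypothetical, and nothing here calls Entity Framework:

```vbnet
Imports System

Module LoadingTradeoff
    ' Lazy loading: one query for the parents, plus one query for each
    ' parent whose children are actually touched.
    Function LazyTrips(parentsNeedingChildren As Integer) As Integer
        Return 1 + parentsNeedingChildren
    End Function

    ' Eager loading with Include: parents and children arrive in one trip.
    Function EagerTrips() As Integer
        Return 1
    End Function

    ' Child rows fetched when only some parents need their children.
    Function LazyChildRows(parentsNeedingChildren As Integer,
                           childrenPerParent As Integer) As Integer
        Return parentsNeedingChildren * childrenPerParent
    End Function

    ' Child rows fetched when every parent's children come along.
    Function EagerChildRows(parents As Integer,
                            childrenPerParent As Integer) As Integer
        Return parents * childrenPerParent
    End Function

    Sub Main()
        ' Illustrative numbers only (assumptions, not measurements).
        Dim parents As Integer = 1000
        Dim needing As Integer = 50
        Dim perParent As Integer = 10

        Console.WriteLine("Lazy:  {0} trip(s), {1} child rows",
                          LazyTrips(needing), LazyChildRows(needing, perParent))
        Console.WriteLine("Eager: {0} trip(s), {1} child rows",
                          EagerTrips(), EagerChildRows(parents, perParent))
    End Sub
End Module
```

With 1,000 customers, 50 of them needing orders, and 10 orders each, the model gives 51 trips and 500 order rows for lazy loading against 1 trip and 10,000 order rows for eager loading. Which side actually wins still has to be measured.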

The Best Practice
This sounds like you should leave lazy loading on, which is the Entity Framework default, and override it when necessary through the Include method.

I don't think that's "good enough" because, with lazy loading, you may get child objects even if you don't need them. For example, if you attempt to serialize a parent object, the serialization process will touch each of the properties on the parent object, retrieving all the "lazy loadable" children. This gives you the worst of all possible worlds: You retrieve all of the data, and you do it with multiple trips to the database.
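
One way to spot accidental trips like these, assuming Entity Framework 6 or later, is to assign a delegate to the context's Database.Log property so that every command sent to the database is echoed. The sketch below is illustrative only: the Id, CustomerId and Valid properties and the Customers DbSet are assumptions added to make the model complete, and running it needs the EntityFramework package and a reachable database:

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data.Entity
Imports System.Linq

' Hypothetical minimal model; key and foreign key names are assumptions.
Public Class SalesOrder
    Public Property Id As Integer
    Public Property CustomerId As Integer
End Class

Public Class Customer
    Public Property Id As Integer
    Public Property Valid As Boolean
    ' Overridable, so this navigation property supports lazy loading.
    Public Overridable Property Orders As ICollection(Of SalesOrder)
End Class

Public Class CustomerOrdersContext
    Inherits DbContext

    Public Property Customers As DbSet(Of Customer)
    Public Property SalesOrders As DbSet(Of SalesOrder)
End Class

Module TripLogging
    Sub Main()
        Using db As New CustomerOrdersContext()
            ' Echo every command EF sends to the database (EF 6 and later).
            db.Database.Log = AddressOf Console.WriteLine

            For Each c As Customer In db.Customers.ToList()
                ' With lazy loading on, each first touch of Orders should
                ' appear in the log as its own SELECT statement.
                Console.WriteLine("Customer {0}: {1} orders", c.Id, c.Orders.Count)
            Next
        End Using
    End Sub
End Module
```

If the log shows a SELECT against the orders table that you didn't ask for, something in your code (or in a serializer) is touching a lazy-loadable navigation property.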

In my opinion, your best choice is to turn off lazy loading by default and enable it selectively only in those entities used in the (rare) scenarios where lazy loading makes sense. Most of the time, when you want the child objects, you'll use the Include method. If necessary, even with lazy loading disabled, you can use the Load method to retrieve child objects.

Therefore, the best practice is to not use the virtual/Overridable keyword when defining navigation properties in an entity. In other words, this entity class code is, in my opinion, preferable because it doesn't support lazy loading:

Public Class Customer
  Public Property Orders As ICollection(Of SalesOrder)
End Class

When you do need the Orders objects you can add the Include method to your LINQ query. For those entities used in scenarios where lazy loading makes sense you can enable it by defining the navigation properties that need it using the Overridable/virtual keyword. And, if worse comes to worst, even with lazy loading disabled, you can still selectively load the child objects without using the Include method. Just use the Load method:

Dim custs = From cust In db.Customers
            Select cust
For Each c In custs
  If Not c.Valid Then
    ' Orders is a collection, so use Collection (Reference is for single entities)
    db.Entry(c).Collection(Function(x) x.Orders).Load()
    For Each o In c.Orders
      ' ...process order
    Next
  End If
Next

However, this code is going to look sufficiently bizarre to the next programmer, so I'd only use it if I can't add the virtual/Overridable keyword to the navigation property in the entity class.

You can also turn off lazy loading for all of the objects in your DbContext by setting the LazyLoadingEnabled property on the DbContext's Configuration property. You do that in the DbContext object's constructor, as in this example:

Public Class CustomerOrdersContext 
  Inherits DbContext 

  Public Sub New()
    Configuration.LazyLoadingEnabled = False
  End Sub
  Public Property SalesOrders As DbSet(Of SalesOrder)
End Class

Personally, I think that turning off lazy loading at the DbContext level is too aggressive because of those scenarios when lazy loading makes sense. I prefer turning lazy loading off on an entity-by-entity basis because it gives me the flexibility to turn it back on when I need it. That I'm a fundamentally lazy person and my preferred solution requires the least typing is just a happy accident.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.
