Efficiently Loading Related Objects in Entity Framework
If you're not using the Entity Framework DbCollectionEntry object when working with an entity class's related objects, then your application is running too slow. Using DbCollectionEntry lets you asynchronously retrieve related objects and get only the objects you want.
In an earlier column on lazy loading in Entity Framework (EF), I implicitly endorsed "explicit loading" as a preferable alternative to lazy loading. With explicit loading, when you want to retrieve the objects at the end of a navigation property, you should do three things:
- Retrieve the object with the navigation property
- Check to see if the navigation property is loaded
- Issue a line of code to retrieve the related objects if they're not there
That code might look something like this:
Dim custs = (From c In db.Customers
For Each cust As Customer In custs
If Not db.Entry(cust).Collection(Function(c) c.SalesOrders).IsLoaded Then
(For a discussion of why I'm using a ToList here, see my tip from earlier this month.)
If you compare this code to what I used in that earlier column, you'll see that I've switched to using the Collection method (my earlier code used a method called Reference). I did that for two reasons. One reason is that using Collection is compatible with EF Core 1.1 and using Reference is not. This might not be something that you worry about.
The other reason I switched is that this code gives me access to the EF DbCollectionEntry object which is a change worth making: I can use DbCollection to speed up my application whenever I use explicit loading. Before you get excited about that, though, let me put in a caveat: If you are going to work with one entity class's related objects (those objects at the end of a navigation property), then your first choice should be to use the Include keyword/method when retrieving your objects (I discussed why in that article on lazy loading).
But if I'm retrieving Customer objects and only occasionally using the objects in the SalesOrders collection, then I might want to use explicit loading. And if I'm using explicit loading, I want to use the DbCollectionEntry object.
The first reason to use the DbCollectionEntry object is that it lets me load the collection asynchronously.
To do that, I first have to get to the DbCollectionEntry object. That's easy: I just need to catch the object returned by the Collection method in my earlier code. I can also catch the DbEntityEntry object that returned by the Entry method. If I catch both my objects, I can simplify my earlier code to something like this:
Dim custEntity As DbEntityEntry(Of Customer)
custEntity = db.Entry(cust)
Dim salesOrderEntity As DbCollectionEntry
salesOrderEntity = custEntity.Collection(Function(c) c.SalesOrders)
If Not salesOrdersEntity.IsLoaded Then
But now that I have the dbCollectionEntry object, I probably don't want to use the Load method. Instead, I can improve the responsiveness of my application by using the LoadAsync method to have my collection objects loaded in parallel. This does assume that I'm willing to wait a while before using the SalesOrder objects, though.
My first step in this process is to call the LoadAsync method as soon as I can determine that I'll need the SalesOrders. In that code, I'll also catch the Task object that LoadAsync returns. I would use this code as soon as I retrieve a Customer where I intend to work with the SalesOrders collection:
If cust.IsValid Then
Dim asyncOrders As Task
asyncOrders = Nothing
If Not salesOrdersEntity.IsLoaded Then
asyncOrders = custEntity.Collection(Function(c) c.SalesOrders).LoadAsync()
'...work with the Customer object...
As that final comment suggests, now that I've kicked off the process of retrieving the SalesOrders, I'll continue to work with my Customer object while the SalesOrders collection is loading asynchronously.
When I run out of things to do with the Customer object and have to start working with the SalesOrders collection, I need to do three things:
- Check if I needed to load the SalesOrders collection. If my asyncOrders Task object is set to Nothing, then I didn't call LoadAsync (presumably because the SalesOrder collection was already loaded)
- If asyncOrders is not Nothing, I need to check the Task's IsCompleted property to find out if the SalesOrders collection has finished loading (technically, I should also check the IsFaulted and IsCancelled properties)
- If asyncOrders is not Nothing and not completed, then I need to wait for the collection to load. I can do that by calling the Task object's Wait method (optionally, I can pass the Wait method a timeout value if I'm worried that the collection might take "too long" to load)
Putting all that together, I insert this code just before the code what works with the SalesOrders collection:
If asyncOrders <> Nothing AndAlso
Not asyncOrders.IsCompleted Then
'...work with SalesOrders...
Loading Just the Objects You Want
While the LoadAsync method will make my application more responsive, it won't reduce the amount of work being done. One way to reduce the work being done (and, as a result, speed up my application) is to retrieve only the objects in the collection that I need. I tackled this problem earlier using the LINQ SQL-like syntax. However, once you have access to the DbCollectionEntry object, you have another option: the Query method.
The DbCollectionEntry object's Query method returns the LINQ query that's used to retrieve the items in the collection. Like any other LINQ query, you can modify this query by calling an extension method from it. For example, if you only want the First object in the collection, you could write this code:
so = custEntity.Collection(Function(c) c.SalesOrders).
A more likely scenario is that you want to retrieve some, but not all, SalesOrders. This code retrieves just the SalesOrders with a shipping date in the future:
var sos = custEntity.Collection(Function(c) c.SalesOrders).
Where(s => s.ShipDate > DateTime.Now)
Sometimes you don't need the whole SalesOrder. This query would return the date of the SalesOrder with the latest ShipDate:
decimal MaxValue = custEntity.Collection(Function(c) c.SalesOrders).
Max(s => s.ShipDate)
Of course, if you want the SalesOrder with the latest ShipDate, you can just sort the SalesOrders in descending order by ShipDate and take the first SalesOrder. It's just a matter of stringing together more extension methods:
so = custEntity.Collection(c => c.SalesOrders).
OrderByDescending(s => s.ShipDate).
If you're concerned that this code is going to bring all the SalesOrder objects over to your application and then sort them, don't be -- if you look at the SQL statement being issued by Entity Framework you can see that only one SalesOrder is returned from the database server.
As I said, if you're selectively retrieving collections at the end of your entity class's navigation properties and not using the DbCollectionEntry object, then you're almost going out of your way to make your code run slower.
And that would be bad.
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.