Practical .NET

Efficiently Loading Related Objects in Entity Framework

If you're not using the Entity Framework DbCollectionEntry object when working with an entity class's related objects, then your application is running too slow. Using DbCollectionEntry lets you asynchronously retrieve related objects and get only the objects you want.

In an earlier column on lazy loading in Entity Framework (EF), I implicitly endorsed "explicit loading" as a preferable alternative to lazy loading. With explicit loading, when you want to retrieve the objects at the end of a navigation property, you should do three things:

  1. Retrieve the object with the navigation property
  2. Check to see if the navigation property is loaded
  3. Issue a line of code to retrieve the related objects if they're not there

That code might look something like this:

Dim custs = (From c In db.Customers
             Select c).ToList()
For Each cust As Customer In custs         
  If Not db.Entry(cust).Collection(Function(c) c.SalesOrders).IsLoaded Then
         db.Entry(cust).Collection(Function(c) c.SalesOrders).Load()
  End If
Next

(For a discussion of why I'm using a ToList here, see my tip from earlier this month.)

If you compare this code to what I used in that earlier column, you'll see that I've switched to using the Collection method (my earlier code used a method called Reference). I did that for two reasons. One reason is that using Collection is compatible with EF Core 1.1 and using Reference is not. This might not be something that you worry about.

The other reason I switched is that this code gives me access to the EF DbCollectionEntry object which is a change worth making: I can use DbCollection to speed up my application whenever I use explicit loading. Before you get excited about that, though, let me put in a caveat: If you are going to work with one entity class's related objects (those objects at the end of a navigation property), then your first choice should be to use the Include keyword/method when retrieving your objects (I discussed why in that article on lazy loading).

But if I'm retrieving Customer objects and only occasionally using the objects in the SalesOrders collection, then I might want to use explicit loading. And if I'm using explicit loading, I want to use the DbCollectionEntry object.

Asynchronous Loading
The first reason to use the DbCollectionEntry object is that it lets me load the collection asynchronously.

To do that, I first have to get to the DbCollectionEntry object. That's easy: I just need to catch the object returned by the Collection method in my earlier code. I can also catch the DbEntityEntry object that returned by the Entry method. If I catch both my objects, I can simplify my earlier code to something like this:

Dim custEntity As DbEntityEntry(Of Customer)
custEntity = db.Entry(cust)
Dim salesOrderEntity As DbCollectionEntry
salesOrderEntity = custEntity.Collection(Function(c) c.SalesOrders)

If Not salesOrdersEntity.IsLoaded Then
  salesOrdersEntity.Load();
End If

But now that I have the dbCollectionEntry object, I probably don't want to use the Load method. Instead, I can improve the responsiveness of my application by using the LoadAsync method to have my collection objects loaded in parallel. This does assume that I'm willing to wait a while before using the SalesOrder objects, though.

My first step in this process is to call the LoadAsync method as soon as I can determine that I'll need the SalesOrders. In that code, I'll also catch the Task object that LoadAsync returns. I would use this code as soon as I retrieve a Customer where I intend to work with the SalesOrders collection:

If cust.IsValid Then
  Dim asyncOrders As Task
  asyncOrders = Nothing
  If Not salesOrdersEntity.IsLoaded Then
    asyncOrders = custEntity.Collection(Function(c) c.SalesOrders).LoadAsync()
  End If
End If
'...work with the Customer object...

As that final comment suggests, now that I've kicked off the process of retrieving the SalesOrders, I'll continue to work with my Customer object while the SalesOrders collection is loading asynchronously.

When I run out of things to do with the Customer object and have to start working with the SalesOrders collection, I need to do three things:

  1. Check if I needed to load the SalesOrders collection. If my asyncOrders Task object is set to Nothing, then I didn't call LoadAsync (presumably because the SalesOrder collection was already loaded)
  2. If asyncOrders is not Nothing, I need to check the Task's IsCompleted property to find out if the SalesOrders collection has finished loading (technically, I should also check the IsFaulted and IsCancelled properties)
  3. If asyncOrders is not Nothing and not completed, then I need to wait for the collection to load. I can do that by calling the Task object's Wait method (optionally, I can pass the Wait method a timeout value if I'm worried that the collection might take "too long" to load)

Putting all that together, I insert this code just before the code what works with the SalesOrders collection:

If asyncOrders <> Nothing AndAlso
   Not asyncOrders.IsCompleted Then
  asyncOrders.Wait();
End If
'...work with SalesOrders...

Loading Just the Objects You Want
While the LoadAsync method will make my application more responsive, it won't reduce the amount of work being done. One way to reduce the work being done (and, as a result, speed up my application) is to retrieve only the objects in the collection that I need. I tackled this problem earlier using the LINQ SQL-like syntax. However, once you have access to the DbCollectionEntry object, you have another option: the Query method.

The DbCollectionEntry object's Query method returns the LINQ query that's used to retrieve the items in the collection. Like any other LINQ query, you can modify this query by calling an extension method from it. For example, if you only want the First object in the collection, you could write this code:

SalesOrder so
so = custEntity.Collection(Function(c) c.SalesOrders).
 Query().
 First()

A more likely scenario is that you want to retrieve some, but not all, SalesOrders. This code retrieves just the SalesOrders with a shipping date in the future:

var sos = custEntity.Collection(Function(c) c.SalesOrders).
      Query().
      Where(s => s.ShipDate > DateTime.Now)

Sometimes you don't need the whole SalesOrder. This query would return the date of the SalesOrder with the latest ShipDate:

decimal MaxValue = custEntity.Collection(Function(c) c.SalesOrders).
Query().
Max(s => s.ShipDate)

Of course, if you want the SalesOrder with the latest ShipDate, you can just sort the SalesOrders in descending order by ShipDate and take the first SalesOrder. It's just a matter of stringing together more extension methods:

so = custEntity.Collection(c => c.SalesOrders).
 Query().
 OrderByDescending(s => s.ShipDate).
 First();

If you're concerned that this code is going to bring all the SalesOrder objects over to your application and then sort them, don't be -- if you look at the SQL statement being issued by Entity Framework you can see that only one SalesOrder is returned from the database server.

As I said, if you're selectively retrieving collections at the end of your entity class's navigation properties and not using the DbCollectionEntry object, then you're almost going out of your way to make your code run slower.

And that would be bad.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.

comments powered by Disqus

Featured

  • Creating Reactive Applications in .NET

    In modern applications, data is being retrieved in asynchronous, real-time streams, as traditional pull requests where the clients asks for data from the server are becoming a thing of the past.

  • AI for GitHub Collaboration? Maybe Not So Much

    No doubt GitHub Copilot has been a boon for developers, but AI might not be the best tool for collaboration, according to developers weighing in on a recent social media post from the GitHub team.

  • Visual Studio 2022 Getting VS Code 'Command Palette' Equivalent

    As any Visual Studio Code user knows, the editor's command palette is a powerful tool for getting things done quickly, without having to navigate through menus and dialogs. Now, we learn how an equivalent is coming for Microsoft's flagship Visual Studio IDE, invoked by the same familiar Ctrl+Shift+P keyboard shortcut.

  • .NET 9 Preview 3: 'I've Been Waiting 9 Years for This API!'

    Microsoft's third preview of .NET 9 sees a lot of minor tweaks and fixes with no earth-shaking new functionality, but little things can be important to individual developers.

  • Data Anomaly Detection Using a Neural Autoencoder with C#

    Dr. James McCaffrey of Microsoft Research tackles the process of examining a set of source data to find data items that are different in some way from the majority of the source items.

Subscribe on YouTube