Practical .NET
The Best Option for Selecting Related LINQ Objects
Here's the best performing option when you're retrieving the objects at the end of an entity class's navigation property, either when you only want some of the objects or when you only want them some of the time.
This is the second (or third?) option I've found for retrieving related Entity Framework objects in a LINQ query, either when you only want some of the related objects or when you only want them some of the time. The problem I'm addressing is when, for example, you want to retrieve only some of the SalesOrders for a Customer or want to retrieve SalesOrders only for some Customers.
In these scenarios, for the best possible performance, you want to reduce the data transferred by retrieving only the SalesOrders you need. And, for best performance, you also want to retrieve all the rows (both Customers and SalesOrders) in a single trip to the database. I also assume you want to achieve these goals with readable code.
The simplest, most effective way to do that is to incorporate the LINQ queries that retrieve the related rows into your Select statement.
For example, assume that you want to retrieve Customers objects with their related SalesOrders -- but only those SalesOrders with a ShipDate in the future. To achieve that, in the Select clause, you build an anonymous object with two properties: one to hold the Customer object and one to hold all of the Customer's future SalesOrders. To implement this solution all you have to do is load the SalesOrders property from a LINQ statement in the Select clause, like this:
From c In db.Customers
Select New With {.Customer = c,
.SalesOrders = (From so In c.SalesOrders
Where so.ShipDate > DateTime.Now
Select so)}
It gets better: You're not restricted to using properties tied to the object at the end of the navigation property. This query retrieves all the Customers but only retrieves SalesOrders for the Customers whose Valid property is set to True:
From c In c.Customer
Select New With {.Customer = c,
.SalesOrders = (From so In c.SalesOrders
Where c.Valid
Select so)}
The good news is that the SQL statement that's issued in both cases gives you the best possible performance, at least with SQL Server: The SQL statement joins the Customer and SalesOrder tables together while integrating the Where clauses to retrieve all (and only) the rows you want. And does it in one trip to the database.
I think it's pretty readable, too.
About the Author
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.