Practical .NET
Async Processing in EF6 and the Microsoft .NET Framework 4.5
The latest version of Entity Framework makes it easier to write asynchronous code. Here's how to write that code, and more important, where you'll actually find it useful.
There's one feature of Entity Framework (EF) 6 that's only available in the Microsoft .NET Framework 4.5 or later: asynchronous processing. In this column, I'm going to look at some typical scenarios where you can use EF asynchronous processing and show how you might use it.
First Scenario: Processing Multiple Requests in Parallel
Often, when you first display a form or a page, you need to display some initial set of data and also fill some combo boxes or dropdown lists. With asynchronous processing you can issue the request for the grid's data and, while waiting for that data to show up, go on to issue the requests for the combo box data, overlapping the requests. Even if the database server processes your requests sequentially, you're no worse off than if you submitted the requests in sequence. But, if the database server does process your requests in parallel, your users might get their data considerably earlier. Best of all, this is a scenario that doesn't add much complexity to your application. This could be your first big win in asynchronous processing.
But there's a catch: EF doesn't support processing multiple requests at the same time through the same DbContext object. If your second asynchronous request on the same DbContext instance starts before the first request finishes (and that's the whole point), you'll get an error message that your request is processing against an open DataReader. I'll look at some ways to handle this issue after looking at how to issue overlapping asynchronous requests.
A little pre-retrieval work is required. First, you'll need an Imports or using statement for System.Data.Entity to pick up the asynchronous extension methods that EF6 supplies. You should also disable the controls you're filling so the user can't interact with them while you're loading data.
Managing Overlapping Retrievals
Now you're ready to write the LINQ statement that will retrieve the grid data (I'm assuming that's the query that will take the longest, so it should be started first). To force the LINQ statement to actually issue the request to the database server (and to convert the results into a List so the grid can use it), you'd normally call the ToList method. With EF6, however, you can use the ToListAsync method, which retrieves the data and converts it to a List without blocking the calling thread. If you just want to retrieve the data but don't want to convert the result to a List, you can use the Load or LoadAsync methods.
The ToListAsync method returns a Task object, so your initial code might look like this:
Dim GridTask As Task
GridTask = (From cust In ctx.Customers
Select cust).ToListAsync()
When that request returns, I want to take the List and pass it to the grid. To have that happen as soon as the request completes, I use the ContinueWith method. The ContinueWith method accepts a lambda expression, passing the Task object from the ToListAsync method to the expression (the ContinueWith method also returns a Task object so I'll still end up with a Task object in my GridTask variable). The Task object passed to the ContinueWith method has a Result property that holds the output of the ToListAsync method: in this case, that's my List of Customers entities.
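To see the ContinueWith and Result mechanics in isolation, here is a minimal console sketch. It assumes nothing about EF: FakeCustomersAsync is a hypothetical stand-in for the ToListAsync call, and the module and function names are made up for illustration.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module ContinueWithSketch
    ' Hypothetical stand-in for
    ' (From cust In ctx.Customers Select cust).ToListAsync()
    Async Function FakeCustomersAsync() As Task(Of List(Of String))
        Await Task.Delay(50)
        Return New List(Of String) From {"Ann", "Bob"}
    End Function

    ' The continuation receives the finished Task and reads its Result.
    ' ContinueWith itself returns a new Task for the continuation.
    Function CountCustomersAsync() As Task(Of Integer)
        Return FakeCustomersAsync().ContinueWith(
            Function(tk As Task(Of List(Of String))) tk.Result.Count)
    End Function

    Sub Main()
        Dim countTask = CountCustomersAsync()
        countTask.Wait()
        Console.WriteLine(countTask.Result)
    End Sub
End Module
```

The lambda here is a Function, so the Task returned by ContinueWith carries a result of its own. With a Sub lambda, as in the column's code, ContinueWith returns a plain Task.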
Unfortunately, I can't just update the grid in my form with that List, because the grid is running on my application's default thread and my asynchronous query is not. Sadly, you can't just naively pass data across threads. Fortunately, the .NET Framework provides the Dispatcher class, which can pass data across threads through its BeginInvoke method. The BeginInvoke method accepts a lambda expression that allows you to specify what you want done. In my case, I want to update the grid with the List and then enable the grid.
Integrating ContinueWith and the Dispatcher into my code gives this:
GridTask = (From cust In ctx.Customers
            Select cust).ToListAsync().
  ContinueWith(
    Sub(tk)
      Dispatcher.BeginInvoke(
        Sub()
          Me.CustomersDataGrid.ItemsSource = tk.Result
          Me.CustomersDataGrid.IsEnabled = True
        End Sub)
    End Sub)
After writing this block of code, you can add similar blocks for the other controls on the form. To prevent the "open DataReader problem," each request will require its own instance of the context object:
Dim ctxCombo As New SalesorderContext
ComboTask = (From cust In ctxCombo.Customers
Select cust.Id).ToListAsync().
ContinueWith(...
Depending on how long it takes to instantiate each of these context objects, you could lose whatever benefits asynchronous processing is giving you. You might want to consider defining multiple DbContext classes in your application, with each DbContext class containing just the entity you need to fill one or more of your dropdown lists.
In a Web application, you don't want to send the page to the browser until all your asynchronous requests have completed. So, before sending the page to the browser, you need to call the Wait method on each Task object returned from your requests. That ensures your method doesn't complete until all of your requests do:
GridTask.Wait()
ComboTask.Wait()
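If you have several Task objects to wait on, Task.WaitAll does the same job in one call. The console sketch below shows the overlap-then-wait pattern under stated assumptions: FakeQueryAsync is a hypothetical stand-in for an EF6 query (it just delays and returns two rows), and the delays are arbitrary.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Threading.Tasks

Module OverlapSketch
    ' Hypothetical stand-in for an EF6 query ending in ToListAsync():
    ' waits delayMs milliseconds, then returns two rows.
    Async Function FakeQueryAsync(name As String,
                                  delayMs As Integer) As Task(Of List(Of String))
        Await Task.Delay(delayMs)
        Return New List(Of String) From {name & "-1", name & "-2"}
    End Function

    Sub Main()
        ' Start the slower request first, then the second, so they overlap.
        Dim gridTask = FakeQueryAsync("grid", 300)
        Dim comboTask = FakeQueryAsync("combo", 100)

        ' Blocks until both have finished, like calling Wait on each in turn.
        Task.WaitAll(gridTask, comboTask)

        Console.WriteLine(gridTask.Result.Count + comboTask.Result.Count)
    End Sub
End Module
```

Because the two requests overlap, the total wait is roughly the longer of the two delays rather than their sum.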
Second Scenario: Background Processing
Once you get your data, you need to process the individual objects that are returned. If you do this in a For Each loop, the user will have to wait until you've finished with all the items, in sequence, before being able to play with your UI. If you use the EF6 ForEachAsync method, the user may not have to wait as long to use your UI because the loop may be processed on a background thread (whether you get any parallel processing will depend on things beyond your control -- the number of available cores in the user's processor, for instance). Unlike ToListAsync, however, the Task object returned by ForEachAsync doesn't have a Result property that gives you access to the collection it's processing.
This example updates each item returned by the query, and once processing is complete, updates the database with the results. The context object used in this processing will be tied up until the process is complete, so if you think you might need to make another EF request before this one finishes, you should use a dedicated context instance, as this example does:
Dim bgCtx As New SalesorderContext
LoopTask = (From cust In bgCtx.Customers
            Select cust).
  ForEachAsync(
    Sub(cust As Customer)
      cust.LastName = "Smith"
    End Sub).
  ContinueWith(
    Sub(tk)
      ' Save only if the loop ran to completion
      If Not tk.IsFaulted AndAlso Not tk.IsCanceled Then
        bgCtx.SaveChanges()
      End If
    End Sub)
In my ContinueWith method I could also have updated my grid using the Dispatcher class.
In server-side Web processing, because you typically don't want to return the page until processing is complete, this technique may not be useful unless you have multiple, independent updates whose processing you can overlap. And AJAX requests are made asynchronously by default, of course, often eliminating the need to write asynchronous server-side code.
In a desktop application, on the other hand, if there's something else the user can do while your ForEachAsync is processing, this could be a useful technique, even if you don't have multiple updates to make.
Canceling Processing
Once you're doing updates asynchronously, you need to consider what you'll do if the user wants to exit the application before your updates are complete. First, of course, you have to determine if any of your asynchronous tasks are still executing. The Task objects returned by the async methods will let you determine if the related task is complete. This code in a Windows Presentation Foundation (WPF) application keeps the user from closing the window until the processing associated with the Task object in the LoopTask variable is complete:
Private LoopTask As Task
Private Sub MainWindow_Closing(sender As Object,
    e As ComponentModel.CancelEventArgs) Handles Me.Closing
  ' LoopTask is Nothing until an asynchronous operation has been started
  If LoopTask IsNot Nothing AndAlso Not LoopTask.IsCompleted Then
    MessageBox.Show("Please wait for processing to complete")
    e.Cancel = True
  Else
    e.Cancel = False
  End If
End Sub
The LoopTask variable must be declared at the class level so that it can be shared between whichever method generated the Task object and the Closing event.
You can offer the user the ability to cancel asynchronous processing by passing a cancellation token to the async method. The first step is to instantiate a CancellationTokenSource for each Task (or group of Tasks) you want to cancel (again, declared at the class level):
Private LoopTokenSource As New CancellationTokenSource
Then you must pass the Token from that CancellationTokenSource to the async method (all the async methods accept cancellation tokens). The ForEachAsync method accepts a cancellation token as its second parameter, so a "cancellable" version of my ForEachAsync code looks like this:
LoopTask = (From cust In bgCtx.Customers
            Select cust).ForEachAsync(
  Sub(cust As Customer)
    ...omitted code...
  End Sub, LoopTokenSource.Token)
Now, in the Click event of some Cancel button, you can call the CancellationTokenSource Cancel method to let the user cancel your asynchronous processing:
LoopTokenSource.Cancel()
If you want to allow the user to try the task again, you'll need to assign a new CancellationTokenSource to your variable (a source that has been canceled can't be reset), either after calling the Cancel method or just before passing its Token to the async method. If the Cancel method is called while ADO.NET processing is taking place, an exception may be raised. To handle that you'll need to wrap the code in your lambda expressions in a Try...Catch block.
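The cancel-and-retry cycle can be sketched in a console program. This is only an illustration under stated assumptions: FakeLoopAsync is a hypothetical stand-in for a cancellable EF6 call, CancelAfter plays the part of the user clicking Cancel, and the exception is caught where the Task is waited on rather than inside a lambda expression.

```vb
Imports System
Imports System.Threading
Imports System.Threading.Tasks

Module CancelSketch
    Private LoopTokenSource As New CancellationTokenSource

    ' Hypothetical stand-in for a cancellable call such as
    ' ForEachAsync(action, token): a long delay that honors the token.
    Function FakeLoopAsync(token As CancellationToken) As Task
        Return Task.Delay(5000, token)
    End Function

    ' Runs one attempt and reports how it ended.
    Function RunAndDescribe(cancelAfterMs As Integer) As String
        ' A canceled source can't be reused, so create a fresh one per attempt.
        LoopTokenSource = New CancellationTokenSource
        Dim loopTask = FakeLoopAsync(LoopTokenSource.Token)
        LoopTokenSource.CancelAfter(cancelAfterMs) ' stands in for the Cancel button
        Try
            loopTask.Wait()
            Return "Completed"
        Catch ex As AggregateException
            ' Wait wraps the cancellation (or any failure) in an AggregateException.
            If loopTask.IsCanceled Then Return "Canceled"
            Return "Faulted"
        End Try
    End Function

    Sub Main()
        Console.WriteLine(RunAndDescribe(50)) ' first attempt
        Console.WriteLine(RunAndDescribe(50)) ' retry with a new token source
    End Sub
End Module
```

Both attempts end as canceled. Without the new CancellationTokenSource, the retry would be handed a token that is already canceled and would never start its work.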
Third Scenario: Saving Changes
SaveChangesAsync sends all the updates known to the DbContext object to the database without blocking the calling thread. For instance, as a user makes changes to rows in the grid in a desktop application, you might call SaveChangesAsync as the user leaves each row. Unlike SaveChanges, which would force the user to wait until the update was complete before starting work on the next row, SaveChangesAsync should let the user start working on the next row immediately.
Because you need a single context object to manage all the rows in the grid, you can't have overlapping SaveChanges operations running and you can't use multiple context objects. But unless your user is making changes to every row (and, even then, only if your user is very quick), it will probably be an unusual case when one row's changes are still being processed when it comes time to save the next row's changes. Code like this in a WPF DataGrid RowEditEnding method would cause the user to wait, on the odd occasion that happens:
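Dim SaveTask As Task
Private Sub CustomersDataGrid_RowEditEnding(sender As Object,
    e As DataGridRowEditEndingEventArgs) Handles CustomersDataGrid.RowEditEnding
  ' Wait for any save that's still in progress before starting the next one
  If SaveTask IsNot Nothing Then
    SaveTask.Wait()
  End If
  SaveTask = ctx.SaveChangesAsync()
End Sub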
This code does assume your changes are always saved successfully. You'll need to consider how you'll handle errors if, for instance, you're allowing your user to page forward in the grid after making a change.
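One way to surface a failed save is to catch the exception when you wait on the previous Task. The console sketch below is a minimal illustration, not the column's method: FakeSaveAsync is a hypothetical stand-in for ctx.SaveChangesAsync that fails on demand, and SaveRow is an invented helper that mirrors the RowEditEnding logic.

```vb
Imports System
Imports System.Threading.Tasks

Module SaveSketch
    Private SaveTask As Task

    ' Hypothetical stand-in for ctx.SaveChangesAsync(); fails when asked to.
    Async Function FakeSaveAsync(shouldFail As Boolean) As Task(Of Integer)
        Await Task.Delay(20)
        If shouldFail Then Throw New InvalidOperationException("save failed")
        Return 1
    End Function

    ' Waits for the previous save, returns its error message (or Nothing),
    ' then starts the next save.
    Function SaveRow(shouldFail As Boolean) As String
        Dim previousError As String = Nothing
        If SaveTask IsNot Nothing Then
            Try
                SaveTask.Wait()
            Catch ex As AggregateException
                ' Wait wraps the original exception in an AggregateException.
                previousError = ex.InnerException.Message
            End Try
        End If
        SaveTask = FakeSaveAsync(shouldFail)
        Return previousError
    End Function

    Sub Main()
        Console.WriteLine(If(SaveRow(True), "no earlier error"))
        Console.WriteLine(If(SaveRow(False), "no earlier error"))
        Console.WriteLine(If(SaveRow(False), "no earlier error"))
    End Sub
End Module
```

The failure from the first save only shows up on the second call, which is the point: with this pattern, an error is reported one row late, so the UI has to be able to point the user back to the row that failed.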
But Wait! There's More
There are about 18 extension methods that can be used on LINQ collections that support asynchronous processing, including asynchronous versions of popular methods such as First, Count and Any. There's even one async method that can be called directly from a DbSet (the collection properties on your DbContext): FindAsync. This example finds the Customer object with a primary key value of 4, and displays the value of that Customer's FirstName property:
Dim ctxFind As New SalesorderContext
ctxFind.Customers.FindAsync(4).
  ContinueWith(
    Sub(tk)
      MessageBox.Show(tk.Result.FirstName)
    End Sub)
But, as with the previous methods I've discussed, the tough part of asynchronous processing is distinguishing between where it will make your application better and where it will just add complexity without value. The new EF async methods simplify implementing those solutions -- but it's still your job to figure out when to use them.
About the Author
Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.