Practical .NET

Simplifying Data Retrieval with CQRS in ASP.NET MVC

Data retrieval and updates are very different activities so the CQRS pattern says that the smart thing to do is treat them differently. Here's a variety of solutions for the retrieval side in an ASP.NET MVC application.

Some applications are simple: You just have to add, remove, update, and delete some records (perhaps edit the input data and wrap some business application logic around the database access). Other applications are, well, more complicated. What's common to all of these applications is that they first retrieve data and then update data.

What's interesting about data retrieval and update is that, regardless how complicated or simple the application is, the data retrieval part of the application is very different from the data update part.

The Difference Between Updates and Retrieval
On the retrieval side (with the possible exception of a screen for maintaining a lookup table), the data being retrieved is unstructured. At the very least, for example, the retrieved data combines parts of several different entities. A sales order update screen, for example, will typically include information from the customer, customer address, sales order header, sales order items and product entities. To make it more interesting, all of that data needs to be retrieved and delivered to the user as soon as possible.

Things are different on the update side. To begin with, while all the necessary data must be retrieved, only some of the data will be eligible for update. On my sample sales order screen, for example, much of the data will be read-only: The customer data, the product information, and much of the sales order header information will be read-only. While the user might be allowed to select among existing addresses or create a new address, existing address information will probably also be read-only.

Updates, therefore, are going to be targeted toward specific entities and, even then, only some changes will be permitted. For my sales order screen, updates are probably restricted to creating a new address, adding a new sales order item, modifying the quantity on an existing sales order item, deleting a sales order item, and updating a limited number of properties on the sales order header. Furthermore, it would be an unusual transaction that did all of those activities, unlike data retrieval where everything must be retrieved every time.

Another difference between the retrieval and the update side is immediacy: While all the data has to be retrieved and delivered to the user as soon as possible, not all of these updates made by the user need to be committed immediately. A new shipping address really isn't relevant until the order is shipped, for example, so updates to the customer address table could be deferred until later. Obviously, the customer's credit card will need to be billed … but, again, not until shipment. Various discounts and special offers may need to be applied and customer preferences updated … but, again, not right away.

With all of these differences, building a single system/call stack to handle both the retrieval (Query) and update (Commands) seems distinctly odd. More important, building a single system that would handle two such different processes efficiently would be extremely difficult to do. This is the essence of the Command Query Responsibility Separation (CQRS) pattern: You can simplify your application by creating two separate processes, each of which does one of these processes well.

In this column, I'm going to look at how simple I can make the retrieval side in an ASP.NET MVC application by not worrying about updates.

Simplest Possible Retrieval
In an ASP.NET MVC application one of the responsibilities of the Controller is to build the ModelView object that will be passed to the View. That ModelView object is almost certainly going to hold a bunch of unstructured data for the View that won't correspond to any single entity. Typically, then, that ModelView object is a kind of Data Transfer Object (DTO) that exists just long enough to get the data out of the data source and into the page's HTML.

Which raises the question of where that DTO should be built. My first choice is to make the Controller Action methods responsible for building the View DTOs. The simplest solution is for the Controller to directly access the entity model, retrieve the entity objects required and load them into a DTO without modification. In that scenario, the DTO might look this:

Public Class SalesOrderDTO
  Public Property soCustomer As Customer
  Public Property soAddresses As List(of CustomerAddress)
  Public Property soHeader As SalesOrderHeader
  Public Property soItems As List(of SalesOrderItem)
  Public Property soProducts As List(of Product)
End Class

I've ignored any lookup tables required to decode properties in these objects in this query, but I'll come back to them.

I've assumed that none of these objects have hundreds of properties or are holding BLOBs. Because of that last assumption I'm comfortable with retrieving all of the entity object, even though the View might require only some of any entity's properties (I've discussed techniques for limiting the columns to be retrieved if my assumption isn't true here and here).

The code to load that SalesOrderDTO in the Controller might look like Listing 1.

Listing 1: Initial Code for an Action Method To Retrieve Data for a View
Dim soDTO As SalesOrderDTO
soDTO = New SalesOrderDTO

Using db = New SalesOrderEntities
  soDTO.soCustomer = (From c In db.Customers
                      Where c.Id = custId
                      Select c).FirstOrDefault
  soDTO.soHeader = (From so In db.SalesOrderHeaders
                    Where so.Id = salesOrderId
                    Select so).FirstOrDefault
  ...more properties loaded...
End Using
Return View(soDTO)

While the code in Listing 1 is simple, there's quite a lot of it. Furthermore, from a performance point of view, that's an awful lot of trips to the database. I can address both of those issues by exploiting the navigation properties in the various entities that retrieve the related items.

If I take advantage of the navigation properties, the DTO shrinks to this:

Public Class SalesOrderDTO
  Public Property soHeader As SalesOrderHeader
End Class

This, of course, assumes the existence of navigation properties that link the SalesOrderHeader to both SalesOrderItems and Customers (and Customers to CustomerAddresses and so on). That's a safe assumption because, where the appropriate foreign/primary key relations don't exist in the database, I can add any additional navigation properties I need to my entity model. I can also use this feature to handle my lookup tables.

The code to load the SalesOrderDTO now shrinks to the single LINQ statement in Listing 2, which has a lot of calls to the Include method to ensure that I retrieve all of the data on one trip to the database (I've included a reference to a State lookup column to show how lookup tables would be handled).

Listing 2: Simpler Code for an Action Method To Retrieve Data for a View
Dim db As SalesOrderEntities
Dim soDTO As SalesOrderDTO

Using db = New SalesOrderEntities
  soDTO = New SalesOrderDTO
  soDTO.soHeader = (From so In db.SalesOrderHeaders _
     .Include(Function(s) s.Customer) _
     .Include(Function(s) s.Customer.CustomerAddresses) _
     .Include(Function(s) s.Customer.CustomerAddresses.Select(Function(ca) ca.State)) _
     .Include(Function(s) s.SalesOrderItems) _
     .Include(Function(s) s.SalesOrderItems.Select(Function(si) si.Product))
    Where so.Id = salesOrderId
                   Select so).FirstOrDefault
End Using
Return View(soDTO)	

In reality, I don't need all of those Includes: the Include for loading the CustomerAddresses entity through Customer will automatically load the Customer entity, for example. Furthermore, at this point, you have to wonder why I need the DTO at all -- I can simply pass the SalesOrderHeader (with all of its children) to the View. As a result, the DTO can now go away. With those changes, I'm left with this code:

Using db = New SalesOrderEntities
  soDTO = (From so In db.SalesOrderHeaders _
           .Include(Function(s) s.Customer.CustomerAddresses.Select(Function(ca) ca.State)) _
           .Include(Function(s) s.SalesOrderItems.Select(Function(si) si.Product))
           Where so.Id = salesOrderId
           Select so).FirstOrDefault
End Using
Return View(soDTO)

Including Data Retrieval Business Rules
That's probably too simple a solution, though. There are almost certainly business rules that control, for example, which SalesOrders may and may not be used. Rather than clutter up my Controller with that logic, a better design would be to create a layer between the Controller and the data source to implement those rules (in fact, even if those rules don't exist right now, you'd probably be smart to include that layer just to protect yourself in case those rules appear later).

Taking that into account, my LINQ statement in the Controller needs to call a method on some repository class that implements those business rules. In my controller, I just replace my DbContext class with my repository class. That code would look like this:

Using db = New SalesOrderRepository
  Dim soDTO = (From so In SalesOrderRepository.SalesOrders
.Include(Function(s) s.Customer.CustomerAddresses.Select(Function(ca) ca.State)) _
.Include(Function(s) s.SalesOrderItems.Select(Function(si) si.Product))                  si.Product))
               Where s.Id = salesOrderId
               Select s).FirstOrDefault
End Using
Return View(soDTO)

The repository class and its method might look something like Listing 3.

Listing 3: A Sample Repository Class
Public Class SalesOrderRepository
    Implements IDisposable
  Private db As SalesOrderEntities

  Public Sub New()
    db = New SalesOrderEntities
  End Sub

  Public Function SalesOrders() As IQueryable(Of SalesOrderHeader)
    Return From so In db.SalesOrderHeaders
           Where so.Valid = "Y"
           Select so
  End Function
 
 Private Sub IDisposable_Dispose() Implements IDisposable.Dispose
   db.dispose
 End Sub
End Class

Now, as the business rules for what SalesOrders can be used change, I just update the SalesOrders method in my repository.

If piling LINQ queries on top of LINQ queries sounds like a performance problem in the making, it's not. At least, it's not as long as the methods in the repository return an IQueryable result (and don't use ToList). If those two rules are followed, whatever LINQ query is used in the repository method will be merged with my Controller's LINQ query to create a single SQL statement to be sent to the database (effectively, with these rules, LINQ queries are "composable").

But now I have to ask where my SalesOrderRepository should live: In my application or in some middle-layer business object? My first choice is in my application's Models folder. However, the rules that control which SalesOrders are valid might be useful in multiple applications. If that's the case then SalesOrderRepository should go in a class library that can be shared with other applications.

If the business rules I need to apply are a combination of application-specific rules and company-wide rules, then I might need two levels of repositories: My Controller query would call a method in an application-level repository that would call a method on a repository in a class library. As long as all the methods return IQueryable results, I'll still end up with a single SQL statement going to the database. If the data retrieval rules are complicated, I can add additional complexity. However, by layering LINQ queries on top of each other I can prevent that complexity from building up in any one place (and even implement some level of reusability for those rules in the class library).

There's also an opportunity here to create some simplicity. For example, there might be some advantage in ease-of-use for the application developer if I move my Controller's LINQ query into a GetFullSalesOrderById method in my application repository. My Controller's Action method might then shrink to this:

Return View(SalesOrderRepository.GetFullSalesOrderById(salesOrderId))

Supporting Testing and Flattening the Result
This design also facilitates testing. I can disconnect my Controller from the database by creating a mock SalesOrderRepository object. That mock object would have a SalesOrder method that returns a List of SalesOrderObjects created inside the SalesOrder method (I'm not saying that code would be fun to write).

I can even create multiple mock versions of the SalesOrders method to handle all of my test scenarios: a version of the method that returns no objects, another version that returns objects with bad data similar to what's found in the database, and so on. These wouldn't be perfect tests because I'm now testing LINQ-To-Objects rather than LINQ-To-Entities so, eventually, I'll have to test against the real database (my tests won't reflect real-world response times, for example). However, my tests are probably "good enough" to get me through my development testing and may even be useful in my integration testing.

Separating updates from retrieval gives me some flexibility in how I retrieve my data. I might have to abandon LINQ if it turns out that I can't use it to implement some business rule (I can't filter for particular CustomerAddresses in an Include, for example). One of the beauties of the CQRS pattern is that I'm not going to be using these objects for updates so I don't really care how they're generated. I don't need, for example, the results to be organized into nicely nested Entity Framework objects if all I'm doing is using the results to generate HTML.

In that case, I might choose to have my SalesOrders method access a View that produces the right result. Using a Table Valued Function and the code generated by the Entity Framework designer I could integrate a stored procedure that held the business rules (and moving some of my data access to the database might even give me a performance boost). I've got lots of options.

But, of course, that still leaves the update side of the pattern, which is always more complicated than the retrieval side. I'll look at how I can simplify that later.

About the Author

Peter Vogel is a system architect and principal in PH&V Information Services. PH&V provides full-stack consulting from UX design through object modeling to database design. Peter tweets about his VSM columns with the hashtag #vogelarticles. His blog posts on user experience design can be found at http://blog.learningtree.com/tag/ui/.

comments powered by Disqus

Featured

  • Microsoft Revamps Fledgling AutoGen Framework for Agentic AI

    Only at v0.4, Microsoft's AutoGen framework for agentic AI -- the hottest new trend in AI development -- has already undergone a complete revamp, going to an asynchronous, event-driven architecture.

  • IDE Irony: Coding Errors Cause 'Critical' Vulnerability in Visual Studio

    In a larger-than-normal Patch Tuesday, Microsoft warned of a "critical" vulnerability in Visual Studio that should be fixed immediately if automatic patching isn't enabled, ironically caused by coding errors.

  • Building Blazor Applications

    A trio of Blazor experts will conduct a full-day workshop for devs to learn everything about the tech a a March developer conference in Las Vegas keynoted by Microsoft execs and featuring many Microsoft devs.

  • Gradient Boosting Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the gradient boosting regression technique, where the goal is to predict a single numeric value. Compared to existing library implementations of gradient boosting regression, a from-scratch implementation allows much easier customization and integration with other .NET systems.

  • Microsoft Execs to Tackle AI and Cloud in Dev Conference Keynotes

    AI unsurprisingly is all over keynotes that Microsoft execs will helm to kick off the Visual Studio Live! developer conference in Las Vegas, March 10-14, which the company described as "a must-attend event."

Subscribe on YouTube