Using Expression Trees in Your APIs -- Visual Studio Magazine

How to translate C# code into expression trees to eliminate strings, standardize parameter validations and interact with other data structures.

By Patrick Steele
04/01/2011

You've probably used expression trees before, but may not have realized it. Expression trees offer a convenient way to "examine" and "take apart" a lambda expression to find its parts. You can even execute the lambda (or not!) depending on your needs.

Are you using expression trees today? If you've used any mocking frameworks like Rhino.Mocks or Moq, you've probably written code like this:

parser.Expect(p => p.Process());

This is setting an expectation that the method "Process" on the variable "parser" will be called during the test. The Process method is not called at this time (more on that later).

Or maybe you're using the new Fluent NHibernate library to do your NHibernate configuration without XML files:

public class EmployeeMap : ClassMap<Employee>
{
  public EmployeeMap()
  {
    Id(x => x.Id);
    Map(x => x.FirstName);
    Map(x => x.LastName);
    References(x => x.Store);
  }
}

That's right -- it's using expression trees.

So what exactly is an expression tree? MSDN documentation defines it as follows:

"The expression tree is an in-memory data representation of the lambda expression. The expression tree makes the structure of the lambda expression transparent and explicit. You can interact with the data in the expression tree just as you can with any other data structure."

That last part is important. By giving you a data structure that represents the entire lambda expression, you can do all sorts of things with it.
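To make the "data structure" idea concrete, here is a minimal sketch that builds a tree by hand with the factory methods on the Expression class instead of letting the compiler generate it from a lambda. The lambda it models, (x, y) => x * y + 1, and the class name ManualTreeDemo are our own illustrations, not something from the MSDN documentation:

```csharp
using System;
using System.Linq.Expressions;

public static class ManualTreeDemo
{
    // Builds the tree for (x, y) => x * y + 1, one node at a time.
    public static Expression<Func<int, int, int>> BuildTree()
    {
        ParameterExpression x = Expression.Parameter(typeof(int), "x");
        ParameterExpression y = Expression.Parameter(typeof(int), "y");

        // Each factory call creates one node; nodes are combined like any other objects.
        BinaryExpression product = Expression.Multiply(x, y);
        BinaryExpression sum = Expression.Add(product, Expression.Constant(1));

        return Expression.Lambda<Func<int, int, int>>(sum, x, y);
    }

    public static void Main()
    {
        var tree = BuildTree();
        Console.WriteLine(tree.Body.NodeType);     // Add
        Console.WriteLine(tree.Parameters.Count);  // 2
        Console.WriteLine(tree.Compile()(3, 4));   // 13
    }
}
```

The tree built here has the same shape the compiler would produce for the equivalent lambda, which is why it can be inspected and compiled in the same way.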

Let's start with a simple example. Suppose we had a method that takes two strings and combines them in some way to produce a new string:

Func<string, string, string> combine = (a, b) => a.ToLower() + b.ToUpper();
var one = "One";
var two = "Two";
var combined = combine(one, two);

After running this code, the variable "combined" will have the value "oneTWO". Now, instead of assigning the lambda to a delegate, let's have the compiler build an expression tree that represents the lambda itself:

Expression<Func<string, string, string>> tree = (a, b) => a.ToLower() + b.ToUpper();

What can you do with this? With a few lines of code, you can examine the structure of this lambda:

Console.WriteLine("Parameter Count: {0}", tree.Parameters.Count);
foreach (var param in tree.Parameters)
{
  Console.WriteLine("\tParameter Name: {0}", param.Name);
}

var body = (BinaryExpression) tree.Body;
Console.WriteLine("Binary Expression Type: {0}", body.NodeType);
Console.WriteLine("Method to be called: {0}", body.Method);
Console.WriteLine("Return Type: {0}", tree.ReturnType);

Running this code produces:

Parameter Count: 2
  Parameter Name: a
  Parameter Name: b
Binary Expression Type: Add
Method to be called: System.String Concat(System.String, System.String)
Return Type: System.String

As you can see, very little code is needed to determine:

The number of parameters
The names of those parameters
The return type of the lambda

Because this example is a BinaryExpression, we also see that it's an "Add" of two other expressions and that the String.Concat is the method that's going to be used for the "Add."
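The two "other expressions" on either side of the "Add" can be examined the same way. As a rough sketch (the OperandDemo class and its Describe helper are hypothetical names used only for illustration), the Left and Right properties of the BinaryExpression each turn out to be a MethodCallExpression:

```csharp
using System;
using System.Linq.Expressions;

public static class OperandDemo
{
    // Returns a short description of one operand of the "Add" node.
    public static string Describe(Expression operand)
    {
        var call = operand as MethodCallExpression;
        if (call == null)
        {
            // Not a method call; just report what kind of node it is.
            return operand.NodeType.ToString();
        }
        // call.Object is the expression the method is invoked on (a parameter here).
        return call.Object + "." + call.Method.Name;
    }

    public static void Main()
    {
        Expression<Func<string, string, string>> tree =
            (a, b) => a.ToLower() + b.ToUpper();

        var body = (BinaryExpression)tree.Body;
        Console.WriteLine(Describe(body.Left));   // a.ToLower
        Console.WriteLine(Describe(body.Right));  // b.ToUpper
    }
}
```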

If you're interested in diving a little deeper, check out the "Expression Tree Basics" blog posting written by Charlie Calvert, the Microsoft community program manager for the C# group.

Using Expressions to Eliminate Strings
The Castle MonoRail project is a popular Model-View-Controller (MVC) framework for ASP.NET. You don't need to understand MVC or even be a Web developer to appreciate this example.

When working with a controller (a class), you often need to "redirect" an ASP.NET request to a different method ("action" in MVC) within the controller, or possibly even a different controller. In its simplest form, the redirect method looks like this:

this.RedirectToAction("LogonFailed");

Remember, an MVC "action" is just a method. In the previous example, the framework will look for a method called "LogonFailed" and will move execution to that method, which is simple and effective.

The problem comes when you do some refactoring. Perhaps you rename the "LogonFailed" method to "UserLogonFailed." The refactor tools in Visual Studio 2010 will make this rename easy, but in the previous sample, that's a string -- it doesn't represent the name of a method to the compiler. This is the perfect place where an expression tree can help you out.

First we'll create an extension method that can be called from within our controllers:

public static void RedirectToAction<TController>(
  this TController controller, 
  Expression<Action<TController>> expression) where TController : Controller
{
}

The method takes a generic type parameter that must derive from "Controller" (a base class provided by the MonoRail MVC framework). The one parameter this extension method accepts is an expression that represents an Action<> on the particular controller. Methods (actions) on a MonoRail controller must be "void" methods, so the Action<> delegate can represent any void method on the particular controller class.

Now that we've got an expression that represents a lambda for a particular method, we can examine that data structure and pull out the name of the method:

var methodCall = expression.Body as MethodCallExpression;
if (methodCall == null)
{
  throw new ArgumentException("Expected method call");
}
controller.RedirectToAction(methodCall.Method.Name);

First make sure the lambda represents a method call. If it doesn't, we need to throw an exception. If it's a method call, we call the built-in version of RedirectToAction that takes a string -- in this case, the name of the method within our lambda! Now our code can use our new extension method to redirect to a different action:

this.RedirectToAction<SampleController>(c => c.LogonFailed());

As you can see, by including the type of controller, we get IntelliSense and auto-completion within the IDE as well as full support for all of the refactoring tools. If you use the Visual Studio rename method, this lambda expression will get renamed, too. No more search and replace of strings when you rename a method in MonoRail!
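If you don't have MonoRail handy, the name-extraction step can be tried in isolation. The sketch below is not MonoRail code: SampleController is a plain stand-in class, and ActionNames.Of is a hypothetical helper that returns the method name rather than redirecting. It also shows that the method inside the lambda is never executed, since LogonFailed would throw if it were:

```csharp
using System;
using System.Linq.Expressions;

// Stand-in for a controller; it does not derive from the real MonoRail base class.
public class SampleController
{
    public void LogonFailed()
    {
        // If the lambda were executed, this would surface immediately.
        throw new InvalidOperationException("LogonFailed was executed");
    }
}

public static class ActionNames
{
    // Hypothetical helper: reads the method name out of the tree without running it.
    public static string Of<TController>(Expression<Action<TController>> expression)
    {
        var methodCall = expression.Body as MethodCallExpression;
        if (methodCall == null)
        {
            throw new ArgumentException("Expected method call", "expression");
        }
        return methodCall.Method.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Of<SampleController>(c => c.LogonFailed())); // LogonFailed
    }
}
```

The same reasoning explains the mocking example at the start of the article: the framework only needs to read the tree to learn which method is expected, so it has no reason to invoke it.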

A technique similar to this one is employed in the Fluent NHibernate project (see earlier example). Before Fluent NHibernate, the "Id" property along with the "FirstName" and "LastName" properties were defined in an XML mapping file (as strings). The technique shown here is much easier and gives you full support of the Visual Studio refactoring tools.

Here's a simple example of accepting a lambda that represents access to a property. In this example, the generic type "T" represents that type of class the property belongs to:

public void Id<V>(Expression<Func<T, V>> expression)
{
  var exp = expression.Body as MemberExpression;
  if (exp == null)
  {
    throw new ArgumentException("expression must be a member expression.");
  }
  var columnName = exp.Member.Name;
  // Regular NHibernate stuff
}

Func<T,V> represents a function that accepts our class type as the first parameter and returns V (the return type of the property). This lets us maintain the strong typing in C#.

As before, we make sure that the body of the lambda represents a MemberExpression. A MemberExpression represents access to a field or property. Finally, if we do have a MemberExpression, we can look at the member's name -- which represents the property name -- and plug that into NHibernate. No more hardcoded strings!
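One wrinkle worth knowing about: if an API declares its parameter as Expression<Func<T, object>> rather than using a second generic type such as V, a value-type property like an int gets wrapped in a Convert node (the boxing conversion), and the "as MemberExpression" cast returns null. The sketch below shows one way to unwrap that node. The Employee class and the PropertyNames.Of helper are illustrative assumptions, not part of Fluent NHibernate:

```csharp
using System;
using System.Linq.Expressions;

public class Employee
{
    public int Id { get; set; }
    public string FirstName { get; set; }
}

public static class PropertyNames
{
    // Hypothetical helper that accepts object-returning lambdas.
    public static string Of<T>(Expression<Func<T, object>> expression)
    {
        Expression body = expression.Body;

        // Boxing an int to object appears as a Convert node around the member access.
        var unary = body as UnaryExpression;
        if (unary != null && unary.NodeType == ExpressionType.Convert)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException(
                "expression must be a member expression.", "expression");
        }
        return member.Member.Name;
    }

    public static void Main()
    {
        Console.WriteLine(Of<Employee>(e => e.Id));         // Id
        Console.WriteLine(Of<Employee>(e => e.FirstName));  // FirstName
    }
}
```

Using a generic return type, as the Id<V> method above does, avoids the conversion altogether, which is a good reason to prefer it.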

Executing Expression Trees
So now that we can use an expression to take a lambda apart and examine its constituent parts, what if we want to actually execute that lambda and get some results? That's easy.

The Expression<T> type contains the method "Compile," which, as you might expect, compiles the lambda represented by the expression. It returns a delegate that can be used to execute that lambda. Here's a simple example of compiling an expression tree into executable code:

Expression<Func<int, bool>> isEvenExpression = i => i % 2 == 0;
Func<int, bool> isEven = isEvenExpression.Compile();
for (var a = 0; a <= 10; a++)
{
  Console.WriteLine("Is {0} even? {1}", a, isEven(a));
}

First, we define an expression that represents a lambda that takes an integer and returns a Boolean indicating whether the integer is even or not. We then compile that expression into an actual delegate. Finally, we can use that delegate just as you would any other delegate.
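Because the tree is just data until Compile is called, it can also be modified first. As a small sketch (ComposeDemo and BuildIsOdd are names invented for this example), we can wrap the body of the "is even" expression in a Not node and compile the result into an "is odd" delegate:

```csharp
using System;
using System.Linq.Expressions;

public static class ComposeDemo
{
    public static Func<int, bool> BuildIsOdd()
    {
        Expression<Func<int, bool>> isEvenExpression = i => i % 2 == 0;

        // Wrap the existing body in a Not node. Reusing the original
        // parameter list keeps the body's reference to "i" valid.
        Expression<Func<int, bool>> isOddExpression =
            Expression.Lambda<Func<int, bool>>(
                Expression.Not(isEvenExpression.Body),
                isEvenExpression.Parameters);

        return isOddExpression.Compile();
    }

    public static void Main()
    {
        Func<int, bool> isOdd = BuildIsOdd();
        Console.WriteLine(isOdd(3));  // True
        Console.WriteLine(isOdd(4));  // False
    }
}
```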

Now let's look at some things we can do with the ability to compile expression trees.

Standardized Parameter Validation
We've all written parameter validation logic. It's tedious, but necessary to ensure our applications perform properly. A simple case of making sure a couple of parameters passed in to a constructor aren't null becomes:

public DataFileParser(string filename, IColumnNames names) : base(filename)
{
  if (filename == null)
  {
    throw new ArgumentNullException("filename");
  }
  if (names == null)
  {
    throw new ArgumentNullException("names");
  }
}

We can take advantage of expression trees to standardize some of this code as well as eliminate the hardcoded parameter names -- which will not get renamed if we use the Visual Studio refactor tools to rename the parameters!

For this example, we're starting with a static Validator class that will hold our standardized validation code. Our first step is to create a method that accepts an expression representing access to a parameter:

public static class Validator
{
  public static void ThrowIfNull(Expression<Func<object>> expression)
  {
  }
}

As we showed in earlier examples, we need to make sure this expression represents access to a field. (When a lambda refers to a parameter of the enclosing method, the compiler captures that parameter in a closure, so it appears in the tree as a field access.) Once we decide we have the proper expression, we'll compile it, execute it and make sure the result (the value of the parameter) isn't null. If it's null, we'll throw an exception and include the parameter name from the expression tree instead of hardcoding it:

public static class Validator
{
  public static void ThrowIfNull(Expression<Func<object>> expression)
  {
    var body = expression.Body as MemberExpression;
    if (body == null)
    {
      throw new ArgumentException(
        "expected property or field expression.");
    }
    var compiled = expression.Compile();
    var value = compiled();
    if (value == null)
    {
      throw new ArgumentNullException(body.Member.Name);
    }
  }
}

And now we have a simple, standardized null parameter validation to be used throughout our code. We can now change our old validation logic to use this new code:

public DataFileParser(string filename, IColumnNames names) : base(filename)
{
  Validator.ThrowIfNull(() => filename);
  Validator.ThrowIfNull(() => names);
}
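To see that the parameter name really does come out of the tree, here is a self-contained sketch. ValidatorSketch repeats the ThrowIfNull logic from above, while Open and TryOpen are hypothetical methods that stand in for the DataFileParser constructor and its caller:

```csharp
using System;
using System.Linq.Expressions;

public static class ValidatorSketch
{
    // Same logic as Validator.ThrowIfNull, repeated so the sketch compiles on its own.
    public static void ThrowIfNull(Expression<Func<object>> expression)
    {
        var body = expression.Body as MemberExpression;
        if (body == null)
        {
            throw new ArgumentException("expected property or field expression.");
        }
        if (expression.Compile()() == null)
        {
            throw new ArgumentNullException(body.Member.Name);
        }
    }

    // Hypothetical method standing in for the DataFileParser constructor.
    public static string Open(string filename)
    {
        ThrowIfNull(() => filename);
        return filename;
    }

    // Returns "ok", or the parameter name reported by the exception.
    public static string TryOpen(string filename)
    {
        try
        {
            Open(filename);
            return "ok";
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName;
        }
    }

    public static void Main()
    {
        Console.WriteLine(TryOpen(null));        // filename
        Console.WriteLine(TryOpen("data.csv"));  // ok
    }
}
```

Keep in mind that this version compiles the expression on every call, so it trades some run-time cost for the refactoring safety.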

Could we make this validation even more generic? Why not define an expression that represents a Func<bool> lambda, where the Boolean is the pass/fail result of our validation logic? Let's tweak our earlier example a bit and create a new validation method:

public static void Validate(Expression<Func<bool>> expression)
{
  var body = expression.Body as MemberExpression;
  if (body == null)
  {
    throw new ArgumentException(
      "expected property or field expression.");
  }
  var compiled = expression.Compile();
  var value = compiled();
  if (!value)
  {
    throw new ArgumentException(
      "Argument failed validation", body.Member.Name);
  }
}

This seems like a more complete validation technique that could handle more situations. For example, let's make sure our "filename" parameter isn't null, isn't empty and begins with the letter "C":

Validator.Validate(() => filename != null
  && filename.Length > 0
  && filename.StartsWith("C"));

But if we run this code, we'll get the error "expected property or field expression." The lambda is now much more than just a field access (MemberExpression). It's a BinaryExpression with a node type of AndAlso (the debugger shows it as the internal LogicalBinaryExpression class). The contents of this binary expression could be anything. As long as it returns a bool, it's a valid lambda. But because it can contain anything, pulling out just a parameter name becomes virtually impossible.
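To illustrate why, the sketch below walks a compound lambda with an ExpressionVisitor and collects every captured variable it finds. CapturedNameCollector and CompoundDemo are our own illustrative classes, and the check on a node's Expression being a ConstantExpression relies on how the C# compiler represents captured locals and parameters (as fields on a closure object):

```csharp
using System;
using System.Collections.Generic;
using System.Linq.Expressions;

public class CapturedNameCollector : ExpressionVisitor
{
    public readonly List<string> Names = new List<string>();

    protected override Expression VisitMember(MemberExpression node)
    {
        // A captured local or parameter shows up as a field read
        // on a constant closure object.
        if (node.Expression is ConstantExpression
            && !Names.Contains(node.Member.Name))
        {
            Names.Add(node.Member.Name);
        }
        return base.VisitMember(node);
    }
}

public static class CompoundDemo
{
    public static List<string> NamesIn(Expression<Func<bool>> expression)
    {
        var collector = new CapturedNameCollector();
        collector.Visit(expression);
        return collector.Names;
    }

    public static void Main()
    {
        string filename = "C:\\data.txt";
        int maxLength = 260;
        Expression<Func<bool>> check =
            () => filename != null && filename.Length < maxLength;

        Console.WriteLine(check.Body.NodeType);                // AndAlso
        Console.WriteLine(string.Join(", ", NamesIn(check)));  // filename, maxLength
    }
}
```

Even in this small case the visitor finds two candidates, and nothing in the tree says which one is "the" argument being validated.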

Therefore, if you want to build up a generic validation library using expressions, create methods that can perform the validation within the library, not within the lambda. An example of this is checking for a null or empty string. We could expand on our original ThrowIfNull and create a new ThrowIfNullOrEmpty. This one will compile the lambda into a string value and then validate it:

public static void ThrowIfNullOrEmpty(Expression<Func<String>> expression)
{
  var body = expression.Body as MemberExpression;
  if (body == null)
  {
    throw new ArgumentException(
      "expected property or field expression.");
  }
  var compiled = expression.Compile();
  var value = compiled();
  if (String.IsNullOrEmpty(value))
  {
    throw new ArgumentException(
      "String is null or empty", body.Member.Name);
  }
}

You could add more methods for range validation, set validation and so on. The sample code included with this article contains a range validator.
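As a rough idea of what a range check could look like in this style, here is a minimal sketch. It is not the range validator from the article's sample code; the ThrowIfOutOfRange name, its int-only signature and the CheckPort helper are all assumptions made for illustration:

```csharp
using System;
using System.Linq.Expressions;

public static class RangeValidatorSketch
{
    // Hypothetical range check; min and max are inclusive.
    public static void ThrowIfOutOfRange(
        Expression<Func<int>> expression, int min, int max)
    {
        var body = expression.Body as MemberExpression;
        if (body == null)
        {
            throw new ArgumentException("expected property or field expression.");
        }
        int value = expression.Compile()();
        if (value < min || value > max)
        {
            throw new ArgumentOutOfRangeException(
                body.Member.Name,
                value,
                string.Format("Value must be between {0} and {1}.", min, max));
        }
    }

    // Returns "ok", or the parameter name reported by the exception.
    public static string CheckPort(int port)
    {
        try
        {
            ThrowIfOutOfRange(() => port, 1, 65535);
            return "ok";
        }
        catch (ArgumentOutOfRangeException ex)
        {
            return ex.ParamName;
        }
    }

    public static void Main()
    {
        Console.WriteLine(CheckPort(8080));  // ok
        Console.WriteLine(CheckPort(0));     // port
    }
}
```

The validation itself lives inside the library method, so the lambda stays a simple field access and the parameter name remains easy to extract.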

Expression Trees and the Entity Framework
I hope this has shown you some of the power that can be achieved using expression trees to examine lambdas. If it hasn't, you may want to check out the Microsoft Entity Framework. This O/R mapper relies heavily on expression trees. The Entity Framework will expose your database as a set of classes; the properties of those classes represent columns and relationships. You can write standard LINQ to query those classes:

var highBalanceNames = context.Customers
  .Where(c => c.Balance > 2000)
  .Select(c => c.Name);

The compiler turns those lambdas into expression trees rather than delegates. The Entity Framework will "take apart" the expression trees using the same techniques shown in this article and translate what the lambda represents into actual T-SQL code! So with only basic knowledge of the Entity Framework (or LINQ to SQL or LLBLGen Pro, among others), you can write simple, easy-to-read LINQ queries and let the library do the conversion to SQL, execute the query and return the results in a standard format.
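To get a feel for that translation step, here is a deliberately tiny sketch. It is in no way how the Entity Framework is implemented: ToySqlTranslator is a made-up class that understands exactly one shape of lambda (a property compared to a constant with ">"), and Customer uses an int Balance to keep the tree simple:

```csharp
using System;
using System.Linq.Expressions;

public class Customer
{
    public string Name { get; set; }
    public int Balance { get; set; }
}

public static class ToySqlTranslator
{
    // Translates lambdas of the form c => c.Property > constant.
    public static string Where<T>(Expression<Func<T, bool>> predicate)
    {
        var comparison = predicate.Body as BinaryExpression;
        if (comparison == null
            || comparison.NodeType != ExpressionType.GreaterThan)
        {
            throw new NotSupportedException(
                "Only 'property > constant' is handled.");
        }

        var column = comparison.Left as MemberExpression;
        var constant = comparison.Right as ConstantExpression;
        if (column == null || constant == null)
        {
            throw new NotSupportedException(
                "Only 'property > constant' is handled.");
        }

        // The property name becomes the column name; the constant is inlined.
        return string.Format(
            "WHERE {0} > {1}", column.Member.Name, constant.Value);
    }

    public static void Main()
    {
        Console.WriteLine(Where<Customer>(c => c.Balance > 2000));
        // WHERE Balance > 2000
    }
}
```

A real provider handles far more node types, maps properties to columns through its model and passes values as query parameters instead of inlining them, but the starting point is the same: reading the nodes of the tree.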

About the Author

Patrick Steele is a senior .NET developer with Billhighway in Troy, Mich. A recognized expert on the Microsoft .NET Framework, he’s a former Microsoft MVP award winner and a presenter at conferences and user group meetings.
