Practical ASP.NET

Testing Experimental Code in Production with Scientist.NET

A .NET port of Ruby's Scientist library lets you test experimental code against the code that's already running in production.

In carpentry there's a saying: "Measure twice, cut once." This means that before committing to cutting a piece of wood you should measure, then measure again, before actually sawing it in half.

Normally in software development we write some new code or perhaps refactor some existing code, maybe write or modify tests, and once the tests pass we consider it ready for production (we may also perform manual testing, performance testing and so on). We can think of this as the first measurement in the "measure twice, cut once" methodology.

What if we could take a second measurement to confirm that the new code will do what we expect it to when it's made live in the production environment?

Imagine we could create an experiment that could take the existing implementation that's running in production and continue to run it, but additionally run the new code alongside it and compare the results. In the Ruby world the Scientist library allows these kinds of experiments to be undertaken. For .NET developers there's a port of the Ruby library called Scientist.NET.

The first "measure" (unit tests and so on) verifies the code in a controlled, sandboxed test environment with specified test data. In production, however, there will usually be a much larger set of real-life data. Scientist.NET allows us to verify that the new code works not only with new data being entered into the production system, but also with the existing data. Essentially, this allows us to embrace the fact that we may have data quality problems in production that we don't know about and, hence, haven't written tests to cover.

Once the experiment is complete and we have compared the results and are confident that the second "measure" is correct, the old implementation can be removed and the new code "switched on" in production. Of course, the results might show mismatches between the outcomes of the existing and new code. If there are mismatches these can be investigated further, perhaps adding new unit tests and modifying the candidate code to account for data quality problems.

Overview of Scientist.NET
Essentially, Scientist.NET lets us add code that executes the existing implementation and also one or more experimental ("candidate") versions. The result of each candidate can be compared to the result of the existing production code and then reviewed by the team to decide if and when to start using the new implementation. One key point is that the code being executed should not modify data. Because the experimental versions run alongside the real code, any writes they perform would be applied to the production system's data more than once.
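To make the idea concrete, here is a minimal hand-rolled sketch of what an experiment does conceptually. It is not Scientist.NET's implementation or API: the MiniExperiment class, its Run method and the report callback are hypothetical names invented for illustration, and the sketch simplifies things by supporting a single candidate and always running the existing code first.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in for the idea behind an experiment.
// This is NOT Scientist.NET's implementation or API.
static class MiniExperiment
{
  public static T Run<T>(string name, Func<T> control, Func<T> candidate,
    Action<string> report)
  {
    // The existing code runs as usual; its result is what the caller receives.
    T controlValue = control();

    try
    {
      // Time the candidate so its cost can be reported with any mismatch.
      var timer = Stopwatch.StartNew();
      T candidateValue = candidate();
      timer.Stop();

      if (!EqualityComparer<T>.Default.Equals(controlValue, candidateValue))
      {
        report($"{name}: mismatch (control={controlValue}, " +
          $"candidate={candidateValue}, candidate took {timer.Elapsed})");
      }
    }
    catch (Exception ex)
    {
      // A failing candidate must never break the production code path.
      report($"{name}: candidate threw {ex.GetType().Name}");
    }

    return controlValue;
  }
}

static class MiniExperimentDemo
{
  static void Main()
  {
    int result = MiniExperiment.Run(
      "sum",
      control: () => 2 + 2,
      candidate: () => 2 * 3,
      report: Console.WriteLine);

    // The candidate's value (6) is only reported, never returned.
    Console.WriteLine($"Caller received: {result}");
  }
}
```

The important design point is that the caller always gets the result of the existing code. The candidate's result, and any exception it throws, is only ever reported.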

The results of experiments can be published to a central place so that the development team can review them. Listing 1 shows a publisher that simply writes to the console window.

Listing 1: A Custom Publisher
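using System;
using System.Threading.Tasks;
using GitHub; // Scientist.NET types are in the GitHub namespace

public class ConsoleResultPublisher : IResultPublisher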
{
  public Task Publish<T>(Result<T> result)
  {            
    if (result.Mismatched)
    {
      Console.WriteLine(
        $"Experiment name: '{result.ExperimentName}' resulted in mismatched results");                
      Console.WriteLine($"Existing code result: {result.Control.Value}");
                
      foreach (var candidate in result.Candidates)
      {
        Console.WriteLine($"Candidate name: {candidate.Name}");
        Console.WriteLine($"Candidate value: {candidate.Value}");
        Console.WriteLine($"Candidate execution duration: {candidate.Duration}");
      }
    }

    return Task.FromResult(0);
  }
} 

Listing 2 shows two classes that represent some existing code currently in production (CustomerStatusCalculatorExisting) and a refactored candidate version (CustomerStatusCalculatorNew) that we want to replace it with.

Listing 2: Existing Code and a Modified Candidate for Release to Production
class Customer
{
  public int Age { get; set; }
}

enum CustomerStatus
{
  Unknown,
  Standard,        
  Gold
}

static class CustomerStatusCalculatorExisting
{
  public static CustomerStatus CalculateStatus(Customer customer)
  {
    if (customer.Age >= 16 && customer.Age <= 21)
    {
      return CustomerStatus.Gold;
    }

    if (customer.Age > 21)
    {
      return CustomerStatus.Standard;
    }

    return CustomerStatus.Unknown;
  }
}


static class CustomerStatusCalculatorNew
{
  /// <summary>
  /// New implementation assumes that all production data has no customers under 16 years old
  /// </summary>        
  public static CustomerStatus CalculateStatus(Customer customer)
  {
            
    if (customer.Age <= 21)
    {
      return CustomerStatus.Gold;
    }
            
    return CustomerStatus.Standard;            
  }
}

Once the Scientist.NET NuGet package is installed, the experiment-running code can be added, as shown in Listing 3.

Listing 3: Defining an Experiment to Run in Production
using System;
using GitHub; // Scientist.NET types are in the GitHub namespace

class Program
{
  static void Main(string[] args)
  {
    var productionData = new Customer {Age = 15};

    // Configure how we want results published
    Scientist.ResultPublisher = new ConsoleResultPublisher();

    // Run the experiment
    CustomerStatus status = Scientist.Science<CustomerStatus>("customer status", experiment =>
    {
      // Current production method
      experiment.Use(() => CustomerStatusCalculatorExisting.CalculateStatus(productionData));

      // One or more candidates to compare result to
      experiment.Try("Awesome new simplified algorithm", () => 
        CustomerStatusCalculatorNew.CalculateStatus(productionData));
    });

    Console.ReadLine();
  }
}

When the code in Listing 3 executes, it produces the following output:

Experiment name: 'customer status' resulted in mismatched results
Existing code result: Unknown
Candidate name: Awesome new simplified algorithm
Candidate value: Gold
Candidate execution duration: 00:00:00.0003689

Notice in this output that the current production code calculates a value of "Unknown" for the 15-year-old customer, whereas the candidate returns "Gold." The candidate was written on the assumption that all data in the system is valid (namely that there are no customers under 16 years of age), so the experiment shows a mismatch: the new candidate implementation in the CustomerStatusCalculatorNew class doesn't work correctly with the existing production data.

This second "measure" enables us to fix the problem before sawing the wood in half (breaking the production system by deploying buggy new code).
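One way to act on the mismatch is to revise the candidate and check it against the existing code before running a new experiment. The following is a sketch only: CustomerStatusCalculatorRevised and RevisedCandidateCheck are hypothetical names, the Customer, CustomerStatus and CustomerStatusCalculatorExisting types are repeated from Listing 2 so the example stands alone, and the age range swept is an arbitrary assumption rather than real production data.

```csharp
using System;

class Customer
{
  public int Age { get; set; }
}

enum CustomerStatus
{
  Unknown,
  Standard,
  Gold
}

// Existing production logic, repeated from Listing 2.
static class CustomerStatusCalculatorExisting
{
  public static CustomerStatus CalculateStatus(Customer customer)
  {
    if (customer.Age >= 16 && customer.Age <= 21)
    {
      return CustomerStatus.Gold;
    }

    if (customer.Age > 21)
    {
      return CustomerStatus.Standard;
    }

    return CustomerStatus.Unknown;
  }
}

// Hypothetical revised candidate: no longer assumes every customer is 16 or older.
static class CustomerStatusCalculatorRevised
{
  public static CustomerStatus CalculateStatus(Customer customer)
  {
    if (customer.Age < 16)
    {
      return CustomerStatus.Unknown;
    }

    return customer.Age <= 21 ? CustomerStatus.Gold : CustomerStatus.Standard;
  }
}

static class RevisedCandidateCheck
{
  // Counts the ages in [minAge, maxAge] where the two implementations disagree.
  public static int CountMismatches(int minAge, int maxAge)
  {
    int mismatches = 0;
    for (int age = minAge; age <= maxAge; age++)
    {
      var customer = new Customer { Age = age };
      if (CustomerStatusCalculatorExisting.CalculateStatus(customer) !=
          CustomerStatusCalculatorRevised.CalculateStatus(customer))
      {
        mismatches++;
      }
    }
    return mismatches;
  }

  static void Main()
  {
    // Prints "Mismatches for ages -5 to 120: 0"
    Console.WriteLine($"Mismatches for ages -5 to 120: {CountMismatches(-5, 120)}");
  }
}
```

A sweep like this is still only a first "measure." The revised candidate would go back into the experiment in place of CustomerStatusCalculatorNew so that it's compared against real production data.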

About the Author

Jason Roberts is a Microsoft C# MVP with over 15 years of experience. He writes a blog at http://dontcodetired.com, has produced numerous Pluralsight courses, and can be found on Twitter as @robertsjason.
