Practical ASP.NET

Testing Experimental Code in Production with Scientist.NET

A .NET port of the Ruby library allows for experimental testing of code that's gone to production.

In carpentry there's a saying: "Measure twice, cut once." Before committing to cutting a piece of wood, you measure, then measure again, and only then start sawing.

Normally in software development we write some new code or perhaps refactor some existing code, maybe write or modify tests, and once the tests pass we consider it ready for production (we may also perform manual testing, performance testing and so on). We can think of this as the first measurement in the "measure twice, cut once" methodology.

What if we could take a second measurement to confirm that the new code will do what we expect it to when it's made live in the production environment?

Imagine we could create an experiment that could take the existing implementation that's running in production and continue to run it, but additionally run the new code alongside it and compare the results. In the Ruby world the Scientist library allows these kinds of experiments to be undertaken. For .NET developers there's a port of the Ruby library called Scientist.NET.

The first "measure" (unit tests and so on) allows the verification of the code in the controlled, sandboxed test environment with specified test data. However, in production, there will usually be a much larger set of real-life data. Scientist.NET allows us to verify the new code works not only with new data being entered into the production system, but also the existing data. Essentially, this allows us to embrace the fact that we may have data quality problems in production that we don't know about and, hence, haven't written tests to cover.

Once the experiment is complete and we have compared the results and are confident that the second "measure" is correct, the old implementation can be removed and the new code "switched on" in production. Of course, the results might show mismatches between the outcomes of the existing and new code. If there are mismatches these can be investigated further, perhaps adding new unit tests and modifying the candidate code to account for data quality problems.

Overview of Scientist.NET
Essentially, Scientist.NET lets you run the existing implementation alongside one or more experimental ("candidate") versions of it. The results of each candidate can be compared to those of the existing production code and then reviewed by the team to decide if and when to switch to the new implementation. One key constraint is that the code being run in an experiment should not modify data: because the candidates execute alongside the real code, any writes would be applied to the production system multiple times.
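The pattern at the heart of the library can be sketched in a few lines of plain C#. The following is a conceptual illustration only (the MiniExperiment class and its Run method are invented for this sketch and are not part of Scientist.NET's API): the control is always executed and its value returned to the caller, while each candidate is run and its result compared against the control's.

```csharp
using System;
using System.Collections.Generic;

// Conceptual sketch of the experiment pattern (not the Scientist.NET source).
public static class MiniExperiment
{
  public static T Run<T>(
    Func<T> control,
    IReadOnlyDictionary<string, Func<T>> candidates,
    Action<string, T, T> onMismatch)
  {
    // The trusted, existing behavior always runs...
    T controlResult = control();

    foreach (var candidate in candidates)
    {
      // ...and each candidate runs alongside it.
      T candidateResult = candidate.Value();

      if (!EqualityComparer<T>.Default.Equals(controlResult, candidateResult))
      {
        // Report the mismatch (name, control value, candidate value).
        onMismatch(candidate.Key, controlResult, candidateResult);
      }
    }

    // Callers always receive the control's result, so behavior in
    // production is unchanged while the experiment runs.
    return controlResult;
  }
}
```

The real library layers more on top of this skeleton, such as timing each behavior (the Duration value seen in Listing 1) and pluggable result publishing.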

The results of experiments can be published to a central location so the development team can review them. Listing 1 shows a publisher that simply writes to the console window.

Listing 1: A Custom Publisher
public class ConsoleResultPublisher : IResultPublisher
{
  public Task Publish<T>(Result<T> result)
  {            
    if (result.Mismatched)
    {
      Console.WriteLine(
        $"Experiment name: '{result.ExperimentName}' resulted in mismatched results");                
      Console.WriteLine($"Existing code result: {result.Control.Value}");
                
      foreach (var candidate in result.Candidates)
      {
        Console.WriteLine($"Candidate name: {candidate.Name}");
        Console.WriteLine($"Candidate value: {candidate.Value}");
        Console.WriteLine($"Candidate execution duration: {candidate.Duration}");
      }
    }

    return Task.FromResult(0);
  }
} 

Listing 2 shows two classes that represent some existing code currently in production (CustomerStatusCalculatorExisting) and a refactored candidate version (CustomerStatusCalculatorNew) that we want to replace it with.

Listing 2: Existing Code and a Modified Candidate for Release to Production
class Customer
{
  public int Age { get; set; }
}

enum CustomerStatus
{
  Unknown,
  Standard,        
  Gold
}

static class CustomerStatusCalculatorExisting
{
  public static CustomerStatus CalculateStatus(Customer customer)
  {
    if (customer.Age >= 16 && customer.Age <= 21)
    {
      return CustomerStatus.Gold;
    }

    if (customer.Age > 21)
    {
      return CustomerStatus.Standard;
    }

    return CustomerStatus.Unknown;
  }
}


static class CustomerStatusCalculatorNew
{
  /// <summary>
  /// New implementation assumes the production data contains no customers under 16 years old
  /// </summary>        
  public static CustomerStatus CalculateStatus(Customer customer)
  {
            
    if (customer.Age <= 21)
    {
      return CustomerStatus.Gold;
    }
            
    return CustomerStatus.Standard;            
  }
}

Once the Scientist.NET NuGet package is installed, the experiment-running code can be added, as shown in Listing 3.

Listing 3: Defining an Experiment to Run in Production
class Program
{
  static void Main(string[] args)
  {
    var productionData = new Customer {Age = 15};

    // Configure how we want results published
    Scientist.ResultPublisher = new ConsoleResultPublisher();

    // Run the experiment
    CustomerStatus status = Scientist.Science<CustomerStatus>("customer status", experiment =>
    {
      // Current production method
      experiment.Use(() => CustomerStatusCalculatorExisting.CalculateStatus(productionData));

      // One or more candidates to compare result to
      experiment.Try("Awesome new simplified algorithm", () => 
        CustomerStatusCalculatorNew.CalculateStatus(productionData));
    });

    Console.ReadLine();
  }
}

When the code in Listing 3 executes, it produces the following output:

Experiment name: 'customer status' resulted in mismatched results
Existing code result: Unknown
Candidate name: Awesome new simplified algorithm
Candidate value: Gold
Candidate execution duration: 00:00:00.0003689

Notice in this output that the current production code calculates a value of "Unknown," while the candidate, because of its assumption that all data in the system is valid (namely, that there are no customers under 16 years of age), returns "Gold," so the experiment reports a mismatch. The new implementation in the CustomerStatusCalculatorNew class doesn't work correctly with the existing production data.

This second "measure" enables us to fix the problem before sawing the wood in half (breaking the production system by deploying buggy new code).
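For example, one way to resolve the mismatch reported above would be to modify the candidate so it preserves the existing behavior for under-16 customers while keeping the simplified logic for everyone else. The following revised candidate is an illustrative sketch (CustomerStatusCalculatorRevised is not from the article); it reuses the Customer and CustomerStatus types from Listing 2, repeated here so the example is self-contained:

```csharp
class Customer
{
  public int Age { get; set; }
}

enum CustomerStatus
{
  Unknown,
  Standard,
  Gold
}

// Illustrative revised candidate: keeps the simplification for ages 16
// and over, but preserves the existing production behavior of returning
// Unknown for customers under 16.
static class CustomerStatusCalculatorRevised
{
  public static CustomerStatus CalculateStatus(Customer customer)
  {
    if (customer.Age < 16)
    {
      return CustomerStatus.Unknown;
    }

    return customer.Age <= 21 ? CustomerStatus.Gold : CustomerStatus.Standard;
  }
}
```

Re-running the experiment with a candidate like this should then produce no mismatches for the under-16 records, giving confidence in the second "measure" before the cut is made.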

About the Author

Jason Roberts is a Microsoft C# MVP with over 15 years' experience. He writes a blog at http://dontcodetired.com, has produced numerous Pluralsight courses, and can be found on Twitter as @robertsjason.

