C# Corner
Make Your Code Clear
There are multiple ways to solve every problem. Strive for code that communicates your intent and makes your meaning clear for every developer who uses it.
It's often not that hard to create code that works. What often separates average code from high-quality code is how well that code -- especially its public interface -- describes its own capabilities. A small investment of time in your code will save you from explaining how it works and from enhancing it unnecessarily, both of which create more work for you and your users.
In this article, I'll walk you through a review of a small library, discussing a series of changes and the motivation behind those changes. You'll see how to improve the resiliency and quality of a codebase.
A Working Numeric Library
Listing 1 shows a small numeric class that performs a few simple calculations on a sequence of numbers. This library has some simple mathematical functions: mean, median, variance, minimum and maximum. It's not a major library, but there are enough methods here to demonstrate the concepts involved in shaping the impression your code can give to customer developers.
This library works, but there are many areas where the code can be improved. When developers look at your API, they'll form an impression of the library based on what that API states. If they're your team members and they examine your source, they'll continue to build on those assumptions. The changes that follow are about making the library's intent clearer, not about fixing its behavior.
The first problem is that all of the methods on this class are instance methods. There's no reason for instance methods on this class. The first change is to make all the methods static instead of instance methods:
public static double Mean(List<double> sequence)
After this change, your customers no longer need to create a NumericAlgorithm object in order to use these methods. Instead of writing this:
NumericAlgorithm target = new NumericAlgorithm();
List<double> sequence = new List<double>(
    Enumerable.Range(1, 50).Select(n => (double)n));
double actual = target.Mean(sequence);
They can write this simpler version:
List<double> sequence = new List<double>(
    Enumerable.Range(1, 50).Select(n => (double)n));
double actual = NumericAlgorithm.Mean(sequence);
One fewer line of code, and that means one less bit of work for every use of your library. Of course, this first change leads to the obvious change of making the NumericAlgorithm class a static class:
public static class NumericAlgorithm
Doing so prevents any customer from accidentally creating a NumericAlgorithm object. It also prevents library maintainers from adding instance data or instance methods to the class.
These are small changes, to be sure, but they do prevent users from accidentally doing the wrong thing.
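As a minimal sketch of where these two changes lead, the class might look like the following. The body of Mean is an assumption made for illustration, since Listing 1 isn't reproduced here; the point is only the static modifiers on the class and the method.

```csharp
using System;
using System.Collections.Generic;

// A static class cannot be instantiated and may contain only static members.
public static class NumericAlgorithm
{
    // Illustrative body only; the Listing 1 implementation may differ.
    public static double Mean(List<double> sequence)
    {
        double sum = 0;
        foreach (double value in sequence)
            sum += value;
        return sum / sequence.Count;
    }
}

public static class Program
{
    public static void Main()
    {
        var sequence = new List<double> { 1, 2, 3, 4 };

        // No instance is needed; the call goes through the type name.
        Console.WriteLine(NumericAlgorithm.Mean(sequence));

        // The next line would not compile, because NumericAlgorithm is static:
        // var target = new NumericAlgorithm();
    }
}
```

The commented-out line is the mistake the compiler now catches for your users.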
Next, you should look at the current API and consider its limitations. All the APIs use List<double> as the input sequence. That's unnecessarily limiting. The internals of the methods don't rely on any capabilities beyond IEnumerable<double>. You can see the costs of this restriction in the sample code. Instead of writing this:
List<double> sequence = new List<double>(
    Enumerable.Range(1, 50).Select(n => (double)n));
Users should be able to write the simpler version:
IEnumerable<double> sequence =
    Enumerable.Range(1, 50).Select(n => (double)n);
This version allows users to call your library with any type of collection: a list, an array or even a lazily generated sequence like the one above.
This changes the signatures of all the methods in the class:
public static double Mean(IEnumerable<double> sequence)
In addition, you'll need to change the internal implementation of some of the methods. In particular, the Mean, Median and Variance methods must now determine the count of elements as part of their own work (alongside the sum, for Mean and Variance). That's because IEnumerable<T> doesn't expose a Count property the way List<T> does.
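Here is one way the revised Mean might look. This is a sketch, not the finished library's code: it assumes the LINQ extension methods Sum() and Count() are an acceptable way to get the two values, and it makes no attempt to handle an empty sequence specially.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NumericAlgorithm
{
    // Sketch only: IEnumerable<double> has no Count property, so the count
    // comes from the LINQ Count() extension method, which walks the sequence.
    public static double Mean(IEnumerable<double> sequence)
    {
        double sum = sequence.Sum();   // first pass over the sequence
        int count = sequence.Count();  // second pass over the sequence
        return sum / count;
    }
}

public static class Program
{
    public static void Main()
    {
        // The same method accepts a list, an array or a lazily generated sequence.
        Console.WriteLine(NumericAlgorithm.Mean(new List<double> { 1, 2, 3 }));
        Console.WriteLine(NumericAlgorithm.Mean(new[] { 1.0, 2.0, 3.0 }));
        Console.WriteLine(NumericAlgorithm.Mean(
            Enumerable.Range(1, 50).Select(n => (double)n)));
    }
}
```

All three calls compile against the single IEnumerable<double> signature, which is the payoff of loosening the parameter type.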
At this point, you'll notice that some of the methods might need additional work. The Mean() and Variance() methods must iterate the collection more than once: calculating the sum, the count of items and even the squares of the collection. You may be tempted to create more complicated methods to perform all those calculations in one pass. If you look at the finished version of the numeric library, you'll notice that I didn't do that extra work. I ran some performance tests and found that making those changes just didn't change the performance metrics by any appreciable amount. Therefore, I went with the simpler implementation. Your mileage may vary, but test performance before making changes with the intent of improving performance.
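To make the trade-off concrete, here is a sketch of the simpler, multi-pass style applied to Variance. Two things are assumptions on my part: that the library computes the population variance (dividing by the count, not the count minus one), and that leaning on LINQ's Average(), Sum() and Count() is acceptable.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NumericAlgorithm
{
    // Sketch only: population variance, written for clarity rather than
    // for a minimal number of passes over the data.
    public static double Variance(IEnumerable<double> sequence)
    {
        // Pass 1: the mean. Average() throws InvalidOperationException
        // when the sequence is empty.
        double mean = sequence.Average();

        // Pass 2: the sum of squared deviations from the mean.
        double sumOfSquares = sequence.Sum(v => (v - mean) * (v - mean));

        // Pass 3: the number of elements.
        return sumOfSquares / sequence.Count();
    }
}

public static class Program
{
    public static void Main()
    {
        double[] sequence = { 2, 4, 4, 4, 5, 5, 7, 9 };
        Console.WriteLine(NumericAlgorithm.Variance(sequence));
    }
}
```

One caveat worth knowing: each pass re-runs a lazily generated sequence from the start, so if producing the elements is expensive, that cost is worth including in whatever performance test you run.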
Also, your algorithms should make the smallest set of assumptions about the parameters they need. By making fewer assumptions, you automatically extend your reach to more possible users. Look at your methods and determine if you need all the capabilities of a parameter type. Whenever you can, provide a less-constraining interface.
Avoiding Problems
I wrote this sample to be representative of common production code I see in libraries. Too often, many of the developers I work with create libraries from the inside out. They have strong knowledge about how they'll implement a particular set of features, and that knowledge deeply colors how they create the functionality. Those assumptions show up quickly in the test code. It's especially evident in the test code that demonstrates the successful scenarios for your library. For example, let's look again at one of the first test samples for the original library:
[TestMethod()]
public void MedianSimpleTest()
{
    NumericAlgorithm target = new NumericAlgorithm();
    List<double> sequence = new List<double>(
        Enumerable.Range(0, 50).Select(n => (double)n));
    double expected = 25;
    double actual = target.Median(sequence);
    Assert.AreEqual(expected, actual);
}
Contrast that with the final version:
[TestMethod()]
public void MedianSimpleTest()
{
    IEnumerable<double> sequence =
        Enumerable.Range(0, 50).Select(n => (double)n);
    double expected = 25;
    double actual = NumericAlgorithm.Median(sequence);
    Assert.AreEqual(expected, actual);
}
There's not a huge difference in size. You may not easily see benefits when you examine a single test method. Instead, seriously examine the clarity of the test code. Is it clear what the test code is trying to do? Is it clear how to use the API?
The second test, which executes the exact same actions, has less code that's unrelated to the problem at hand. It's easier to understand exactly what the test is executing. (I realize that the Enumerable.Range() method may be unfamiliar if you don't use LINQ much, but that's not related to the API.) In fact, using the new version, you could replace the Range() call with an array or any other storage.
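As a sketch of that last point, here is a test that passes a plain array. It exercises Mean rather than Median only because a Mean stand-in is short enough to include in full; that stand-in and the test name MeanFromArrayTest are assumptions for illustration, and the code assumes a project that references MSTest.

```csharp
using System.Collections.Generic;
using System.Linq;
using Microsoft.VisualStudio.TestTools.UnitTesting;

// Assumed stand-in for the library under test; the real implementation may differ.
public static class NumericAlgorithm
{
    public static double Mean(IEnumerable<double> sequence)
    {
        return sequence.Sum() / sequence.Count();
    }
}

[TestClass]
public class NumericAlgorithmTests
{
    // Hypothetical test: the input is an array, with no LINQ in the test body.
    [TestMethod]
    public void MeanFromArrayTest()
    {
        double[] sequence = { 1, 2, 3, 4 };
        double expected = 2.5;
        double actual = NumericAlgorithm.Mean(sequence);
        Assert.AreEqual(expected, actual);
    }
}
```

Nothing in the test body is there for the library's benefit: it declares the data, makes the call and checks the result.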
I look at test code as a way to evaluate the code that client developers will need to write. When possible, I'll write the success scenario tests before I create any of the library code. That forces me to think about the problem and the solution through a client developer's eyes: What code would I want to call in order to solve a given problem? If I start by writing the library code, I'll create a library that looks like how I solved the problem, not a library that looks like how I want to use a solution.
Of course, not every project can be written that way. Too often, we're extending existing systems that don't have a test framework already in use. In that case, you'll end up working through the tasks I outlined earlier in this article: create the tests; look at the code in the tests; and modify the library until you have the API you'd like. At each step, look at your tests -- especially the success tests -- and decide if the API is as convenient as it could be. If not, you should continue to make modifications until you have an API that matches your expectations.
Structuring Your Code
Compare the initial version of the library with the final version (see Go Online for how to access a sample of the final version). Even with only a few methods, the final version is clearer about how the library uses its parameters and how client code should call it.
Think about your own classes and examine if they communicate their intent and their use for other developers. Does their structure communicate your design intent? If not, modify the public API until it matches your assumptions about the usage of your classes.
Your code communicates your design intent to its users. It's important that you take advantage of this opportunity to communicate to the users of your code. Do it well.
About the Author
Bill Wagner, author of Effective C#, has been a commercial software developer for the past 20 years. He is a Microsoft Regional Director and a Visual C# MVP. His interests include the C# language, the .NET Framework and software design. Reach Bill at [email protected].