C# Corner

Design Apps for Future Flexibility

You can't predict change, but you can prepare for it. Learn how to avoid cases where you need to remove work and rework too much of what you've already done.

Technology Toolbox: C#

You can't predict what features you'll need to add in the future. In fact, one of the more important tenets of agile development is to avoid adding features before they're needed. It's more work, and you're probably wrong anyway. And yet, you know changes are coming. New requirements will be added. New features will be needed. New capabilities will become available to exploit. You can't predict change, but you can prepare for it. More importantly, you can avoid cases where you need to remove and rework too much of what's already been done.

Coding for change isn't about implementing features you might need in the future; rather, it's about implementing the features you do need in such a way that the next release is an addition, not a series of breaking changes followed by the requested additions. It's about avoiding practices that will make more work for you in the future.

There are a lot of ways to get this wrong, so I'll explain several mistakes developers commonly make, as well as how to avoid them. Every public and protected member you create is a maintenance burden in future releases.

Once your users are working with something, it's hard to take features away from them. The more you give them, the more you need to support. Every convenience method means more code that you must carry around from this point on. Getting carried away with extra public members can be hard on your users as well. The more you add, the more confusing your public API becomes. These extra convenience methods aren't adding value. They merely force your users to spend more time figuring out which one of the multiple alternatives is right for a particular situation. For example, what would you conclude from these APIs:

public double this[string row, string column];
public double this[int row, int column];
public IEnumerable<double> AllValues();
public IEnumerable<double> RowValues(int row);
public IEnumerable<double> RowValues(string row);
public IEnumerable<double> ColumnValues(int col);
public IEnumerable<double> ColumnValues(string col);

This code includes a lot of redundancy. The class wraps a two-dimensional structure of numbers, which you can access by row and column. The problem is that every API is doubled. You can use either the numeric index, or a string label to represent the rows and columns. But which is better? The extra work does nothing but create confusion for your users (see Listing 1).

From the implementation, you can see that the version using integers would be the better method to use whenever possible. So, why would you give your users the option of choosing the less effective method? All you do is make it easier for them to create poor code.
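Listing 1 isn't reproduced here, so treat the following as a rough sketch of the kind of implementation that paragraph describes, not as the original listing. The class name LabeledGrid, its fields, and its constructor are assumptions made for illustration. The point it tries to show is that a label-based indexer typically has to search for each label before it can do the same lookup the integer indexer does directly.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch: "LabeledGrid" and all of its members are illustrative
// names, not the article's Listing 1.
public class LabeledGrid
{
    private readonly double[,] values;
    private readonly List<string> rowLabels;
    private readonly List<string> columnLabels;

    public LabeledGrid(double[,] values,
                       IEnumerable<string> rowLabels,
                       IEnumerable<string> columnLabels)
    {
        this.values = values;
        this.rowLabels = new List<string>(rowLabels);
        this.columnLabels = new List<string>(columnLabels);
    }

    // Integer access: a single array lookup.
    public double this[int row, int column]
    {
        get { return values[row, column]; }
    }

    // Label access: two linear searches, then the same array lookup.
    public double this[string row, string column]
    {
        get
        {
            int r = rowLabels.IndexOf(row);
            int c = columnLabels.IndexOf(column);
            if (r < 0 || c < 0)
                throw new ArgumentException("Unknown row or column label.");
            return this[r, c];
        }
    }
}
```

Under these assumptions, every call through the string indexer pays for the label searches on top of the array access, which is why offering both overloads invites callers to pick the slower path.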

Also, you've given yourself problems in the future. It will be rather difficult to get the same performance characteristics out of both redundant methods. However, your users will demand that the method they use be the fastest, and you'll be asked to rework your code so that all these extra methods perform similarly. You've given yourself quite a bit more work in a future release.

The more members you add to your public interface, the more likely it becomes polluted with redundant and confusing methods. Instead, consider the best ways to provide the functionality you're creating, and limit your public API to those members.

Of course, you can take this principle too far. If you find that users of your class are duplicating code to use your class in common scenarios, that's your signal that it's time to add those methods to your class.

Be Careful What You Expose
Your private member fields' types are an implementation detail for your class. When you expose these types to client code, you can enable the clients to access those private members directly, without using your carefully crafted API. Once that happens, the client code can modify the state of your objects. You're now stuck with that implementation detail. Changing the internal structure would mean a breaking change. For example, suppose you exposed an array of numbers in release one:

private double[] numbers = { 1, 2, 3, 4, 5 };
public double[] Numbers
{
   get { return numbers; }
}

Later, you find that you need to support adding items to the collection, so you change it to a List<double>:

private List<double> numbers = new List<double>();
public double[] Numbers
{
   get { return numbers.ToArray(); }
}

The new version, in an attempt to maintain compatibility with the previous implementation, uses the List<T>.ToArray() method. But now it returns a copy of the collection, not a reference to the internal storage. Client code that modified the values in the array now modifies the values in the copy. You've just broken your users' existing code. Because your original design exposed a reference to the class's internal data types, you're now forced to live with that internal storage. Any modifications to your algorithms must be constrained to match the original storage types you used and exposed to client code.
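To make the break concrete, here is a small sketch that puts the two releases side by side. The class names SeriesV1 and SeriesV2, the First() helper, and the initial values in the second release are assumptions added so the difference can be observed; they are not part of the original snippets.

```csharp
using System;
using System.Collections.Generic;

// Release one: the property hands out the internal array itself.
public class SeriesV1
{
    private double[] numbers = { 1, 2, 3, 4, 5 };
    public double[] Numbers
    {
        get { return numbers; }
    }
    // Hypothetical helper so the internal state can be inspected.
    public double First() { return numbers[0]; }
}

// Release two: the property hands out a fresh copy on every call.
public class SeriesV2
{
    private List<double> numbers = new List<double> { 1, 2, 3, 4, 5 };
    public double[] Numbers
    {
        get { return numbers.ToArray(); }
    }
    public double First() { return numbers[0]; }
}

public static class BreakingChangeDemo
{
    // The same client statement behaves differently against each release.
    public static void Run()
    {
        var v1 = new SeriesV1();
        v1.Numbers[0] = 42;             // writes into the internal array
        Console.WriteLine(v1.First());  // prints 42

        var v2 = new SeriesV2();
        v2.Numbers[0] = 42;             // writes into a temporary copy
        Console.WriteLine(v2.First());  // prints 1
    }
}
```

The client line compiles unchanged against both releases, so nothing warns the caller that the write is now silently discarded.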

Instead, you should either allow access to a copy of your data, or return access to that data through an interface. In the code snippets that expose an array of numbers, you could choose IEnumerable<double>, ICollection<double>, or IList<double>, depending on what capabilities you want to support. Any of these would support the scenario of changing the internal storage from an array to a List<double>. I'd probably try to limit support to IEnumerable<double> if the internal storage is an array, because ICollection<double> and IList<double> both support an Add method, which will throw an exception when called on an array.
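One way this could look, as a minimal sketch: the class name Series and its Add method are assumptions for illustration. The property is typed as IEnumerable<double>, so the storage behind it can change from an array to a List<double> without touching the public signature.

```csharp
using System.Collections.Generic;

// Hypothetical sketch: "Series" and "Add" are illustrative names.
public class Series
{
    // Release one could have stored a double[] here; switching to
    // List<double> later does not change the public signature below.
    private readonly List<double> numbers = new List<double> { 1, 2, 3, 4, 5 };

    public IEnumerable<double> Numbers
    {
        get
        {
            // Iterator block: callers receive a sequence, not the list
            // itself, so casting the result back to List<double> fails.
            foreach (double n in numbers)
                yield return n;
        }
    }

    public void Add(double value)
    {
        numbers.Add(value);
    }
}
```

The iterator block is a deliberate choice in this sketch. Returning the field directly would also satisfy the IEnumerable<double> signature, but a determined caller could then cast the result back to the concrete collection type and modify it.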

You should look at your types and determine if any of the public members return access to internal storage and would exhibit this behavior. It's never a problem with value types. Those are always copied when returned. In other cases, such as databinding scenarios, you want to return access to the internal storage. However, it pays to look at your external API and see what kind of future costs you might incur with your current API design.

Beware of Side Effects
Finally, we come to side effects. Side effects are internal state changes that happen when a public member is called. They're referred to as side effects because these results aren't obviously part of the method contract. Some developers have a way of discovering and relying on side effects.

For example, suppose you write an object where state changes trigger a database update. Users will learn quickly that performance is better if they let your class handle all the database updates. They'd make changes, then update an object of your type to submit changes to the database. You've created another implied contract in your code. If you remove the database update side effect, you've broken your users' applications. Once you've introduced side effects in the public members, they become part of the implied contract on your API. You can't remove them in future versions.

Another example: Assume one of your public APIs raises an event. In this case, you code defensively and wrap the event handler invocation in a try/catch block to ensure that your code recovers even if the event handler has bugs. That's a more insidious example of a side effect. If you remove the try/catch block, you might introduce a bug that terminates the program. A previously hidden problem that exists in client code now causes the program to terminate.
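A sketch of that defensive pattern might look like the following. The Sensor class, the ReadingChanged event, and the SwallowedExceptions counter are hypothetical names introduced only to make the behavior visible.

```csharp
using System;

// Hypothetical sketch: all names here are illustrative.
public class Sensor
{
    public event EventHandler ReadingChanged;

    // Counts handler failures that were caught and hidden from the caller.
    public int SwallowedExceptions { get; private set; }

    public void Publish()
    {
        EventHandler handlers = ReadingChanged;
        if (handlers == null)
            return;

        try
        {
            handlers(this, EventArgs.Empty);
        }
        catch (Exception)
        {
            // The side effect: a bug in client code never reaches the caller.
            SwallowedExceptions++;
        }
    }
}
```

A client whose handler throws will see Publish() return normally. If a later release drops the try/catch, the same client code starts failing, even though the documented contract of Publish() never mentioned exception handling.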

Side effects can appear in many ways: reloading persistent data, re-initializing some other shared resource, and even logging. Sometimes it's reasonable for a member to do more than its stated purpose. It might be logging, or performing a lazy evaluation calculation, or even caching data to disk. But you should review those members and make sure you're willing to live with the results in the future.

It's also important to remember that you can introduce new side effects in later releases (see Listing 2). The read-only string property in this example is calculated in the constructor. Any calls to retrieve the result are lightning fast.

Let's say you find that users only retrieve the result roughly half the time they create a new HelperClass. That means they're paying for quite a lot of unnecessary calculations. So you decide to change to a lazy evaluation algorithm (see Listing 3). Now, the first call to the get accessor pays the cost of the calculation. That's a performance change, so it's not likely to get users too upset. However, there is another problem. This new version doesn't handle multiple threads properly. In a multi-threaded program, your users could easily spend twice as much time computing the result. One thread hits the accessor and starts the computation. While that computation is happening, a context switch happens. The second thread now tries to access the result. It will see the result as null and start another long computation. As long as the computation always returns the same value, this will be correct, but slower. On the other hand, if the computation relies on some outside variables, it's entirely likely that now you've introduced a bug. When you add side effects in later releases, make sure the externally observed behavior hasn't changed.
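One possible way to keep the lazy evaluation without the duplicated work is sketched below. It assumes .NET Framework 4 or later, where System.Lazy<T> is available, and it is not the article's Listing 3. The Result property, the ComputeResult method, the ComputeCount counter, and the Thread.Sleep placeholder are all hypothetical stand-ins.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for the article's HelperClass; member names are assumptions.
public class HelperClass
{
    private readonly Lazy<string> result;
    private int computeCount;

    public HelperClass()
    {
        // ExecutionAndPublication: one thread runs the factory while any
        // other thread asking for the value waits for that single result.
        result = new Lazy<string>(ComputeResult,
                                  LazyThreadSafetyMode.ExecutionAndPublication);
    }

    public string Result
    {
        get { return result.Value; }
    }

    // Exposed only so the sketch can show how often the work ran.
    public int ComputeCount
    {
        get { return computeCount; }
    }

    private string ComputeResult()
    {
        Interlocked.Increment(ref computeCount);
        Thread.Sleep(50); // placeholder for the expensive calculation
        return "computed";
    }
}
```

With this arrangement, two threads that hit the accessor at the same time should both observe the same value, and the calculation should run once. Callers that never read Result still pay nothing, which was the goal of the change.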

Programming for the future is not so much about writing code that anticipates every need. Rather, it's about writing code that maximizes the ways you can update and extend your types in the future. Make explicit choices about what is exposed through your public API, and ensure that client code can't do more than you expect.

About the Author

Bill Wagner, author of Effective C#, has been a commercial software developer for the past 20 years. He is a Microsoft Regional Director and a Visual C# MVP. His interests include the C# language, the .NET Framework and software design. Reach Bill at [email protected].
