7 Tips for Software Versioning
It'd be ideal to have versioning support in the core language specs for .NET and Java. But in the absence of such features, you can still do things today to build applications that show resilience in the face of change.
by Alex Krapf
June 1, 2006
As developers, we tend to think of a software version as something represented by some metadata maintained in our version control system. This (mis)conception is responsible for a lot of grief for our users. Consider these questions: How does your version control system help you...
- Run several versions of your application concurrently, maybe even in one process?
- Ensure that changes you make to a published framework don't break the framework's users?
- Read/write data that was created by older versions of your software?
- Determine the version and version compatibility of an object or service at run time?
Remember how only a few years back, we used to think that thread-safe programming was something that only a select few had to worry about because most of us would never run into it? Well, today hyper-threaded multicore machines are appearing on every desktop, modern runtime environments are inherently multithreaded, and you'd better be familiar with writing and debugging multithreaded applications. I would say that today we are at a similar turning point with respect to "version-safe" software.
Many best practices in object-oriented programming are at odds with version-safe programming. Take the use of final (or "sealed," in .NET parlance) types, for example: A common recommendation is that any type you don't intend to be subclassed should be marked as final. What does it mean to have a final type when its implementation changes over time? That does not make the type look very "final," and a different design allowing multiple versions of the type to coexist peacefully might be more appropriate when taking versioning considerations into account.
My dream solution would put versioning support into the core language specification, be it Java or .NET, thereby allowing such things as "versioned" final types. But in the absence of such revolutionary updates, you can still do things today to build applications that show resilience in the face of change, or, as I like to say, "build applications that scale over time." Most programming languages have some specific features on which you would build your versioning support. Java, for example, has classloaders; .NET has application domains and codebases; and C++ has typedefs, templates, and dynamically loaded modules.
Due to the differences between these technologies, each language has its own versioning techniques and problems. You can find some good publications and presentations on this topic (see Resources), but they often include contradictory recommendations. Here are seven observations and recommendations, categorized as "non-controversial" and "controversial."
Non-Controversial Tip #1: Avoid using extensible implementation types as return types
Any method that returns an object should return it through an interface or at least an abstract type. The use of interfaces is actually controversial; we'll discuss why in the next section.
What does it mean if you return an extensible, concrete type from a method? You're tying yourself to the implementation of that type without guaranteeing that it won't change—a bad idea under any circumstances.
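As a minimal sketch of this tip, consider a class that stores its data in a concrete collection but exposes it only through an interface. OrderBook and its methods are hypothetical names invented for this illustration; they do not come from any real library.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class for illustration only.
class OrderBook {
    // The concrete collection type stays private and can change freely.
    private final List<String> orders = new ArrayList<String>();

    void add(String order) {
        orders.add(order);
    }

    // Callers see only the List interface, so a later version could
    // switch to a different List implementation without breaking them.
    List<String> getOrders() {
        return Collections.unmodifiableList(orders);
    }

    public static void main(String[] args) {
        OrderBook book = new OrderBook();
        book.add("buy 10");
        book.add("sell 5");
        System.out.println(book.getOrders());
    }
}
```

Had getOrders() been declared to return ArrayList<String>, every caller would be compiled against that implementation type, and replacing it later would be a breaking change.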
Constructors represent a special case of this rule. Some people think the flexibility benefits of using object factories are outweighed by the reduced readability and increased verbosity of the code you have to write. We will also revisit this issue in the next section.
Non-Controversial Tip #2: Include programmatically queryable version info
Whether you use .NET attributes, go Java 5 all the way and use a custom annotation, or simply use a field or method doesn't matter much, but it's always a good idea to be able to "reflect" on the version of a type. This allows you to make runtime decisions based on class version, just like you can make decisions based on class type.
You might need to work a little harder to allow your versioned types to coexist in one process, but it can be done.
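One way to make version information queryable in Java 5 and later is a custom annotation with runtime retention. The sketch below assumes a hypothetical @Version annotation and a hypothetical CustomerRecord type; neither is part of the JDK.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class VersionQuery {
    // Returns the declared version, or -1 if the type carries no @Version.
    static int versionOf(Class<?> type) {
        Version v = type.getAnnotation(Version.class);
        return v == null ? -1 : v.value();
    }

    public static void main(String[] args) {
        System.out.println(versionOf(CustomerRecord.class)); // prints 2
        System.out.println(versionOf(String.class));         // prints -1
    }
}

// Hypothetical annotation that carries a type's version number.
@Retention(RetentionPolicy.RUNTIME) // keep the annotation visible to reflection
@Target(ElementType.TYPE)
@interface Version {
    int value();
}

// Hypothetical versioned type.
@Version(2)
class CustomerRecord {
}
```

Because the version is read from the Class object rather than from a compile-time constant, the same query works for a class that was loaded by a different classloader.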
Non-Controversial Tip #3: Avoid public fields
Public fields that are not constants are generally a bad idea, particularly if they are declared by public classes (thereby making their usage totally uncontrollable). The only place where a public field can be used without any damage is inside a totally local scope, such as a record-keeping inner class that is not exposed beyond its outer class. Other than such carefully localized circumstances, you buy yourself a lot of trouble for a little convenience: Once the field is introduced, you can never get rid of it, and you must assume that any change to the field's value breaks an existing application.
I would extend this rule to include static constant fields because static values are hard to version inside a process. Consider Non-Controversial Tip #2 for a moment. Your natural inclination might be to have a constant like this:
public static final int VERSION = 5;
The problem is that you cannot have two different versions of one type in one classloader and query them for their version numbers. A much more flexible design would be to have a type factory that returns (versioned) singletons that have instance methods or fields representing the meta-information.
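A rough sketch of that alternative follows. TypeInfo and TypeInfoFactory are hypothetical names, and a real design would probably carry more metadata than a single number; the point is only that the version is read through an instance method rather than a static constant.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical factory that hands out one shared TypeInfo per version.
class TypeInfoFactory {
    private final Map<Integer, TypeInfo> cache = new HashMap<Integer, TypeInfo>();

    synchronized TypeInfo forVersion(int version) {
        TypeInfo info = cache.get(version);
        if (info == null) {
            info = new TypeInfo(version);
            cache.put(version, info);
        }
        return info;
    }

    public static void main(String[] args) {
        TypeInfoFactory factory = new TypeInfoFactory();
        TypeInfo v1 = factory.forVersion(1);
        TypeInfo v2 = factory.forVersion(2);
        // Both versions can be queried side by side in one process.
        System.out.println(v1.getVersion() + " and " + v2.getVersion());
    }
}

// Hypothetical metadata object; the version is instance state.
final class TypeInfo {
    private final int version;

    TypeInfo(int version) {
        this.version = version;
    }

    int getVersion() {
        return version;
    }
}
```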
Non-Controversial Tip #4: Understand the difference between APIs and SPIs
Both APIs and SPIs are APIs in the colloquial definition of the term. SPI, a common acronym in Java-based designs that stands for "Service Provider Interface," is simply an API through which an insulated implementation is plugged in behind a standardized public API. The end-user developer reaches the nonpublic, relatively mutable SPI only through that standardized, relatively stable API.
SPIs also need to be versioned, but they typically have a much smaller user audience and are used only through APIs that act as factories for the SPIs, so they already have a version-aware programming interface.
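The split might look roughly like the following sketch. All three type names are invented for this illustration: Compressor plays the role of the stable API and acts as the factory for the provider, while CompressionProvider is the SPI that implementers code against.

```java
// API: the stable facade that end-user code calls.
final class Compressor {
    private final CompressionProvider provider;

    private Compressor(CompressionProvider provider) {
        this.provider = provider;
    }

    // The API picks the provider, so callers never name an SPI type.
    static Compressor getInstance() {
        return new Compressor(new IdentityProvider());
    }

    byte[] compress(byte[] input) {
        return provider.compress(input);
    }

    public static void main(String[] args) {
        byte[] out = Compressor.getInstance().compress(new byte[] { 1, 2, 3 });
        System.out.println(out.length);
    }
}

// SPI: implemented by providers; it can change more often than the API.
interface CompressionProvider {
    byte[] compress(byte[] input);
}

// Trivial placeholder provider that returns a copy of its input.
class IdentityProvider implements CompressionProvider {
    public byte[] compress(byte[] input) {
        return input.clone();
    }
}
```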
Controversial Tip #1: Use interfaces when possible
Many people don't regard the use of interfaces as a controversial recommendation, but others strongly favor abstract classes over interfaces. There are two primary issues with interfaces. First, interfaces cannot have constructors or static methods. In my personal judgment, this is not an issue because I would tend to favor factory methods over constructors anyway, and I dislike statics for a variety of reasons (see Non-Controversial Tip #3).
Second, interfaces cannot evolve. This is a more serious issue: Once an interface has been designed, it should never be modified again. Everyone understands that no method should be removed, but it is equally important that no method be added because that would break existing implementations of the interface.
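To make the constraint concrete, here is a small sketch with hypothetical Shape types. Adding perimeter() directly to Shape would stop LegacySquare from compiling. One widely used workaround, which is my assumption here rather than something this article prescribes, is to publish a second interface that extends the first and to test for it at run time.

```java
class ShapeReport {
    // Uses the newer contract when available, the old one otherwise.
    static String describe(Shape s) {
        if (s instanceof Shape2) {
            return "area=" + s.area() + " perimeter=" + ((Shape2) s).perimeter();
        }
        return "area=" + s.area();
    }

    public static void main(String[] args) {
        System.out.println(describe(new LegacySquare(2)));
        System.out.println(describe(new Square2(2)));
    }
}

// Version 1 of a published interface; frozen once released.
interface Shape {
    double area();
}

// New capability goes into a new interface instead of into Shape.
interface Shape2 extends Shape {
    double perimeter();
}

// Written against version 1; keeps compiling unchanged.
class LegacySquare implements Shape {
    private final double side;

    LegacySquare(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }
}

// Written against version 2.
class Square2 implements Shape2 {
    private final double side;

    Square2(double side) {
        this.side = side;
    }

    public double area() {
        return side * side;
    }

    public double perimeter() {
        return 4 * side;
    }
}
```

This keeps old implementations working, but it only accommodates additions; it does nothing for changes that alter the meaning of existing methods.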
The whole point of using interfaces—in the versioning context—is to allow different implementations to coexist while still maintaining a high level of abstraction and treating different implementations (versions) similarly in accordance with the contract embodied by the interface. The problem is that if the interface cannot be changed, an interface-based design can only accommodate "small" implementation changes, but not big changes to the calling contract. What to do?
In C++, you can use typedefs to help you out. Rather than coding directly to the type in question, you go through a layer of type indirection. You can thereby avoid coding directly to an interface type but rather code to a moniker for the type, which under certain circumstances could represent different types. This pattern is fairly prevalent in the Standard Template Library (STL), where instead of directly referencing a type such as forward_iterator<MyCollection<int>>, you reference the type through a nested, templatized type definition such as MyCollection::iterator.
The benefits are clear: You don't need to know the exact type of the iterator because the usage itself defines the type. Such a pattern helps mitigate the problem of dealing with mutating types on the user side, but it doesn't help in Java where you have neither templates nor typedefs, and you cannot have two different versions of one type loaded into the JVM unless they're in different classloaders.
It is fairly clear that the current version of Java forces you into some serious classloader magic if you want to do things such as having multiple versions of one interface in your process. A custom classloader together with version-mangled jarfiles can get the job done, but implementing that is pretty complex.
Controversial Tip #2: Use factory methods rather than constructors
The more software I write and maintain, the more I come to the conclusion that constructors are a bad idea in connection with software versioning. They certainly add convenience, but they also tie type knowledge, object creation, and object initialization together inseparably. Often in my Java applications, I wish for something like a polymorphic constructor (which could create an instance of a subclass based on some global state), or a placement new or a "pooled instance" constructor. All these wishes are trivial to implement using factory methods. Plus, factory methods support versioning well.
I don't know why factory methods are controversial, but the most common argument against them is their increased verbosity. Compare the two cases:
(1) Integer i = new Integer( 4 );
(2) Integer i = IntegerFactory.create( 4 );
The second case is indeed a little harder to write, but not that much harder. And just think about the benefits:
- You don't sprinkle instantiations of a certain type all over your code. This means you have a centralized place where you could change the concrete type that is being returned. (Integer is not an interface and it is final, so the point is relatively moot there. But for types other than Integer, you could still use a custom classloader to return a version-specific implementation type from the factory method.)
- You can implement optimizations in the factory method (pooled instances, and so on).
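Both benefits can be sketched in a few lines. Symbol, SimpleSymbol, and SymbolFactory are hypothetical names used only for this illustration. The factory is the single place that names the concrete type, and it pools instances so that repeated requests for the same name return the same object.

```java
import java.util.HashMap;
import java.util.Map;

class SymbolFactory {
    private static final Map<String, Symbol> POOL = new HashMap<String, Symbol>();

    // Returns a pooled instance, creating it on first use.
    static synchronized Symbol create(String name) {
        Symbol s = POOL.get(name);
        if (s == null) {
            s = new SimpleSymbol(name); // only this line names the concrete type
            POOL.put(name, s);
        }
        return s;
    }

    public static void main(String[] args) {
        Symbol a = SymbolFactory.create("total");
        Symbol b = SymbolFactory.create("total");
        System.out.println(a == b); // prints true: the instance is pooled
    }
}

// Callers depend only on this interface.
interface Symbol {
    String name();
}

// Concrete type hidden behind the factory; it can be replaced later.
class SimpleSymbol implements Symbol {
    private final String name;

    SimpleSymbol(String name) {
        this.name = name;
    }

    public String name() {
        return name;
    }
}
```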
Controversial Tip #3: Use type factories
I like this pattern a lot, but I categorize it as "controversial" because it seems like a lot of overhead. I guess it falls into the category of: "You don't need it unless you really need it." The type factory is the class that has built-in smarts about versions, code bases, classloaders, and so on. The pattern is simple. You start out with an application-specific type factory such as:
class TypeFactory
{
    // Returns the factory for the current version of MyType.
    public MyTypeFactory getMyTypeFactory()
    {
        return getMyTypeFactory( getCurrentVersion() );
    }

    // Returns the factory for a specific version of MyType.
    public MyTypeFactory getMyTypeFactory( int version )
    {
        if( version == 1 || version == 2 )
            return new MyTypeFactory1( version );
        else
            throw new VersionNotSupportedException( version );
    }
    // ...
}
The concrete type factory simply provides overloaded factory methods (and possibly some additional support methods):
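// Assumes MyTypeFactory is an interface that declares the create() overloads.
class MyTypeFactory1 implements MyTypeFactory
{
    private final int version;

    public MyTypeFactory1( int version )
    {
        this.version = version;
    }

    // Creates a default instance of the requested version of MyType.
    public MyType create()
    {
        if( version == 1 )
            return new MyType1();
        else
            return new MyType2();
    }

    // Creates an instance of the requested version, initialized with i.
    public MyType create( int i )
    {
        if( version == 1 )
            return new MyType1( i );
        else
            return new MyType2( i );
    }
    // ...
}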
Your TypeFactory has one factory method for every versioned type in your system. Rather than returning instances of the versioned type, these methods return instances of factories for the type. This allows you to have many overloaded create() methods for each type without crowding the TypeFactory itself. In this simple example, the type factory method returns a newly created factory instance on every call. In a real, heavy-duty implementation, it could, for example, create a separate classloader for an older version and instantiate the factory in that classloader.
Clearly, you don't want to use this pattern for all types in your system, but it can make your life easier if you use it for select service types that are modified frequently and that might need to be maintained concurrently.
Versioning remains a challenge. We will make real progress in this field only if versioning support becomes a first-class feature of the languages we're using. Until then, we have to use best practices, patterns, and a lot of experience.
About the Author
Alexander Krapf is president and cofounder of CodeMesh Inc. He has more than 15 years of experience in software engineering, product development, and project management in the United States and Europe. Krapf has also worked for IBM, Thomson Financial Services, Hitachi, Veeder-Root, and Document Directions Inc.