C# Corner
Initialize Objects Properly
Developers are accustomed to thinking of an object as either existing or not, but the truth is the initialization process is complex enough that this isn't always so.
Learn how to initialize objects properly and avoid small missteps that can lead to big problems when creating and initializing objects.
Technology Toolbox: C#
You'd like to think that an object either exists or doesn't exist. But it's not that simple. Many actions happen during the creation of an object: the runtime initializes the fields, initializes the base class portions of the object, and then runs the constructor for the object itself. And, if this happens to be the first instance of this type (or any of its base classes), similar work is done first to initialize all the class (static) variables. Sure, you probably know all this stuff happens when you new up an object. But what's more important for a developer is that each of these actions takes time. You don't have a fully initialized object until all these actions have completed. Any code that runs during that process is acting on a partially valid object. Some fields might not be set yet, and some methods might not work. The object is in the process of being born.
The key to being able to write code that will be resilient during initialization is to understand the order that the C# compiler uses to initialize objects.
Let's start with the most basic operation: initializing an object (Listing 1). Take a look at the code, then try to guess the order of the initialization. Here's the output:
statically initialized was created
statically2 initialized was created
Static Ctor initialized was created
field Initializer was created
2nd field Initializer was created
Ctor Initialized was created
You can see several rules at play here: First, static field initializers execute first (for the first instance only). Second, the static constructor executes (for the first instance only). Third, instance initializers execute. Finally, the constructor executes. These rules cover both the static and instance rules for a single type. Note that there are no base classes in play here.
You can divine a couple of other important facts looking at this output. For example, the fields are initialized in the order they're declared, both for static fields and instance fields. I don't like to write code that relies explicitly on this fact, but it's documented in the ECMA standard. Essentially, the C# language needs to define an order, and initializing fields in the order they're declared is more reasonable than any other approach. The reason I don't like to count on that order is that it's brittle. If someone re-orders your code, the order in which your variables initialize changes.
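Listing 1 is not reproduced in this text, so what follows is only a sketch of the kind of code that would produce the output above. The Marker class name comes from the article, but its constructor signature, the TrackedType class, and all the field names are assumptions made for illustration.

```csharp
using System;

// Hypothetical stand-in for the article's Marker class:
// it announces its own creation so the order becomes visible.
public class Marker
{
    public Marker(string name)
    {
        Console.WriteLine(name + " was created");
    }
}

// Hypothetical type with static and instance fields, set both
// by initializers and inside the constructors.
public class TrackedType
{
    // Static field initializers: run once, in declaration order,
    // immediately before the static constructor.
    private static Marker static1 = new Marker("statically initialized");
    private static Marker static2 = new Marker("statically2 initialized");
    private static Marker static3;

    // Instance field initializers: run for every new object,
    // in declaration order, before the constructor body.
    private Marker field1 = new Marker("field Initializer");
    private Marker field2 = new Marker("2nd field Initializer");
    private Marker field3;

    static TrackedType()
    {
        static3 = new Marker("Static Ctor initialized");
    }

    public TrackedType()
    {
        field3 = new Marker("Ctor Initialized");
    }
}

public static class Program
{
    public static void Main()
    {
        new TrackedType();
    }
}
```

Creating a second TrackedType in Main would print only the last three lines again, because the static steps run once per type. Swapping the declarations of field1 and field2 would swap their lines in the output, which is the brittleness described above.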
This is also a good time to mention one other important point on initializers versus constructors. Initializers are great for simple types, such as the Marker class, but they have a critical limitation: They give you no way to catch exceptions thrown while a field is being initialized. For instance fields, that is often tolerable: The code that newed up the object can catch the exception, although no valid object is created. For static fields, the effects are far worse. The common language runtime (CLR) runs static field initializers as part of the type's static constructor, and an exception that escapes a static constructor is wrapped in a TypeInitializationException. The type stays unusable for the rest of the application domain's life, because every later attempt to use it fails the same way. If nothing catches that exception, the CLR terminates your program.
This behavior happens any time an exception is thrown out of a static constructor, so you can create the same problem if code inside the body of your static constructor throws an exception. The difference is that you can catch exceptions that are generated inside the body of the constructor. You can't put a catch block around a static initializer.
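A minimal sketch of how such a failure surfaces in practice, assuming a hypothetical FragileSettings class whose static initializer always throws. The empty static constructor is there only to pin type initialization to the first access, so the exception is raised inside the try block.

```csharp
using System;

// Hypothetical type: the initializer below always fails, and there is
// nowhere to put a catch block around it.
public static class FragileSettings
{
    public static readonly int Timeout = int.Parse("not a number");

    // With an explicit static constructor, initialization runs exactly
    // at the first access to the type.
    static FragileSettings()
    {
    }
}

public static class Program
{
    public static void Main()
    {
        for (int attempt = 1; attempt <= 2; attempt++)
        {
            try
            {
                Console.WriteLine(FragileSettings.Timeout);
            }
            catch (TypeInitializationException e)
            {
                // The original FormatException is kept as the inner exception.
                Console.WriteLine("Attempt " + attempt + ": " + e.InnerException.GetType().Name);
            }
        }
    }
}
```

Both attempts print FormatException as the inner exception type. The initializer is not retried on the second access; the type simply stays broken.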
Base Classes and Derived Classes
Inheritance hierarchies throw quite a few changes at you in terms of the initialization order. For example, consider this test code that displays the order of initialization for a derived class, including its base class (see Listing 2). The output illustrates some interesting differences between the point when field initializers execute and the point when the body of a constructor executes:
derived static - initializer was created
derived static - Ctor was created
derived instance - Initializer was created
base static - initializer1 was created
base static - initializer2 was created
base static - ctor was created
base instance - initializer was created
base instance - initializer2 was created
base instance - ctor was created
derived instance - Ctor was created
The output can take a little while to decipher until you understand the rules. The new rules are simple extensions of the rules you saw for a type without examining its base class.
Remember that the static initialization happens before any code in that class can execute. Next, remember that the static field initializers are executed before any code in the body of the type constructor. If you ignore the instance variables, you can now understand the static variable initialization correctly. First, the static field initializers in the derived class are executed, because this whole sequence is initiated by creating an object of the derived type. Next, the CLR calls the derived class's static constructor. It runs, and life is good.
At this point, the runtime can begin creating the first object of the derived type. The first step is to execute all the field initializers in the derived class. Note that the derived class instance field initializers run before the base class static fields are initialized. That's correct, because no code in the base class has been needed yet.
Once all the instance field initializers in the derived class have executed, the base class constructor is called. This is the first time the base class is accessed, so it triggers the base class type initialization process: The base class static field initializers execute, and then the base class type constructor executes. Once the static construction phase has completed, the base class object initialization begins. The base class's instance field initializers execute, followed by the body of the base class instance constructor. When the base class instance constructor returns, the body of the derived class's instance constructor executes.
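Listing 2 is likewise not reproduced here. The sketch below shows one way to get the sequence just described; the class names BaseTrackType and DerivedTrackType appear in the article's later output, while the Marker constructor and every field name are assumptions.

```csharp
using System;

// Hypothetical stand-in for the article's Marker class.
public class Marker
{
    public Marker(string name)
    {
        Console.WriteLine(name + " was created");
    }
}

public class BaseTrackType
{
    private static Marker baseStatic1 = new Marker("base static - initializer1");
    private static Marker baseStatic2 = new Marker("base static - initializer2");
    private static Marker baseStatic3;

    private Marker baseField1 = new Marker("base instance - initializer");
    private Marker baseField2 = new Marker("base instance - initializer2");
    private Marker baseField3;

    static BaseTrackType()
    {
        baseStatic3 = new Marker("base static - ctor");
    }

    public BaseTrackType()
    {
        baseField3 = new Marker("base instance - ctor");
    }
}

public class DerivedTrackType : BaseTrackType
{
    private static Marker derivedStatic1 = new Marker("derived static - initializer");
    private static Marker derivedStatic2;

    private Marker derivedField1 = new Marker("derived instance - Initializer");
    private Marker derivedField2;

    static DerivedTrackType()
    {
        derivedStatic2 = new Marker("derived static - Ctor");
    }

    // The implicit call to the base constructor happens after the derived
    // field initializers and before this body.
    public DerivedTrackType()
    {
        derivedField2 = new Marker("derived instance - Ctor");
    }
}

public static class Program
{
    public static void Main()
    {
        new DerivedTrackType();
    }
}
```

Nothing in DerivedTrackType's static members touches BaseTrackType, which is why the base class statics appear so late: The first use of the base class is the constructor call that follows the derived instance field initializers.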
At this point, you might wonder if there's an insidious bug lurking in the initialization sequence as I've outlined it above. You can see that the type initializers and the type constructors for the derived class both execute before the type initializers and the type constructors for the base class. Couldn't that introduce problems if the derived class's type constructor tries to access a static method or static field in the base class? Wouldn't that throw a null reference exception?
Well, the CLR is a little smarter than that. The sample in Listing 2 executes its initialization in the order it does because the derived class's static constructor doesn't access any resources in the base class. If it did, the CLR would run the base class's type initializer before the derived class's static constructor touched any resource in the base class.
The rules for static initialization are simple: A class's type constructor executes before any of the members in that type are accessed. It might execute immediately before that first access, but it still happens.
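To make this rule concrete, here is a small sketch in which a derived class's static constructor reads a static field of its base class. BaseType, DerivedType, and their members are hypothetical names, not part of the article's listings.

```csharp
using System;

public class BaseType
{
    // Runs as part of BaseType's type initialization.
    public static readonly string Name = Announce("base static initializer");

    static BaseType()
    {
        Console.WriteLine("base static ctor");
    }

    private static string Announce(string what)
    {
        Console.WriteLine(what);
        return what;
    }
}

public class DerivedType : BaseType
{
    public static readonly string Label;

    static DerivedType()
    {
        Console.WriteLine("derived static ctor starts");
        // First access to BaseType: its initializers and static
        // constructor run here, before the field is read.
        Label = "derived of " + BaseType.Name;
        Console.WriteLine("derived static ctor ends");
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(DerivedType.Label);
    }
}
```

The derived static constructor starts first, but the base class's type initialization runs in the middle of it, at the moment BaseType.Name is first read. Label therefore ends up as "derived of base static initializer" and never sees a null.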
Equally simple are the rules for field initializers and constructors for instances of a type. The field initializers are executed from the most derived class first to the ultimate base class last. The bodies of the constructors execute in the opposite order: from the ultimate base class to the most derived class.
Beware of Virtual Functions
The rules for initialization order have a simple goal: The designers of the C# language wanted to ensure that all member fields (both static and instance) are initialized before any code (other than a constructor) executes. The compiler enforces this in most cases. You can't call your class's instance methods in field initializers (even if those methods are declared in the base class). Furthermore, by the time the body of your derived class constructor executes, the base class constructor has finished executing. Calling methods in the base class is legal at that point, because the base class portion of the object has been initialized completely.
However, there is one well-defined pitfall in this strategy: virtual functions. You should avoid calling virtual functions from inside constructors. Any virtual function call calls the method defined in the most derived class. I've modified Listing 2 to illustrate most of the possible pitfalls (see Listing 3). Executing the code in Listing 3 produces this output:
derived static - initializer was created
derived static - Ctor was created
derived instance - Initializer was created
base static - initializer1 was created
base static - initializer2 was created
base static - ctor was created
base instance - initializer was created
base instance - initializer2 was created
base instance - ctor was created
Contructing an object: A DerivedTrackType object
derivedInit has been created
derivedCtor is null
A BaseTracktype object
derived instance - Ctor was created
Calling ToString() from the base class's constructor invokes the version defined in the derived class. This means a virtual method in the derived class can be called before the derived class's constructor executes.
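Listing 3 is not reproduced here either, so the following is a reduced sketch of the same pitfall, not the article's code. The class names match those in the output above; the fields derivedInit and derivedCtor echo names from that output, but their types and the message format are assumptions.

```csharp
using System;

public class BaseTrackType
{
    public BaseTrackType()
    {
        // Virtual call: dispatches to DerivedTrackType.ToString() even though
        // the body of the DerivedTrackType constructor has not run yet.
        Console.WriteLine("Constructing an object: " + ToString());
    }

    public override string ToString()
    {
        return "A BaseTrackType object";
    }
}

public class DerivedTrackType : BaseTrackType
{
    // Set by a field initializer, so it is ready before the base constructor runs.
    private string derivedInit = "created";

    // Set in the constructor body, so it is still null during the base constructor.
    private string derivedCtor;

    public DerivedTrackType()
    {
        derivedCtor = "created";
    }

    public override string ToString()
    {
        return "A DerivedTrackType object"
            + ", derivedInit is " + (derivedInit ?? "null")
            + ", derivedCtor is " + (derivedCtor ?? "null");
    }
}

public static class Program
{
    public static void Main()
    {
        DerivedTrackType d = new DerivedTrackType();
        Console.WriteLine("After construction: " + d);
    }
}
```

The first line printed reports derivedCtor as null, and the second reports it as created. The override ran against an object whose constructor body had not yet executed.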
This is one of the few times where I'd recommend against bulletproofing your software and recommend instead that you avoid messing up code in other classes. Don't ever call virtual functions in your constructors. To do so is to subvert one of the most fundamental rules of object-oriented development: The constructor executes before any other method can be called. It's not reasonable to expect developers to check such basic assumptions before executing any virtual functions. Instead, don't break those assumptions.
The handful of examples I've covered illustrate how easy it is to take a serious misstep while initializing objects. I've covered the rules that the C# compiler and the CLR follow when initializing objects, but you can follow a few rules of your own to ensure you don't foul things up in the way you create and initialize your objects.
The simplest rule is to initialize all your variables (both static and instance variables) using field initializers. Whenever possible, this is the best practice. Of course, it's not always possible.
Static initializers should be used only if there's no possibility that the initializer will throw an exception. If there's any chance that the right-hand side of an initializer will throw an exception, you should move it into the body of the static constructor and protect it with the appropriate try / catch block, then try to recover as best you can.
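As a sketch of that advice, assume a hypothetical SafeSettings class whose LoadTimeout helper stands in for any initialization that might fail. The fallback value of 30 is an arbitrary choice for illustration.

```csharp
using System;

public static class SafeSettings
{
    // No initializer here: the assignment happens in the static
    // constructor, where a catch block is possible.
    public static readonly int Timeout;

    static SafeSettings()
    {
        try
        {
            Timeout = LoadTimeout();
        }
        catch (FormatException)
        {
            // Recover with a default so the type remains usable.
            Timeout = 30;
        }
    }

    // Hypothetical loader that fails, standing in for a bad configuration value.
    private static int LoadTimeout()
    {
        return int.Parse("not a number");
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(SafeSettings.Timeout);
    }
}
```

Because the exception never leaves the static constructor, the type initializes successfully and the program prints the fallback value. Whether a default is an acceptable recovery depends on the field; the point is only that the static constructor gives you a place to decide.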
It's also important to remember that you can't always use field initializers for instance variables. Readonly fields are often assigned to parameters given to the constructor. Other instance variables can be set based on constructor parameters. That's perfectly reasonable, but when fields don't rely on constructor parameters, you should initialize the fields using the field initializer syntax instead.
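The split might look like the following sketch, in which Order, items, and customer are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;

public class Order
{
    // Does not depend on constructor arguments: use a field initializer,
    // so every constructor gets it without repeating the code.
    private readonly List<string> items = new List<string>();

    // Depends on a constructor argument: assign it in the constructor.
    private readonly string customer;

    public Order(string customer)
    {
        if (customer == null)
        {
            throw new ArgumentNullException("customer");
        }
        this.customer = customer;
    }

    public void Add(string item)
    {
        items.Add(item);
    }

    public override string ToString()
    {
        return customer + ": " + items.Count + " item(s)";
    }
}

public static class Program
{
    public static void Main()
    {
        Order order = new Order("Ada");
        order.Add("book");
        Console.WriteLine(order);
    }
}
```

If Order later gains a second constructor, the items list is still initialized without any extra code, while each constructor remains responsible for the fields that depend on its own parameters.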
We like to believe that objects either exist or don't exist, but that isn't always the case. Object initialization can involve quite a few methods. Until all those methods have been executed, you can't assume that an object has been completely initialized. You need to write constructors carefully, and be especially careful that you avoid calling virtual methods in your constructors.
About the Author
Bill Wagner, author of Effective C#, has been a commercial software developer for the past 20 years. He is a Microsoft Regional Director and a Visual C# MVP. His interests include the C# language, the .NET Framework and software design.