New Age C++
C++ Templates: Not a General Case of Generics
Diego Dagum provides an overview of C++ templates and how they differ from generics in C# and Java.
C++ templates are all about generic programming, but not the type of generics you may be familiar with from C# and Java.
This is the first in a series on C++ generic programming. I'll start with an overview of C++ templates and how they relate to equivalent concepts in managed code. In subsequent installments, I'll focus on the templates' unique aspects -- those with no counterpart in Java and .NET. Many of those topics, like Template Meta-programming, have been the subject of entire books. They're harder to assimilate; that's a possible reason they're not found in Java and .NET.
Templates have been part of C++ since its early days, though not from the very first release. The first C++ compilers translated the source into plain old C before compiling it, and generic logic was expanded in that same step. Today, C++ compilers skip C on their way to object code, but templates are still resolved at compile time, before any object code is produced.
In its basic form, a template class is a class that depends on one or more template parameters (T, S, etc.):
template<typename char_t>
class basic_string {
public:
    basic_string();
    basic_string(const basic_string&);
    basic_string& append(const basic_string&);
    ... /* other template class members */
private:
    char_t* buffer_;
    ...
};
Before declaring variables based on a template, however, I must substitute its template parameters (like char_t) with concrete types. This is known as template instantiation:
basic_string<char> str1, str2; // strings composed of ASCII chars.
basic_string<wchar_t> unicode_str; // string composed of Unicode chars.
It sounds like managed code, right? But it's not.
A template class by itself is never compiled into object code. It's held as a stencil, like a sophisticated #define macro, to be filled in with concrete types at instantiation:
// a possible compiler expansion produced by basic_string<wchar_t>
class basic_string_wchar_t {
public:
    basic_string_wchar_t();
    basic_string_wchar_t(const basic_string_wchar_t&);
    basic_string_wchar_t& append(const basic_string_wchar_t&);
    ... /* other template class members */
private:
    wchar_t* buffer_;
    ...
};
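To make the stencil idea concrete, here's a minimal sketch. It assumes a hypothetical tiny_string template (not the real basic_string) with a fixed-size buffer, and checks at compile time that two instantiations are distinct, unrelated classes:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical, stripped-down stand-in for the basic_string template above.
template<typename char_t>
class tiny_string {
public:
    tiny_string() : length_(0) { buffer_[0] = char_t(); }

    // Appends one character if there is room; returns *this for chaining.
    tiny_string& append(char_t c) {
        if (length_ + 1 < capacity) {
            buffer_[length_++] = c;
            buffer_[length_] = char_t();  // keep the buffer terminated
        }
        return *this;
    }

    std::size_t length() const { return length_; }
    const char_t* data() const { return buffer_; }

private:
    static const std::size_t capacity = 16;  // arbitrary size for the sketch
    char_t buffer_[capacity];
    std::size_t length_;
};

// Each instantiation is a separate class generated at compile time.
static_assert(!std::is_same<tiny_string<char>, tiny_string<wchar_t> >::value,
              "instantiations with different arguments are distinct types");

// Exercises both instantiations and returns their combined length.
inline std::size_t stencil_demo() {
    tiny_string<char> narrow;
    tiny_string<wchar_t> wide;
    narrow.append('h').append('i');
    wide.append(L'o').append(L'k').append(L'!');
    assert(narrow.data()[0] == 'h');
    assert(wide.data()[2] == L'!');
    return narrow.length() + wide.length();
}
```

Calling stencil_demo() from main() makes the compiler stamp out two independent classes, one per character type, much like the hand-written expansion shown earlier.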
When the compiler produces object code, no traces of the templates remain. That's definitely not the case with Java and .NET. Let's review generics in those platforms to better understand what that difference means.
Java Generics
Java lacked generics until version 5. Java generics are less powerful than their C++ counterparts, but very simple to apply.
Java generic classes are compiled into bytecode that assumes java.lang.Object for every type parameter (or a more specific class, when a constraint is given). This is known as "type erasure". The compiler enforces type checking and generates code that casts each access to the proper type:
// simplified representation of a generic vector
public class Vector<T> { // T becomes java.lang.Object
    private T[] data; // storage
    public void add(int index, T elem) { data[index] = elem; }
    // if v1 is declared Vector<String>, the compiler casts the
    // result of v1.get(i) from java.lang.Object to String
    public T get(int index) { return data[index]; }
    ... // other Vector members
}
Java generics don't accept primitive types, only their wrapper classes (java.lang.Integer, and so on). However, thanks to auto-boxing/unboxing, developers hardly notice:
Vector<Integer> vint = new Vector<Integer>();
vint.add(5); // the compiler produces vint.add(Integer.valueOf(5));
C# Generics
The .NET Framework 2.0 introduced generics. Instead of type erasure, .NET uses reification: type arguments are preserved at runtime, and the runtime creates a specific class for each distinct instantiation. This looks closer to C++, although in C# reification happens at runtime rather than at compile time.
Unlike Java, no runtime casts are inserted in the code that uses a generic class. The compiler emits Common Intermediate Language (CIL) that keeps the type parameters, and the runtime specializes it per instantiation, though instantiations over reference types share a single implementation of equivalent methods.
C# generic classes accept primitive types for instantiation.
C++ doesn't produce object code for a generic class itself, but a separate version for each template instantiation. No hidden runtime casts are needed, and no auto-boxing/unboxing of primitive types. This typically makes C++ the fastest of the three implementations. The downside is that, depending on the number of distinct template instantiations and the amount of code each one produces, it can lead to object code bloat.
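The absence of boxing and casting can be checked at compile time. The sketch below assumes a hypothetical holder template of my own. The size check holds on mainstream compilers, although the language standard doesn't strictly guarantee it:

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical single-value container, used only for illustration.
template<typename T>
class holder {
public:
    explicit holder(const T& value) : value_(value) {}
    const T& get() const { return value_; }
private:
    T value_;  // stored in place: no wrapper object, no heap allocation
};

// holder<int> stores a plain int, so on mainstream compilers it is
// exactly as large as one.
static_assert(sizeof(holder<int>) == sizeof(int),
              "no boxing overhead for int");

// get() is typed per instantiation, so the caller never needs a cast.
static_assert(std::is_same<decltype(std::declval<holder<double> >().get()),
                           const double&>::value,
              "get() returns the instantiated type");

// Reads the stored int back and does plain integer arithmetic on it.
inline int no_boxing_demo() {
    holder<int> h(5);
    assert(h.get() == 5);
    return h.get() + 1;
}
```

Each instantiation (holder<int>, holder<double>) gets its own generated code, which is also where the potential for object code bloat comes from.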
No Template Parameter Constraints
In Java and C#, I can narrow a generic class to certain types:
// In Java, I can limit instantiations to some hierarchy
public class MyGeneric<T extends my.other.ClassOrInterface> {
    ...
}
// In C# as well
public class MyGeneric<T> where T : my.other.ClassOrInterface {
    ...
}
This way I can access the public members of my.other.ClassOrInterface, since the constraint guarantees they're available. But if I really need such a constraint, is a generic class needed at all?
public class MyNonGeneric {
    private my.other.ClassOrInterface data; // or an array of them, etc.
    ...
    public void MyMethod(my.other.ClassOrInterface param) {
        // access data or param through the public API
        // of my.other.ClassOrInterface
    }
}
In C++11, template syntax has no notion of parameter constraints. They can be approximated by combining <type_traits> and static assertions:
#include <type_traits>

template<typename T>
class my_generic {
    // the following is resolved at compile time. Should it fail,
    // the string below is shown and compilation ends.
    static_assert(std::is_convertible<T*, my_other_class*>::value,
        "my_generic must be instantiated with my_other_class or a subclass.");
    ... // rest of my_generic declaration and/or definition.
};
my_generic<my_other_class> p; // ok
my_generic<int> q; // error
But again, if you need a template parameter to be of a given type, a non-generic type, like MyNonGeneric above, might solve your problem.
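For a version of the static assertion approach that compiles on its own, here's a sketch. The shape and triangle types and the shape_holder template are hypothetical names I made up for it, and it uses std::is_base_of, one of several traits that could express the constraint:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical hierarchy standing in for the constraining class.
struct shape {
    virtual ~shape() {}
    virtual int sides() const { return 0; }
};

struct triangle : shape {
    int sides() const { return 3; }
};

// Hypothetical template that accepts only shape or its subclasses.
template<typename T>
class shape_holder {
    static_assert(std::is_base_of<shape, T>::value,
                  "shape_holder must be instantiated with shape or a subclass.");
public:
    int sides() const { return item_.sides(); }
private:
    T item_;
};

// Instantiates the template with a type that satisfies the constraint.
inline int constraint_demo() {
    shape_holder<triangle> ok;
    // shape_holder<int> bad;  // would stop compilation with the message above
    assert(ok.sides() == 3);
    return ok.sides();
}
```

Uncommenting the shape_holder<int> line should make the compiler stop and print the assertion message.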
Undeliverable: Return to Sender
One issue with treating template code as a stencil arises when the substituted code doesn't compile.
The resulting compile errors are shown to the template consumer rather than its producer. The problem is that the error messages refer to the template's internals, which the consumer isn't supposed to need to know. Consequently, the template consumer has to dig into those internals to fix the error.
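A small sketch shows where such an error lands. The twice function template and the no_plus type are hypothetical. The failing call is left commented out so that the rest compiles:

```cpp
#include <cassert>
#include <string>

// Producer side: a hypothetical function template that silently
// assumes T supports operator+.
template<typename T>
T twice(const T& value) {
    return value + value;  // an unsuitable T fails to compile on this line
}

// Hypothetical consumer type with no operator+.
struct no_plus {};

// Consumer side: two valid instantiations and one invalid (commented out).
inline bool consumer_demo() {
    int n = twice(21);                         // fine: int has operator+
    std::string s = twice(std::string("ab"));  // fine: so does std::string
    // no_plus bad = twice(no_plus());         // error reported inside twice()
    assert(n == 42);
    return n == 42 && s == "abab";
}
```

If the commented line is enabled, the compiler typically reports the problem at the return statement inside twice(), in code the consumer never wrote.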
These are the kinds of things that can make C++ large-project development less productive than managed code.
Implicit vs. Explicit Template Instantiation
Many managed-code developers believe that C++ template headers must contain function definitions, not just declarations, because they got linker errors when the definitions weren't available. Yet they'd rather not deliver template definitions, for intellectual property reasons among others.
What they don't know is that they can overcome the issue with explicit instantiation:
#include <my_generic_type.h>
// my_generic_type method definitions
...
// explicit instantiation for certain type parameters
// (the compiler will produce object code for these).
template class my_generic_type<char>;
template class my_generic_type<giraffe>;
The resulting object code can be delivered together with a header containing only declarations, so linkage won't break. It works, although it narrows template instantiation to just those explicitly-instantiated choices.
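Here's how the pieces might fit together, squeezed into a single listing so it compiles as-is. The accumulator template is a hypothetical example. In a real project the three marked parts would live in a header, a private source file and the consumer's code, respectively:

```cpp
#include <cassert>

// --- part 1: what the shipped header would contain (declarations only) ---
template<typename T>
class accumulator {
public:
    accumulator();
    void add(const T& value);
    T total() const;
private:
    T total_;
};

// --- part 2: what the producer's private source file would contain ---
template<typename T>
accumulator<T>::accumulator() : total_() {}

template<typename T>
void accumulator<T>::add(const T& value) { total_ += value; }

template<typename T>
T accumulator<T>::total() const { return total_; }

// Explicit instantiations: object code is produced for these two only.
template class accumulator<int>;
template class accumulator<double>;

// --- part 3: consumer code, which sees only the declarations ---
inline int explicit_instantiation_demo() {
    accumulator<int> acc;
    acc.add(2);
    acc.add(3);
    assert(acc.total() == 5);
    return acc.total();
}
```

With the real file split, a consumer who tried accumulator<long> would get a linker error, since no object code exists for that instantiation.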
Function Templates
Java and C# dropped C-like free functions because they didn't fit the object-oriented paradigm. C++ kept them because C++ added object-oriented syntax to C, which already had them.
Consequently, template concepts also apply to plain functions. The STL <algorithm> header is a great example:
// extracted from STL <algorithm> header
template<class InputIterator, class Predicate>
InputIterator find_if(InputIterator First, InputIterator Last, Predicate Pred);
template<class ForwardIterator, class Generator>
void generate(ForwardIterator First, ForwardIterator Last, Generator Gen);
If you want to check production-ready examples of C++ templates, the STL should be your first stop. Examples of class and function templates are found in smart pointers, collections, threading, algorithms -- basically, everywhere.
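As a quick taste of the two function templates declared above, here's a sketch that fills a vector with std::generate and searches it with std::find_if. The counter and greater_than function objects are hypothetical helpers written for this example:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical generator: yields 1, 2, 3, ... on successive calls.
struct counter {
    int next;
    counter() : next(1) {}
    int operator()() { return next++; }
};

// Hypothetical predicate: true for values greater than a threshold.
struct greater_than {
    int threshold;
    explicit greater_than(int t) : threshold(t) {}
    bool operator()(int value) const { return value > threshold; }
};

// Fills five slots with 1..5, then finds the first value above 3.
inline int algorithm_demo() {
    std::vector<int> values(5);
    std::generate(values.begin(), values.end(), counter());
    std::vector<int>::iterator it =
        std::find_if(values.begin(), values.end(), greater_than(3));
    assert(it != values.end());
    return it != values.end() ? *it : -1;
}
```

The compiler deduces the iterator, generator and predicate types from the arguments, so neither call needs explicit template arguments.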
C++ Templates: Get Ready for More
I just checked you in at the C++ Templates lodge. You got a review of the template features that have some equivalent in Java and C#. In future installments I'll show powerful template features not available in managed code. One likely reason C++ can offer them is that templates are resolved at compile time. Just come in, understand the differences from managed-code generics and take advantage of them!
About the Author
Diego Dagum is a software architect and developer with more than 20 years of experience.