Ask Kathleen
Visual Studio's T4 Code Generation
Learn how to create and debug templates using Microsoft's T4 templating language.
Q: I want to generate some of the code for my application and a friend of mine said that Microsoft has a code generator, but I can't find one. Do you know what he's talking about?
A: Microsoft actually has two major code-generation tools -- the CodeDOM and T4. Your friend was probably referring to T4 because the CodeDOM is so difficult to read and maintain that it's inappropriate for almost all business applications. T4 (previously called T3) stands for Text Template Transformation Toolkit. T4 uses an ASP.NET-style syntax that has direct literal text output, embedded expressions and code logic.
T4 is already on your machine if you're using Visual Studio 2008, or you can download the DSL Toolkit if you're using VS 2005. T4 can generate any type of textual artifact. That's good news. The bad news is that the support is not complete and Visual Studio is missing critical capabilities. But there's more good news: adding two free tools to your environment gives you a solid base for generating code. I'll show you how to create and debug a T4 template and three ways to run T4 templates, including a Managed Extensibility Framework-based harness, which acts as an ecosystem for evolving code generation.
T4 templates begin with a template directive. Directives are identified by the <#@ #> tag. In addition to initiating the template, the template directive indicates the language of the template and allows debugging. T4 templates include .NET code and the language attribute indicates the language of code running in the template, not the artifact you're creating. By default, T4 uses the 2.0 version of the language, but you can change this by appending "v3.5" to the language attribute.
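As a small illustration of the language attribute described above, a template directive that opts into the 3.5 compiler while keeping debugging enabled might look like this (a sketch for VS 2008; everything else about the template stays the same):

```t4
<#@ template language="C#v3.5" debug="true" #>
```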
Following the template directive, you can include any combination of literal text and code, with the code placed in statement blocks identified by the <# #> tags:
<#@ template language="C#" debug="true" #>
<#
int hour = DateTime.Now.Hour;
if(hour < 12)
{#>
Good Morning World!<#
}
else if(hour >= 12 && hour < 17)
{#>
Good Afternoon World!<#
}
else
{#>
Good Evening World!<#
}#>
In this sample template, statement blocks indicate which greeting to output. VS doesn't provide coloration, which makes it extremely difficult to read the templates. I recommend downloading an editor from Clarius Consulting. Clarius offers both a free community edition and a more sophisticated retail edition.
This is enough background to create your first T4 template. Open VS and create a new project of any type. Add a New Item and select the type Text File. Name the file with the extension .TT. Enter the code in the previous sample or some variation. VS maps the .TT extension to a custom tool named TextTemplatingFileGenerator. This custom tool provides a default mechanism for running simple templates and learning about T4. When you save the template, the custom tool runs, calls the T4 engine and outputs code to a dependent file. When you expand the plus sign on the template file, you'll find the output with a .CS extension containing Good Morning World!, Good Afternoon World! or Good Evening World!
If you use a VB project, you might be surprised that the extension of the generated file is .CS. The custom tool defaults to .CS, but you can fix this by specifying the extension in an output directive:
<#@ output extension=".vb" #>
The T4 engine first parses the template to create a temporary class derived from the TextTransformation class. This class is compiled and run to produce output. The code you put inside statement blocks is placed in the body of the TransformText method in the temporary class. The literal text of your template is passed to calls to the Write method, which appends it to a string builder within this temporary class. You can debug by placing breakpoints inside your template and stepping through the code. You'll step through your template, not the compiled temporary class. While debugging, you can access any data available to the template. In addition to any fields and properties you declare, you can access members of the base class. These members allow you to control indentation (PushIndent, PopIndent, ClearIndent, CurrentIndent), report issues (Error, Warning, Errors) and directly access the string builder (Write, WriteLine, GenerationEnvironment).
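To make the transformation concrete, here is a rough sketch of the kind of class the engine might generate for the greeting template. The class name and exact layout are assumptions for illustration; the real generated code contains additional plumbing and will differ in detail.

```csharp
using System;
using Microsoft.VisualStudio.TextTemplating;

// Approximation only: the class name is hypothetical and the engine's
// real output includes extra members not shown here.
public class GeneratedTextTransformation : TextTransformation
{
    public override string TransformText()
    {
        // Code from statement blocks is copied into the method body.
        int hour = DateTime.Now.Hour;
        if (hour < 12)
        {
            // Literal template text becomes calls to Write.
            this.Write("Good Morning World!");
        }
        else if (hour < 17)
        {
            this.Write("Good Afternoon World!");
        }
        else
        {
            this.Write("Good Evening World!");
        }

        // GenerationEnvironment is the string builder that Write appends to.
        return this.GenerationEnvironment.ToString();
    }
}
```

Seeing the template this way explains why statement blocks can declare local variables but not methods: everything in them lands inside a single method body.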
Code within statement blocks is run, but is not treated as output. Often you'll need to include calculated or retrieved fields. You can do this using the expression block, differentiated by <#= #>. Also, because the code of your template becomes code in a method, you'll need a different type of block to include additional methods or additional classes. The class feature block offset by <#+ #> offers this capability. You can write the template above in a different way to demonstrate these features:
<#@ template language="C#" debug="true" #>
Good <#= GetTimeOfDay(DateTime.Now.Hour) #> World!
<#+
// Class feature block: this method becomes a member of the temporary class.
private string GetTimeOfDay(int hour)
{
    if (hour < 12) return "Morning";
    if (hour < 17) return "Afternoon";
    return "Evening";
}
#>
The templates you use to create application code will be a smooth integration of three elements: statement blocks to guide template logic, expression blocks for output, and class feature blocks for methods and classes to facilitate reuse.
In addition to the template and output directives, T4 natively provides three other directives. The import directive provides namespace recognition similar to Imports in Visual Basic or the using statement in C#. The include directive places the contents of another T4 template into the current template. The assembly directive adds .NET assemblies for the template code, similar to adding a reference in VS. Using the default tools, assemblies must include the full file path (from the root directory), be in the GAC or be referenced by the current project. If the include directive uses a relative path and the included template contains additional include directives, all relative paths are in relation to the first or root template location, making "double jumps" for code reuse difficult.
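A template header that combines these directives might look like the following sketch. The included file name Helpers.tt is hypothetical, and System.Xml is used only as an example of an assembly that lives in the GAC.

```t4
<#@ template language="C#" debug="true" #>
<#@ output extension=".cs" #>
<#@ assembly name="System.Xml.dll" #>
<#@ import namespace="System.Xml" #>
<#@ include file="Helpers.tt" #>
```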
You can play with templates using VS's automatic generation, but you're likely to run into limitations soon:
- Code reuse is hampered by putting the template directly into your output directory.
- You can't pass input parameters.
- It's a pain to write .NET code without IntelliSense and syntax checking.
- It hopelessly intertwines the output project with the generation mechanism, which breaks the Single Responsibility Principle and Separation of Concerns.
I prefer to move out of VS and focus on self-contained templates running in a harness, which separate the action of generation from the activity of the generated application.
If you wish to work within the VS model, the T4 Toolbox on CodePlex takes the VS custom tool usage to amazing lengths and Oleg Sych's blog has good coverage of T4 intricacies. Another option working inside VS is to write your own custom tool for generation. (See "Generate Code from Custom File Formats," March 2009, and the September 2008 Ask Kathleen column, "Customize Code Generation in EF.") Custom tools can provide a T4 wrapper and use an alternate T4 host to solve some issues, such as assembly and include file location resolution. However, you'll still face challenges as you stuff your generation process inside your application project.
T4 Harness
I prefer to write an independent generation harness that can run multiple templates to build your app and is focused on reuse. I'll explain in broad terms how a T4 harness works, and how you can download a working harness with source code.
If you're using a separate harness, you don't want the default VS behavior of creating the dependent file, because your template will rely on features that aren't present and will fail. The easiest way to block this behavior is to give your templates a .T4 extension. The Clarius editor will still recognize your templates with this extension.
To provide extra functionality, such as passing parameters, the generation harness uses a custom T4 host. The host must implement ITextTemplatingEngineHost and may also implement IServiceProvider. The host is responsible for all interactions with the environment: managing links to custom directive processors, finding assemblies and include files, and providing services. By creating your own custom host, you can provide extra functionality and data, such as a dictionary of parameter names and values.
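The following is a minimal sketch of such a host, not the harness's actual implementation. The class name ParameterHost, the choice to expose parameters as an IDictionary<string, object> service, and the simple path resolution are all assumptions made for illustration; it needs a reference to the Microsoft.VisualStudio.TextTemplating assembly to compile.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Microsoft.VisualStudio.TextTemplating;

// Hypothetical host: hands a parameter dictionary to templates via IServiceProvider.
public class ParameterHost : ITextTemplatingEngineHost, IServiceProvider
{
    private readonly string templateFile;
    private readonly IDictionary<string, object> parameters;

    public ParameterHost(string templateFile, IDictionary<string, object> parameters)
    {
        this.templateFile = templateFile;
        this.parameters = parameters;
    }

    public CompilerErrorCollection Errors { get; private set; }
    public string FileExtension { get; private set; }

    // Assumption: the dictionary itself is the service that templates ask for.
    public object GetService(Type serviceType)
    {
        return serviceType == typeof(IDictionary<string, object>) ? parameters : null;
    }

    public string TemplateFile { get { return templateFile; } }

    public IList<string> StandardAssemblyReferences
    {
        get { return new string[] { typeof(Uri).Assembly.Location }; }
    }

    public IList<string> StandardImports
    {
        get { return new string[] { "System" }; }
    }

    // Resolve relative paths against the template's own directory.
    public string ResolvePath(string path)
    {
        if (File.Exists(path)) return path;
        string directory = Path.GetDirectoryName(templateFile) ?? string.Empty;
        string candidate = Path.Combine(directory, path);
        return File.Exists(candidate) ? candidate : path;
    }

    public bool LoadIncludeText(string requestFileName, out string content, out string location)
    {
        location = ResolvePath(requestFileName);
        if (File.Exists(location))
        {
            content = File.ReadAllText(location);
            return true;
        }
        content = string.Empty;
        location = string.Empty;
        return false;
    }

    public string ResolveAssemblyReference(string assemblyReference)
    {
        return ResolvePath(assemblyReference);
    }

    // A real harness would map processorName to its directive processor type here.
    public Type ResolveDirectiveProcessor(string processorName)
    {
        throw new InvalidOperationException("Directive processor not found: " + processorName);
    }

    public string ResolveParameterValue(string directiveId, string processorName, string parameterName)
    {
        return string.Empty;
    }

    public AppDomain ProvideTemplatingAppDomain(string content)
    {
        return AppDomain.CurrentDomain;
    }

    public void LogErrors(CompilerErrorCollection errors) { Errors = errors; }
    public void SetFileExtension(string extension) { FileExtension = extension; }
    public void SetOutputEncoding(Encoding encoding, bool fromOutputDirective) { }
    public object GetHostOption(string optionName) { return null; }
}
```

With a host like this, a harness could run a template by reading the file's text and passing it, along with the host, to the ProcessTemplate method of the Engine class in the same namespace.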
Creating custom directives allows you to expand the information the template author provides. For example, the template author can provide a list of parameters. Within the template, parameters are treated as properties, so by convention parameters are declared using a custom directive named property (see Listing 1). Including explicit parameter requests in the template leads to better encapsulation.
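To give a feel for the convention, a template that declares its parameters might look something like this sketch. It is not the article's Listing 1: the processor name PropertyProcessor, the name and type attributes, and the parameter names are all hypothetical.

```t4
<#@ template language="C#" hostspecific="true" #>
<#@ property processor="PropertyProcessor" name="TargetNamespace" type="System.String" #>
<#@ property processor="PropertyProcessor" name="ClassName" type="System.String" #>
namespace <#= TargetNamespace #>
{
    public partial class <#= ClassName #>
    {
    }
}
```

Because the parameters are declared at the top of the template, a reader can see everything the template depends on without searching the body.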
Custom directives require a directive processor, which inserts the required code into the temporary class when called by the engine. Custom directive processors inherit from DirectiveProcessor. Its ProcessDirective method is called once for each directive. This method creates the code required by the directive and the engine inserts this text into the temporary class. The property directive processor creates a .NET property for each directive. The backing data for the property is retrieved by casting the host to an IServiceProvider that offers the PropertyDictionary.
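A stripped-down directive processor along these lines might look like the sketch below. It is an illustration under several assumptions rather than the harness's real code: the directive takes name and type attributes, the template sets hostspecific="true" so the generated class has a Host property, and the host exposes its parameters as an IDictionary<string, object> service. It also emits C# only; a processor that must support VB templates would build the property with the CodeDOM provider passed to StartProcessingRun.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Collections.Generic;
using System.Text;
using Microsoft.VisualStudio.TextTemplating;

// Hypothetical processor for a custom "property" directive.
public class PropertyDirectiveProcessor : DirectiveProcessor
{
    private readonly StringBuilder classCode = new StringBuilder();

    public override void StartProcessingRun(CodeDomProvider languageProvider,
        string templateContents, CompilerErrorCollection errors)
    {
        base.StartProcessingRun(languageProvider, templateContents, errors);
        classCode.Length = 0; // start each template with a clean buffer
    }

    public override bool IsDirectiveSupported(string directiveName)
    {
        return string.Equals(directiveName, "property", StringComparison.OrdinalIgnoreCase);
    }

    // Called once per directive; accumulates one property per call.
    public override void ProcessDirective(string directiveName, IDictionary<string, string> arguments)
    {
        string name;
        string type;
        if (!arguments.TryGetValue("name", out name) || !arguments.TryGetValue("type", out type))
        {
            throw new DirectiveProcessorException("The property directive requires name and type.");
        }

        classCode.AppendFormat(
            "public {0} {1}\r\n" +
            "{{\r\n" +
            "    get\r\n" +
            "    {{\r\n" +
            "        System.IServiceProvider services = (System.IServiceProvider)this.Host;\r\n" +
            "        System.Collections.Generic.IDictionary<string, object> values =\r\n" +
            "            (System.Collections.Generic.IDictionary<string, object>)services.GetService(\r\n" +
            "                typeof(System.Collections.Generic.IDictionary<string, object>));\r\n" +
            "        return ({0})values[\"{1}\"];\r\n" +
            "    }}\r\n" +
            "}}\r\n", type, name);
    }

    // The engine inserts this text into the temporary class.
    public override string GetClassCodeForProcessingRun() { return classCode.ToString(); }

    public override void FinishProcessingRun() { }
    public override string[] GetImportsForProcessingRun() { return new string[0]; }
    public override string[] GetReferencesForProcessingRun() { return new string[0]; }
    public override string GetPreInitializationCodeForProcessingRun() { return string.Empty; }
    public override string GetPostInitializationCodeForProcessingRun() { return string.Empty; }
}
```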
Code Generation and MEF
If that sounds like a lot of gnarly code -- it is. To make it easier for you to use T4, I've created a T4 harness based on the Managed Extensibility Framework (MEF).
Figure 1. The ExportContainer of MEF acts like a schoolyard where different players can raise their hands and shout "I need a xxx" or "Use my xxx." When teamed with intelligent granularity, this creates an ecosystem for evolution of individual parts without disrupting the whole.
Code generation and MEF are a perfect fit. Templates become MEF parts that are discoverable simply by being placed in a directory pointed to by a configuration file. You can add or remove templates just by altering the contents of this directory.
Metadata, output services, and even the property directive processor are MEF parts. The pieces of the harness are like Lego blocks -- pick the ones that make sense in your environment. The purpose of the harness is to empower a creative and flexible ecosystem that expands as your needs change.
I've glossed over a few details that are worth returning to. MEF parts must be .NET objects, and T4 templates are just text files until they're executed. Each T4 template must be wrapped in an instance of a class that is a non-shared MEF part and the harness needs to perform discovery. Because discovery is directory based, the harness just needs to grab the T4 templates in a particular directory, place them in the MEF part wrapper and ensure MEF can find the part. Discovery itself is an MEF part, so if you'd rather use a different discovery scheme such as a script, feel free.
While I've talked about the interactions between a host parameter dictionary and a property directive processor, I haven't explained how the host knows what parameters to expect or where to get the values. Because a T4 template is just a text file, the MEF wrapper can crack it open, parse, and extract a list of expected parameters from the property directives. Property directives are used twice -- to create a dictionary of parameters and to generate code for the temporary T4 class.
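As a rough sketch of that first use, the wrapper could pull the expected parameters out of the template text with a couple of regular expressions. The directive shape (name and type attributes) and the fallback to System.String when no type is given are assumptions for illustration, not the harness's actual parsing rules.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper: lists the parameters a template declares.
public static class PropertyDirectiveParser
{
    // Matches <#@ property ... #> and captures the attribute text.
    private static readonly Regex Directive = new Regex(
        @"<#@\s*property\b(?<attrs>.*?)#>",
        RegexOptions.IgnoreCase | RegexOptions.Singleline);

    // Matches key="value" pairs inside a directive.
    private static readonly Regex Attribute = new Regex(
        @"(?<key>\w+)\s*=\s*""(?<value>[^""]*)""");

    // Returns parameter name -> declared type name.
    public static IDictionary<string, string> ExpectedParameters(string templateText)
    {
        Dictionary<string, string> result = new Dictionary<string, string>();
        foreach (Match directive in Directive.Matches(templateText))
        {
            string name = null;
            string type = "System.String"; // assumed default when type is omitted
            foreach (Match attribute in Attribute.Matches(directive.Groups["attrs"].Value))
            {
                string key = attribute.Groups["key"].Value.ToLowerInvariant();
                if (key == "name") name = attribute.Groups["value"].Value;
                else if (key == "type") type = attribute.Groups["value"].Value;
            }
            if (name != null) result[name] = type;
        }
        return result;
    }
}
```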
Because MEF creates an ecosystem, the wrapper does not need to explicitly find values for the parameters; it merely asks the MEF container first for any match on the parameter name, then for a match on the type. To avoid unnecessary dependencies, use the name only for simple types such as string or date, and base other matches on the data type.
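The lookup order can be sketched without MEF at all. The ValueSource class below is a hypothetical stand-in for the container, intended only to show the name-first, type-second resolution; the real harness asks the MEF container instead.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for the container: resolves by name, then by type.
public class ValueSource
{
    private readonly Dictionary<string, object> byName = new Dictionary<string, object>();
    private readonly List<object> all = new List<object>();

    // Offer a value that can be matched by type only.
    public void Offer(object value)
    {
        all.Add(value);
    }

    // Offer a value that can be matched by name or by type.
    public void Offer(string name, object value)
    {
        byName[name] = value;
        all.Add(value);
    }

    public object Resolve(string parameterName, Type parameterType)
    {
        // First choice: a value offered under the parameter's name.
        object value;
        if (byName.TryGetValue(parameterName, out value) && parameterType.IsInstanceOfType(value))
        {
            return value;
        }

        // Fallback: the first offered value of a compatible type.
        foreach (object candidate in all)
        {
            if (parameterType.IsInstanceOfType(candidate)) return candidate;
        }
        return null; // nothing suitable was offered
    }
}
```

Matching on type alone works well for rich types such as a metadata model, where only one candidate is likely to exist, which is why names are best reserved for simple values like strings and dates.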
While I've only discussed property directives, you can create any other type of directive you want. The directive processor is retrieved via MEF, and I've provided a base class which makes creating new directive processors very easy.
Code Generation Principles
As I started working on code generation, I began uncovering underlying principles. No one meets them all, but these are the goals we're striving for:
- Code generation must be in your control. If something goes wrong with your application, you're responsible for fixing it. You can't fix something you can't change. You need control of the templates behind your code generation.
- Metadata is distinct and morphable. Metadata is the data that drives your application and makes it unique from other applications. It needs to be distinct so you can find and debug it, and it needs to be morphable so your database and business objects aren't required to match.
- Code generation should fit into your development process. If you're doing nightly builds, code generation should be part of them; otherwise, it should be a simple one-click process. Generated code should be included in source control like all other code.
- Handcrafted code is sacred and protected. People are creative and may not do the same thing the same way a second time, so you should have a place for handcrafted code in the design (derived or partial classes) and protect against accidental change.
- Generated code is of extremely high quality. One of the greatest benefits of code generation is high-quality architectures that can evolve to meet changing demands.
- Designing an application based on generation should be easy. T4 templates and a simple ecosystem-based harness are one step in this direction.
-- K.D.
Generation often requires that templates run multiple times with different data to create multiple files. It would break the Single Responsibility Principle for the template itself to define looping behavior. The harness solves this problem by searching for parts that export a special ILoopValues interface with a custom MetadataView attribute that provides the loop type. If a parameter is a type supported by a loop value provider, the T4 template is called once for every item in the collection.
Because many files are output, each needs a unique name. To make this easy, the harness provides a special parameter named TemplateOutputFileName. Assign a name based on the data in each pass of the template.
The generation harness assumes you have control and know what code is running during generation. MEF parts and T4 templates run code and can therefore run malicious code. You need to ensure the integrity of the parts and the T4 templates you place into your environment.
I'm releasing my T4 harness with full source code for you to work with and explore. You can download the AppVenture Community Generation Harness with source code from the download section at www.appventure.com and you can read more about its ongoing development in my blog. Our intention is to make the project open source as soon as it stabilizes and we start building a community. If you're interested in helping out as we move toward that goal, let me know.