Inside the TFS Databases (An Occasional Series) -- Visual Studio Magazine

Inside the TFS Databases (An Occasional Series)

Get to know the ins and outs of TfsWorkItemTracking in the first part of this series covering the most useful aspects of the TFS database.

By Jeff Levinson
03/09/2009

Based on many of the posts out there, it seems that a number of you are interested in the underlying Team Foundation Server (TFS) databases. The problem with this is that they're not documented and there are no relational constraints (or very few) so you can't figure it out that way, and there seems to be a lot of confusing data. On top of that, when you customize the process templates, it can get even more confusing.

Microsoft doesn't provide support for end users with regard to the database -- and for good reasons. It also doesn't guarantee forward compatibility (which is why we never recommend making changes to the existing database tables...ever). So as a warning, look but don't touch.

(Seriously, do not make any changes to the existing schema and do not ever edit the data manually. Doing so can corrupt multiple records. This article is for informational purposes so you can create reports from the relational data and look at the data directly and understand what you're looking at.)

However, if you need to write custom reports, you sort of have to understand the underlying data -- especially because not everything is reported to the data warehouse. In addition, maybe you need instant verification through reporting and the data warehouse is updated only every hour (at minimum) or longer depending on your environment. In that case, there's a good reason to write reports off of the transactional databases. In this series, I'll examine the pieces of the TFS databases that I hope you'll find most useful.

In this first part I'll spend some time in the TfsWorkItemTracking database. This database holds all of the information regarding, as you might guess, work items. There are a number of reasons why you may want to report off of this database. First, it contains the long text fields such as the description field or any other HTML type fields there may be. As an example, we'll use a simple bug that I created (Figure 1). As you can see, the bug is in the proposed state and has some basic information with it (the bug number is 389 -- remember this as this is the key in the database).

In the TfsWorkItemTracking database, run the following set of queries with SQL Server Management Studio (as another warning, if you're going to do this, create a log-in with read-only permissions only; this will prevent you from doing something accidentally):

Use TfsWorkItemTracking

SELECT	*
FROM	[TfsWorkItemTracking].[dbo].[WorkItemsAre]
WHERE	ID = 389
  
SELECT	*
FROM	[TfsWorkItemTracking].[dbo].[WorkItemsLatest]
WHERE	ID = 389
   
SELECT	*
FROM	[TfsWorkItemTracking].[dbo].[WorkItemsWere]
WHERE	ID = 389

This will return you data from the WorkItemsAre and WorkItemsLatest tables but no data from the WorkItemsWere table. The WorkItemsAre and WorkItemsLatest tables are virtual mirrors of each other, but you'll notice in Figure 2 that there's a key difference in the results: the Revised Date (which, as far as I can tell, is never used in this particular table) in the WorkItemsLatest table. In general, you can pick either table from which to return the latest version; I use the WorkItemsLatest table.

If you look at the data in the WorkItemsLatest table, you'll find almost all of the data (we'll cover what you don't find in a minute) with cryptic column names like Fld10027 with the value "By accident, duh." Obviously, this field name isn't helpful, so how do you resolve that to an actual field name? By using the Fields table, of course. Running this query will return you the name of the column "How Found":

SELECT	FldID,
	Name
FROM	Fields 
WHERE	FldID = 10027

To actually return the query with all of the valid names as column headers requires a fairly complex query. This structure gives the process template its flexibility. As you add fields, these fields are mapped to new columns in the work item tables and entries are made in the Fields table. This is just one reason not to fool with the tables.

The WorkItemsWere table stores changes to the work items. If I go back and make a change to the Bug (I've activated it and assigned it to "Steve") and rerun the first set of queries, I'll now have the results shown in Figure 3. As you can see, the WorkItemsWere table now contains my original entry (the work item in the proposed state) with a revision number of 1. The WorkItemsLatest and WorkItemsAre tables contain the current version of the work item. As you make more changes, the historical versions of the work items will be noted in the WorkItemsWere table. It's important to note that the Revision number (Rev column) is a key value. The primary key for any given version of a document is the ID + Rev number.

Now, about those other fields that weren't stored here. If I go back into bug 389 and make a change to the Symptoms value and save it, you'll note another revision in the work items tables, though nothing has really changed. That's because HTML fields are stored as ntext fields and are in a different table -- the WorkItemLongTexts fields. If you run the query below

SELECT	*
FROM	WorkItemLongTexts
WHERE	ID = 389

you'll get the results shown in Figure 4. There are a couple of things about this to note, the first of which is that you don't know which fields these apply to. Again, this is based on the fields table. If you update the query in Figure 4 so that it's joined with the Fields table, you'll get more readable results:

SELECT	a.AddedDate,
	b.Name,
	a.ID,
	a.Words,
	a.[Changed Order],
	a.Rev
FROM	WorkItemLongTexts a
join	Fields b on a.FldID = b.FldID
WHERE	a.ID = 389

Again, one of the keys to note here is the Rev number. Note that it says 1 and 3 for the Symptom field. There is no Rev 2 because no changes were made to this field during the revision 2 (when I changed the work item to Active and reassigned it to Steve).

In this article, I've briefly shown you some of the key tables in the TfsWorkItemTracking database. In the next article in this series I'll show you how to map work items to projects and get complete records of any given version, and I'll examine some of the very helpful views available to you.

If you have specific things you would like to know about the database, feel free to drop me an e-mail at [email protected] and I'll try to address them in a future article.

About the Author

Jeff Levinson is the Application Lifecycle Management practice lead for Northwest Cadence specializing in process and methodology. He is the co-author of "Pro Visual Studio Team System with Database Professionals" (Apress 2007), the author of "Building Client/Server Applications with VB.NET" (Apress 2003) and has written numerous articles. He is an MCAD, MCSD, MCDBA, MCT and is a Team System MVP. He has a Masters in Software Engineering from Carnegie Mellon University and is a former Solutions Design and Integration Architect for The Boeing Company. You can reach him at [email protected].

Printable Format

comments powered by Disqus

Featured

New 'Visual Studio Hub' 1-Stop-Shop for GitHub Copilot Resources, More

Unsurprisingly, GitHub Copilot resources are front-and-center in Microsoft's new Visual Studio Hub, a one-stop-shop for all things concerning your favorite IDE.
Mastering Blazor Authentication and Authorization

At the Visual Studio Live! @ Microsoft HQ developer conference set for August, Rockford Lhotka will explain the ins and outs of authentication across Blazor Server, WebAssembly, and .NET MAUI Hybrid apps, and show how to use identity and claims to customize application behavior through fine-grained authorization.
Linear Support Vector Regression from Scratch Using C# with Evolutionary Training

Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the linear support vector regression (linear SVR) technique, where the goal is to predict a single numeric value. A linear SVR model uses an unusual error/loss function and cannot be trained using standard simple techniques, and so evolutionary optimization training is used.
Low-Code Report Says AI Will Enhance, Not Replace DIY Dev Tools

Along with replacing software developers and possibly killing humanity, advanced AI is seen by many as a death knell for the do-it-yourself, low-code/no-code tooling industry, but a new report belies that notion.
Vibe Coding with Latest Visual Studio Preview

Microsoft's latest Visual Studio preview facilitates "vibe coding," where developers mainly use GitHub Copilot AI to do all the programming in accordance with spoken or typed instructions.