Ask Kathleen
Understand Your Code Better
Visual Studio Team System's Code Metrics feature isn't perfect, but if you understand what it's measuring and how, you can use it to gain insight into your apps' overall complexity and to spot potential problem areas.
Technologies mentioned in this article include the Entity Framework, ADO.NET, VB.NET, and C#.
Q I'd like to use the Code Metrics of Visual Studio Team System (VSTS) to explain to management the degree of complexity of an application I've been handed recently. But I don't understand the meaning of these values. At a minimum, I thought I understood what "lines of code" means, but the number reported is considerably smaller than what I see in the files, and it undercounts compared with how my group measures other projects.
A The code metrics in VSTS can be one tool for determining application complexity. Unfortunately, the reporting leaves a bit to be desired (see Figure 1). You can export to Excel or filter in Visual Studio. Filtering in Visual Studio is hampered by the UI, which makes it difficult to focus on problems in individual methods.
The columns available to you are maintainability index, cyclomatic complexity, depth of inheritance, coupling, and lines of code. The maintainability index gives you a general idea of complexity, and it's the only value where higher numbers are better. It's also the only one displayed as a weighted average of results rather than as a sum. You can find problem areas in an application quickly by looking for the lower values, especially those methods that aren't green. In evaluating new code, this lets you bypass the decent code and drill right into the code most likely to present problems later. If you filter on maintainability index, you can enter a maximum value to display. If you enter a low value such as 30, only items with values below that limit are displayed. Code metrics always displays its results as a tree, so you're likely to see namespaces and types with considerably higher numbers than the methods they contain. Use the asterisk (*) to expand the entire tree and see the methods with lower maintainability.
You can also export the data to Excel through a button in the code metrics output. The Excel data is not organized in a hierarchy and has different identifying columns. A column named scope indicates whether the row is a namespace, type, or member. You can sort on the scope column, followed by another column, to find individual members with maintainability or other issues.
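The same sort can be done in code once the exported rows are loaded. The following C# sketch is only an illustration: the MetricRow type, its field names, the "Member" scope value, and the sample data are all assumptions made for the example, not the actual layout of the Excel export.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape for one exported row; the real export has more columns.
public class MetricRow
{
    public string Scope;
    public string Name;
    public int MaintainabilityIndex;

    public MetricRow(string scope, string name, int maintainabilityIndex)
    {
        Scope = scope;
        Name = name;
        MaintainabilityIndex = maintainabilityIndex;
    }
}

public static class MetricReport
{
    // Keeps only member rows and returns the names of the least maintainable ones.
    public static List<string> WorstMembers(IEnumerable<MetricRow> rows, int take)
    {
        return rows.Where(r => r.Scope == "Member")
                   .OrderBy(r => r.MaintainabilityIndex)
                   .Take(take)
                   .Select(r => r.Name)
                   .ToList();
    }

    public static void Main()
    {
        // Made-up sample rows, for illustration only.
        var rows = new List<MetricRow>
        {
            new MetricRow("Type", "OrderService", 71),
            new MetricRow("Member", "OrderService.Submit", 42),
            new MetricRow("Member", "OrderService.Validate", 88),
            new MetricRow("Member", "OrderService.Recalculate", 19)
        };
        foreach (string name in WorstMembers(rows, 2))
            Console.WriteLine(name);
    }
}
```

Filtering on scope first keeps the averaged namespace and type rows from hiding the individual methods you actually want to inspect.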
Cyclomatic complexity is related to the number of paths in your application. Rather than calculating the actual discrete paths, the Team System algorithm returns the number of branches, plus one. For example, a Select Case with three options and a set of three independent If statements both report a cyclomatic complexity of 4 in VSTS. The Select Case does have four paths (assuming no default) the code can execute; however, the three independent If statements actually result in eight paths through the code:
Private Sub Test4(ByVal val1 As String, _
      ByVal val2 As String, ByVal val3 As String)
   If val1 = "Fred" Then
      ' Do Something
   End If
   If val2 = "Joe" Then
      ' Do Something
   End If
   If val3 = "Sam" Then
      ' Do Something
   End If
End Sub
The cyclomatic complexity figure given by VSTS doesn't give a precise count of code paths, but it does give you a sense of code complexity, and it's related to the number of unit tests you should run on the application to achieve full coverage. This code also illustrates one of the ways code coverage can be misleading. Full logical coverage of this routine would require eight tests plus boundary conditions, instead of the four tests plus boundary conditions suggested by the cyclomatic complexity. You could achieve 100 percent code coverage with a single test with the parameters "Fred," "Joe," and "Sam," but this contrived case wouldn't represent full logical coverage. The number of possible paths equals the number of choices at each decision (2) raised to the power of the number of independent decisions (3), which gives eight here. This means that increasing the number of independent variations in a routine raises its complexity much higher than the cyclomatic complexity indicates. Team System displays the sum of this value, rather than offering the more useful approach of showing the max value for any method. Cyclomatic complexity should remain low unless the method is performing a complex task.
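The gap between the reported value and the real path count can be checked by brute force. The C# sketch below is illustrative only: PathSignature is a hypothetical stand-in for Test4 that records which of the three If bodies would run, so each distinct signature corresponds to one path through the routine.

```csharp
using System;
using System.Collections.Generic;

public static class PathCount
{
    // Hypothetical stand-in for Test4: returns one letter per If statement,
    // T when its body would run and F when it would be skipped.
    public static string PathSignature(string val1, string val2, string val3)
    {
        string signature = "";
        signature += (val1 == "Fred") ? "T" : "F";
        signature += (val2 == "Joe") ? "T" : "F";
        signature += (val3 == "Sam") ? "T" : "F";
        return signature;
    }

    // Tries every combination of matching and non-matching arguments
    // and counts the distinct paths observed.
    public static int CountDistinctPaths()
    {
        var seen = new HashSet<string>();
        foreach (string v1 in new[] { "Fred", "other" })
            foreach (string v2 in new[] { "Joe", "other" })
                foreach (string v3 in new[] { "Sam", "other" })
                    seen.Add(PathSignature(v1, v2, v3));
        return seen.Count;
    }

    public static void Main()
    {
        const int branches = 3;
        Console.WriteLine("Branches plus one: " + (branches + 1));
        Console.WriteLine("Distinct paths: " + CountDistinctPaths());
    }
}
```

The single "Fred," "Joe," "Sam" test corresponds to the signature TTT, which touches every line but exercises only one of the eight paths.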
The depth of inheritance column reports how deep each class sits in its inheritance hierarchy. Along with class diagrams, this report can help you find the complex parts of your architecture. Unfortunately, it also counts the levels of the .NET hierarchy, so the depth of a class that inherits from a .NET Framework class appears higher than is meaningful from a maintenance perspective, because you really don't care about the depth of the .NET hierarchy.
Coupling is a rough indication of how many times another class is touched. But it's only marginally useful because it's dependent on the detailed structure of your code and it reports total touches, not total touches to independent classes. For example, a For loop across an indexer often results in a lower coupling number than a For Each loop. The For Each can be better code, particularly if the loop is across a LINQ query expression and the query is evaluated multiple times in the indexed For loop. It's worth looking at this measurement, but only as a pointer to a better understanding of coupling within your app.
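The cost of re-evaluating a query inside an indexed loop is easy to observe by counting how often the query's predicate runs. This C# sketch is an illustration under assumptions: the even-number filter and the sample array are made up, and ElementAt stands in for indexed access to a deferred query.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Reevaluation
{
    // Walks a deferred query by index and returns how many times the
    // predicate ran. Count() and every ElementAt() restart the query.
    public static int PredicateCallsIndexed(int[] source)
    {
        int calls = 0;
        IEnumerable<int> query = source.Where(n => { calls++; return n % 2 == 0; });
        int total = 0;
        int count = query.Count();
        for (int i = 0; i < count; i++)
        {
            total += query.ElementAt(i);
        }
        return calls;
    }

    // Walks the same query with foreach, which evaluates it a single time.
    public static int PredicateCallsForEach(int[] source)
    {
        int calls = 0;
        IEnumerable<int> query = source.Where(n => { calls++; return n % 2 == 0; });
        int total = 0;
        foreach (int item in query)
        {
            total += item;
        }
        return calls;
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4, 5, 6 };
        Console.WriteLine("Indexed loop predicate calls: " + PredicateCallsIndexed(numbers));
        Console.WriteLine("foreach predicate calls: " + PredicateCallsForEach(numbers));
    }
}
```

The foreach version runs the predicate once per source element, while the indexed version repeats work on every pass, which is why a lower coupling number for the indexed loop doesn't make it the better code.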
The values reported in lines of code will seem quite low, as you pointed out. This value doesn't reflect declarations, and it reports lines of code based on the IL produced. The discrepancy between the lines of code in the editor and the lines of code reported is greatest for well-written code that has been refactored into small discrete methods. The number of lines of code reported within your application helps you identify and remedy large methods, but the number VSTS reports isn't useful for comparison with an existing line count retrieved through another mechanism, such as performing an inspection in an editor. To avoid comparing apples to oranges, be sure to look at the lines of code from the same algorithm when measuring the raw size of your applications.
If you run Code Analysis (the VSTS version of FxCop) to catch quality issues in your code, it will review code metrics and raise issues when values fall outside expected bounds. It reports a problem if, for any single routine, the maintainability is less than 65 or the cyclomatic complexity is more than 25, or if a class has an inheritance depth greater than four. You don't necessarily need to refactor this code, but you do need to understand why these metrics occur. If the code is acceptable, mark the issue explicitly so it's ignored in later static analysis evaluations.
When looking at an existing project, I'd suggest searching out the methods with lowest maintainability and largest number of lines first. You can branch into the other evaluations as you work to understand the complexity of this existing code. Code Metrics is just one tool in your arsenal; I also recommend that you use Code Analysis, class diagrams, and the solution and class explorers to evaluate the project you've inherited.
Q I expect LINQ to return null for an aggregate if there are no elements that match the condition. However, this LINQ statement throws an exception: "Sequence contains no elements." What's going on here?
var customers = new Customer[] {
   new Customer("Joe", "Smith"),
   new Customer("Jack", "Jones") };
var q = from c in customers
   where c.FirstName == "Sarah"
   select c.Credit;
var q2 = q.Max() as Nullable<Int32>;
A Max is a generic extension method on IEnumerable<TSource>. Its declaration looks like this:
public static TSource Max<TSource>(
   this IEnumerable<TSource> source
)
Note that the return value is the same type as the elements of the IEnumerable. In your example, the IEnumerable contains the values of the Credit property, which is defined as an Int32. Thus, the inferred return value of the Max method is an Int32. When LINQ evaluates the list and finds no elements, it can't determine the Max value. It also can't return null because it's typed as an integer. It doesn't help to define max as a nullable Int32 or use the "as" operator because this casts to nullable after the problem has already occurred.
There are several ways to fix your code. Your query can return an IEnumerable of a Nullable<Int32>, which changes the return value of Max to Nullable<Int32>:
var q = from c in customers
   where c.FirstName == "Sarah"
   select c.Credit as Nullable<Int32>;
var max = q.Max();
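The difference between the two element types can be seen in isolation by calling Max on empty arrays. This is a minimal C# sketch; the EmptyMax class and its method names are made up for the illustration and aren't part of the original example.

```csharp
using System;
using System.Linq;

public static class EmptyMax
{
    // Max over an empty sequence of Int32 has no value to return, so it throws.
    public static bool NonNullableMaxThrows()
    {
        int[] empty = new int[0];
        try
        {
            empty.Max();
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    // Max over an empty sequence of Nullable<Int32> can report "no value" as null.
    public static Nullable<Int32> NullableMax()
    {
        Nullable<Int32>[] empty = new Nullable<Int32>[0];
        return empty.Max();
    }

    public static void Main()
    {
        Console.WriteLine("Int32 Max throws: " + NonNullableMaxThrows());
        Console.WriteLine("Nullable Max has a value: " + NullableMax().HasValue);
    }
}
```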
If you want a list of matching customers for other reasons, or the list is passed into your subroutine, you can shift the selection of the Credit property into a lambda expression by using a different overload of Max. There's an error in the documentation, and this overload is missing from the IEnumerable<T> page:
public static TResult Max<TSource, TResult>(
   this IEnumerable<TSource> source,
   Func<TSource, TResult> selector
)
Func<TSource, TResult> is a delegate that you can define through a lambda expression. You need to state the return type as nullable by casting the delegate explicitly to avoid problems with empty lists. Because the delegate returns a nullable, .NET infers the return type of Max() as Nullable<Int32> and selects the correct overload:
var q = from c in customers
   where c.FirstName == "Sarah"
   select c;
var maxValue = q.Max(
   (Func<Customer, Nullable<Int32>>)(
      cust => cust.Credit));
You can also fix the problem by avoiding type parameter inference. Do this by stating the result type argument explicitly in your call to the same overload:
var maxValue =
   q.Max<Customer, Nullable<Int32>>(
      cust => cust.Credit);
The same exception occurs in VB when the list is empty. The first two solutions I mentioned look similar, but the latter approach, avoiding type inference, has a slightly different look. VB and C# infer type arguments for generic extension methods differently. VB does a two-pass inference. The first pass resolves any type parameters that can be resolved based on the arguments you pass to the method. The type parameters that can be inferred from the method arguments are effectively removed from the type parameter list. You cannot state a type that is inferred from method arguments, which simplifies the call. For this Func overload, the first type parameter is the argument passed to the delegate. It can be inferred, so it can't be stated explicitly in VB; this means you state only the last type parameter explicitly:
Dim q = From c In customers _
   Where c.FirstName = "Sarah"
Dim maxIndex = q.Max( _
   Of Nullable(Of Int32))( _
   Function(c) c.Credit)
In VB, you can also use the aggregate syntax to collapse the selection and the request for Max into one statement:
Dim q5 = Aggregate c In customers _
   Where c.FirstName = "Sarah" _
   Into Max(CType(c.Credit, _
      Nullable(Of Int32)))
Both VB and C# let you use the ? shortcut to indicate the nullable type. However, I prefer the more verbose syntax because I think it's nice to see the nullable type declarations stand out, and I find the question mark easy to overlook.
Q I upgraded to SQL Server 2008 without detaching the SQL Server 2005 databases I wanted to move forward. My install of release candidate 0 wasn't smooth -- I installed 2008 intending to run it side-by-side with SQL Server 2005, but later uninstalled SQL Server 2005. We have our main production databases backed up, but there was some development stuff that I don't have SQL backups for (although I do have the .MDF files backed up). My development operating system is Windows Vista. Am I hosed? I get an error when I select the file to attach:
CREATE FILE encountered operating system error 5(Access is denied.) while attempting to open or create the physical file 'C:\Users\MyName\Current Projects\Database\Test_Data.MDF'. (Microsoft SQL Server, Error: 5123)
A What you're encountering here is your typical, run-of-the-mill Vista User Account Control (UAC) issue. Try running Management Studio as an administrator; the same behavior will almost certainly be present in the release to manufacturing version of SQL Server 2008.
Q I have one machine -- and only one machine -- with a bizarre problem. When an application opens the OpenFileDialog, it crashes with a stack overflow error in Windows.Forms.dll. I found the location where the problem occurs through some rather painful tracing. I don't know why this happens only on this one machine. It also doesn't help that the user of this machine is a Linux fan who isn't happy that our company runs on Windows; he has been exceedingly obnoxious in pressing for a resolution to this problem. Do you have any ideas?
A Find out what utilities are on the machine. In particular, there's a utility that provides Linux-like behavior for Windows named True X-Mouse Gizmo. It lets you copy and paste with the mouse, focus without clicks, and use other Linux-style mouse behaviors. Unfortunately, it does not play well with .NET, so Gizmo and .NET spiral into a contest over who has control of the mouse or window in these dialogs until the stack overflows. A machine with this problem will also fail on a simple WinForms test project that calls OpenFileDialog, SaveFileDialog, FolderBrowserDialog, ColorDialog, or FontDialog. You can also test this theory through the UseEXDialog property of the PrintDialog. With this property set to true (default), the dialog should work. If you set the property to false in a test app, you will see the failure.
Whether this is a .NET issue -- or such bizarre behavior by the utility that .NET can't be blamed -- is interesting only in theory. In practice, the current version of .NET won't play well with some rogue utilities, and such utilities must be uninstalled from the machine. If the problem is this or a similar utility, uninstalling the utility fixes the problem. Similarly, if you can't identify the utility and the user does rebuild the machine, ensure the user tests your app before and after adding any utilities as he or she rebuilds his or her preferred environment.
Q I want to use reflection to retrieve MethodInfo and call Invoke on a method in a dynamic part of my system. I have an array of parameters that I'm passing to the method and will use when I invoke MethodInfo. What is the simplest way to extract an array of types from it, so I can find the correct overload when I call GetMethod?
A LINQ offers a great deal of power when managing any collection, including the array of objects you're already passing to the method. You can write a simple method that returns the types present in any array of objects:
Public Function GetTypesFromParameters( _
      ByVal ParamArray params() As Object) _
      As Type()
   Dim q = From o In params _
      Select t = o.GetType()
   Return q.ToArray()
End Function
It's easy to overlook some of the ways you can use LINQ, but if you have an IEnumerable available, it's often a good idea to consider whether you can solve your problem using LINQ. IEnumerables include arrays, collections, lists, dictionaries, stacks, hashsets, and many other .NET Framework features. LINQ features an expressive, easy-to-follow syntax that helps other developers understand your intent when you code anything that supports IEnumerable(Of T).
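To show how the extracted types feed into GetMethod and Invoke, here is a C# sketch of the whole round trip. The Greeter class, its Describe overloads, and the InvokeByName helper are hypothetical names invented for the example; only GetTypesFromParameters mirrors the VB function above.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical target class with two overloads to choose between.
public class Greeter
{
    public string Describe(string name) { return "string:" + name; }
    public string Describe(int number) { return "int:" + number; }
}

public static class DynamicInvoker
{
    // C# counterpart of the VB function: one Type per argument.
    // A null argument has no runtime type, so this throws for null entries.
    public static Type[] GetTypesFromParameters(params object[] args)
    {
        return args.Select(o => o.GetType()).ToArray();
    }

    // Finds the overload whose parameter types match the arguments and invokes it.
    public static object InvokeByName(object target, string methodName, params object[] args)
    {
        Type[] types = GetTypesFromParameters(args);
        MethodInfo method = target.GetType().GetMethod(methodName, types);
        if (method == null)
            throw new MissingMethodException(target.GetType().Name, methodName);
        return method.Invoke(target, args);
    }

    public static void Main()
    {
        var greeter = new Greeter();
        Console.WriteLine(InvokeByName(greeter, "Describe", "Sam"));
        Console.WriteLine(InvokeByName(greeter, "Describe", 42));
    }
}
```

Passing the type array to GetMethod is what lets reflection pick between the two Describe overloads; without it, GetMethod fails with an ambiguous match when more than one overload shares the name.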
Q I'm switching to VB from C# and I really miss the ? : operator. Do you have any suggestions for working around it? I've had trouble using the IIf function because it evaluated both arguments.
A The ? : operator is called a ternary operator. VB received a ternary operator and a null coalescing operator in version 9.0 (.NET Framework 3.5). The operator for both is If. When the If operator has three operands, the first must be Boolean. When the first operand evaluates to True, the second operand is returned, and when it evaluates to False, the third operand is returned. Unlike IIf, only the operand that is returned gets evaluated. The last two operands can be of any type:
Dim bool = True
Dim t1 = If(bool, "TrueValue", "FalseValue")
When the If operator has two operands, the first must be a reference type or a nullable value type. When the first operand evaluates to Nothing, the second operand is returned. Otherwise the first operand is returned:
Dim cust As Customer = Nothing
Dim t2 = If(cust, New Customer("", ""))
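For readers coming from C#, the three-operand If corresponds to ? : and the two-operand If corresponds to ??. The C# sketch below contrasts a function-style call, which behaves the way IIf does, with the operators. The IIfLike and Track helpers are hypothetical and exist only to count how many arguments get evaluated.

```csharp
using System;

public static class ConditionalDemo
{
    private static int evaluations;

    // Counts each time an argument expression is actually evaluated.
    private static string Track(string value)
    {
        evaluations++;
        return value;
    }

    // Function-style conditional: both arguments are evaluated before the call,
    // which is the behavior that makes IIf awkward.
    private static T IIfLike<T>(bool condition, T whenTrue, T whenFalse)
    {
        return condition ? whenTrue : whenFalse;
    }

    public static int EvaluationsWithFunction(bool condition)
    {
        evaluations = 0;
        IIfLike(condition, Track("TrueValue"), Track("FalseValue"));
        return evaluations;
    }

    // Operator-style conditional: only the selected operand is evaluated.
    public static int EvaluationsWithOperator(bool condition)
    {
        evaluations = 0;
        string result = condition ? Track("TrueValue") : Track("FalseValue");
        return evaluations;
    }

    // Null coalescing: returns the first operand unless it is null.
    public static string Coalesce(string value, string fallback)
    {
        return value ?? fallback;
    }

    public static void Main()
    {
        Console.WriteLine("Function-style evaluations: " + EvaluationsWithFunction(true));
        Console.WriteLine("Operator evaluations: " + EvaluationsWithOperator(true));
        Console.WriteLine(Coalesce(null, "fallback"));
    }
}
```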