Code Focused
Beautify Your Code With Extensions
Extension methods bring together old and new ways of working with data, and open the doors to new language opportunities.
By Bill McCarthy
05/01/2007
Technology Toolbox: VB .NET
Extension methods are a new language feature that will become available with Visual Basic 9.0 (VB Orcas).
The great part about these methods: They give you enormous flexibility when working with objects of all types. Combined with LINQ, these methods give you the power to keep query statements simple, yet use powerful case-specific filtering techniques. I'll walk you through how to put these extension methods to work in your applications, explaining both their design and usage, as well as how they integrate into Language Integrated Query syntax (LINQ). Before I get into the specifics, however, let's review the historical and current capabilities of VB for context.
In VB6, you can use the Len function if you want to know the length of a string. The Len function is a Shared method in the VBA.Strings function library. When ported to the .NET platform, this functionality was moved into the Microsoft.VisualBasic.Strings function library, and it remains essentially the same: a Shared method that returns the length of a string or 0 for null references.
With the introduction of .NET, Strings became objects, not just data holders. As such, the String type now comes with methods as part of the object instance, including a Length property.
The old model used function libraries; the new model uses instance methods. Both have advantages and disadvantages when compared to the other. Function libraries allow you to use different methods based on how you want to use the data; this is similar to how banks process checks differently, depending on their needs and practices. Function libraries also allow you to deal with null references gracefully. On the other hand, the object-orientated approach offers great discoverability through the "dot" syntax, as well as providing the basis for run-time polymorphic behavior in non-sealed (inheritable) types.
Extension methods are static (Shared) methods, which are essentially the same thing as function libraries, but with one important twist: Extension methods appear through the dot syntax on variables of the type of the first parameter (or derived from it). The dot-syntax behavior of an extension method makes it appear as if it is an extension of another type. This ability to "extend" other types comes in handy when you need to add behavior to NotInheritable (sealed) classes such as String. For example, you can create an extension method that returns a Nullable(Of Int32) for a String's length:
Imports System.Runtime.CompilerServices

Public Module MyExtensions
    ' Returns the string's length, or Nothing for a null reference.
    <Extension()> _
    Public Function NullableLength( _
        ByVal value As String) _
        As Nullable(Of Int32)
        If value Is Nothing Then
            Return Nothing
        Else
            Return value.Length
        End If
    End Function
End Module
You use this extension method by calling it as you would an instance method; the instance is passed automatically to the method's first parameter. The extension method appears in IntelliSense for the String type (Figure 1).
The extension method is a Shared method, so you can also call it directly by passing the string to it. In other words, the extension syntax is equivalent to the Shared method syntax:
' extension syntax
nlength = myString.NullableLength
' shared method syntax
nlength = NullableLength(myString)
In fact, that's exactly how the call gets compiled, as if you had called the Shared method and passed the String to it.
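As a minimal sketch of that equivalence, the following program calls NullableLength both ways, including on a null reference. The Demo module and its variable names are illustrative assumptions, not part of the article's listings.

```vb
Imports System
Imports System.Runtime.CompilerServices

Public Module MyExtensions
    <Extension()> _
    Public Function NullableLength(ByVal value As String) As Nullable(Of Int32)
        If value Is Nothing Then
            Return Nothing
        End If
        Return value.Length
    End Function
End Module

' Illustrative driver; the names here are assumptions.
Public Module Demo
    Public Sub Main()
        Dim missing As String = Nothing
        Dim present As String = "Orcas"

        ' The extension call compiles to a Shared call, so a null
        ' reference reaches the method instead of raising an exception.
        Console.WriteLine(missing.NullableLength().HasValue)   ' False

        ' Both syntaxes produce the same result.
        Console.WriteLine(present.NullableLength().Value)      ' 5
        Console.WriteLine(NullableLength(present).Value)       ' 5
    End Sub
End Module
```

Because the null check lives inside the method, callers get the graceful null handling of a function library together with the dot syntax.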
Key Advantages
The main benefits in calling the method through the "dot syntax" are discoverability and readability. You don't need to look up or know the name of function library methods; you can use IntelliSense to help you write your code. The dot syntax allows you to express things in a logical order, from left to right. For example, assume you have an Encrypt function and a NullableLength function, and you want to encrypt a string, and then get its length. Extension methods let you do this easily:
nlength = myString.Encrypt.NullableLength
Compare this syntax to what you had to do previously:
nlength = NullableLength(Encrypt(myString))
The method is compiled by passing the preceding expression in as the first parameter, which means the preceding expression is evaluated. This is a pleasant change from the Shared method syntax, where the preceding expression isn't evaluated. For example, the Shared method syntax lets you call the Shared IsNullOrEmpty method of String using the type's name syntax without any side effects:
String.IsNullOrEmpty(myString)
In VB, you can also call the method through an instance:
myCustomer.GetFullName.IsNullOrEmpty(myString)
Not only is the use of a Shared method confusing in this example, but the method GetFullName is never called. Fortunately, the VS 2005 compiler issues a warning about dangerous use of Shared members through an instance:
warning BC42025: Access of shared member,
constant member, enum member or nested type
through an instance; qualifying expression
will not be evaluated.
Extension methods don't suffer from this problem when called through the instance because the preceding expression is evaluated when it becomes the first parameter.
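The difference can be observed with a counter. In this sketch, GetFullName, IsBlank, and CallCount are hypothetical names introduced only for illustration; IsBlank simply wraps String.IsNullOrEmpty.

```vb
Imports System
Imports System.Runtime.CompilerServices

Public Module EvaluationDemo
    ' Counts how often GetFullName actually runs.
    Public CallCount As Integer = 0

    ' Hypothetical stand-in for myCustomer.GetFullName.
    Public Function GetFullName() As String
        CallCount += 1
        Return "Ada Lovelace"
    End Function

    ' Hypothetical extension that wraps String.IsNullOrEmpty.
    <Extension()> _
    Public Function IsBlank(ByVal value As String) As Boolean
        Return String.IsNullOrEmpty(value)
    End Function

    Public Sub Main()
        ' GetFullName runs because its result becomes the first argument.
        Dim blank As Boolean = GetFullName().IsBlank()
        Console.WriteLine(blank)       ' False
        Console.WriteLine(CallCount)   ' 1
    End Sub
End Module
```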
You can declare the first parameter for an extension method as being passed in ByVal or ByRef. For value types, this makes a lot of sense. Consider an increment method added to integer:
<Extension()> _
Public Function Increment(ByVal i As Int32) As Int32
Return i + 1
End Function
Calling this method requires you to assign back to the variable:
myInt = myInt.Increment
Making the argument ByRef requires only that you call Increment on the integer to have the value updated:
<Extension()> _
Public Function Increment(ByRef i As Int32) As Int32
i = i + 1
Return i
End Function
The problem you now face, as the Increment example shows, is that the API can be confusing. The ByRef version both updates the caller's variable i and returns a copy of the new value. This is necessary to support calling it as a single statement, myInt.Increment, while also supporting its use in expressions, such as when calling another method:
SomeMethod(myInt.Increment)
Unfortunately, you need to consider the trade-offs of flexible calling versus design clarity when working with value types.
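To make the trade-off concrete, here is a small sketch that puts both styles side by side. Two methods with the same name cannot differ only by ByVal and ByRef, so the ByVal version is renamed Incremented here; that name and the demo module are assumptions for illustration.

```vb
Imports System
Imports System.Runtime.CompilerServices

Public Module IntegerExtensions
    ' ByVal: the caller's variable is untouched; only the result changes.
    <Extension()> _
    Public Function Incremented(ByVal i As Int32) As Int32
        Return i + 1
    End Function

    ' ByRef: the caller's variable is updated and the new value is returned.
    <Extension()> _
    Public Function Increment(ByRef i As Int32) As Int32
        i = i + 1
        Return i
    End Function
End Module

Public Module ByRefDemo
    Public Sub Main()
        Dim myInt As Int32 = 1

        Dim copy As Int32 = myInt.Incremented()
        Console.WriteLine(myInt)   ' 1
        Console.WriteLine(copy)    ' 2

        ' Statement form: the variable itself changes.
        myInt.Increment()
        Console.WriteLine(myInt)   ' 2

        ' Expression form: the variable changes and the new value is passed on.
        Console.WriteLine(myInt.Increment())   ' 3
        Console.WriteLine(myInt)               ' 3
    End Sub
End Module
```

Nothing at the call site shows whether the variable is modified, which is the clarity cost of the ByRef design.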
The use of ByRef isn't usually recommended when working with reference types, with the exception of factory type methods and methods on immutable types. Consider an Encrypt method you might want to add as an extension to String. String is immutable, so making the extension parameter ByRef would allow you to update the variable to point to the new encrypted string.
You should generally avoid using ByRef, except when working with value types and immutable types. Another case where it might be appropriate to choose ByRef is when writing extension methods for use with delegates, since delegates can be thought of as immutable. Choosing an extension method enables you to simplify the process of combining delegates considerably:
'Combine delegates
m_Changed = DirectCast( _
[Delegate].Combine(m_Changed, value), _
EventHandler(Of PropertyChangedEventArgs))
'Combine delegates with extension methods
m_Changed.AddDelegate(value)
Using extension methods with delegates is relatively straightforward (Listing 1).
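Listing 1 is not reproduced here, so the following is only a guess at what such a helper might look like, narrowed to the plain EventHandler type for brevity. The AddDelegate body, the demo module, and the handler are all assumptions.

```vb
Imports System
Imports System.Runtime.CompilerServices

Public Module DelegateExtensions
    ' Hypothetical helper, not the article's Listing 1: combines value
    ' into target and points the caller's variable at the new delegate.
    <Extension()> _
    Public Sub AddDelegate(ByRef target As EventHandler, ByVal value As EventHandler)
        target = DirectCast([Delegate].Combine(target, value), EventHandler)
    End Sub
End Module

Public Module DelegateDemo
    Public Calls As Integer = 0

    Private Sub OnChanged(ByVal sender As Object, ByVal e As EventArgs)
        Calls += 1
    End Sub

    Public Sub Main()
        Dim m_Changed As EventHandler = Nothing

        ' Delegate.Combine accepts a null first argument, so the call
        ' works even while the variable is still Nothing.
        m_Changed.AddDelegate(AddressOf OnChanged)
        m_Changed.AddDelegate(AddressOf OnChanged)

        ' Both combined handlers run.
        m_Changed.Invoke(Nothing, EventArgs.Empty)
        Console.WriteLine(Calls)   ' 2
    End Sub
End Module
```

ByRef is what lets the method replace the caller's delegate reference, since combining delegates produces a new delegate instead of changing the existing one.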
Polymorphism and Precedence
It's important to remember extension methods are static (Shared) methods. This means the method resolution must occur at compile time, not runtime (Listing 2). Also, note that the code inside method Test is always resolved by calling the extension that expects a type A, even when a derived type is passed to it.
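Listing 2 is not shown here, so the sketch below illustrates the same point with assumed names: classes A and B, a Describe extension for each, and a Test method whose parameter is typed as A.

```vb
Imports System
Imports System.Runtime.CompilerServices

Public Class A
End Class

Public Class B
    Inherits A
End Class

Public Module DescribeExtensions
    <Extension()> _
    Public Function Describe(ByVal obj As A) As String
        Return "A"
    End Function

    <Extension()> _
    Public Function Describe(ByVal obj As B) As String
        Return "B"
    End Function
End Module

Public Module ResolutionDemo
    ' Inside Test the compiler only knows obj is an A,
    ' so it binds to the A overload at compile time.
    Public Function Test(ByVal obj As A) As String
        Return obj.Describe()
    End Function

    Public Sub Main()
        Dim item As New B()
        Console.WriteLine(item.Describe())   ' B
        Console.WriteLine(Test(item))        ' A
    End Sub
End Module
```

An Overridable instance method would have returned the derived behavior in both calls; the extension version depends only on the declared type at the call site.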
If you have multiple extension methods of the same name, precedence rules determine which method is chosen at compile time. The compiler first looks for the most specific match under normal overload resolution. If there is still a conflict, the method that is closest to the calling code by namespace is used. Assume you want to add this code to Listing 2's Module1:
<Extension()> _
Public Sub Print(ByVal obj As A)
In this case, the code you just added will be called, but not the code in the other module.
The idea here is that extension methods are designed to let you work with objects as you need to. This flexibility is in addition to respect for object-orientated rules, not a replacement for them. Thus, the integrity of an object has precedence over any extension method. For this reason, if an object defines a method of the same name as an extension method, it Shadows that extension method, even if the extension method is a more specific match.
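A short sketch of that rule, using a hypothetical Invoice class whose instance method has the same name and signature as an extension method:

```vb
Imports System
Imports System.Runtime.CompilerServices

' Hypothetical type used only for illustration.
Public Class Invoice
    Public Function Describe() As String
        Return "instance"
    End Function
End Class

Public Module InvoiceExtensions
    <Extension()> _
    Public Function Describe(ByVal target As Invoice) As String
        Return "extension"
    End Function
End Module

Public Module ShadowingDemo
    Public Sub Main()
        Dim inv As New Invoice()

        ' The dot syntax binds to the instance method.
        Console.WriteLine(inv.Describe())                     ' instance

        ' The extension is still reachable as an ordinary Shared call.
        Console.WriteLine(InvoiceExtensions.Describe(inv))    ' extension
    End Sub
End Module
```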
LINQ (Language Integrated Query) is based around the use of extension methods. Each clause in a LINQ query is a call to an extension method. For example, assume you want to create a simple LINQ statement that returns a filtered enumeration of a Customers list with no duplicates:
Dim Noduplicates = _
From c As Customer _
In myCustomers _
Select c Distinct
You could create the same without using LINQ by using the extension methods directly:
Sub NoLINQ()
    Dim fc As New Func(Of Customer, Customer) _
        (AddressOf x_GetCustomer)
    Dim Noduplicates As IEnumerable(Of Customer) = _
        System.Linq.Enumerable.Select _
        (Of Customer, Customer) _
        (myCustomers, fc) _
        .Distinct
End Sub

Function x_GetCustomer(ByVal obj As Customer) _
    As Customer
    Return obj
End Function
If you look carefully, you can see the two methods being called, Select and Distinct. This is the equivalent to the previous simple LINQ statement. Given this knowledge, and the knowledge you have in regard to precedence rules for extension methods, this means you can write your own extension methods to replace the default ones normally used in a LINQ statement.
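The same equivalence can be checked with a simpler element type. This sketch uses an array of strings in place of the Customers list, which is an assumption made to keep the example self-contained, and writes the Select call through the dot syntax with a lambda.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

Public Module QueryDemo
    Public Sub Main()
        Dim names As String() = {"Ann", "Bob", "Ann"}

        ' Query syntax.
        Dim viaQuery As IEnumerable(Of String) = _
            From n In names Select n Distinct

        ' The same two extension methods called directly.
        Dim viaMethods As IEnumerable(Of String) = _
            names.Select(Function(item) item).Distinct()

        Console.WriteLine(String.Join(",", viaQuery.ToArray()))     ' Ann,Bob
        Console.WriteLine(String.Join(",", viaMethods.ToArray()))   ' Ann,Bob
    End Sub
End Module
```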
For example, assume you have a Customer class that filters a collection to get a Distinct list. If the author of the Customer class either didn't override both the Equals and GetHashCode functions, or didn't set them to the values you want to test equality on, then the LINQ Distinct extension method won't give you the results you want. You can change this by adding your own Distinct extension that has a higher precedence over the System.Linq.Enumerable.Distinct extension method (Listing 3).
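Listing 3 is not reproduced here. As a rough sketch of the idea, the following defines a Distinct extension aimed specifically at sequences of a hypothetical Customer type and compares customers by name. The Customer class, its Name field, and the comparison rule are all assumptions.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports System.Runtime.CompilerServices

' Hypothetical Customer that does not override Equals or GetHashCode.
Public Class Customer
    Public Name As String

    Public Sub New(ByVal name As String)
        Me.Name = name
    End Sub
End Class

Public Module CustomerExtensions
    ' Non-generic and declared alongside the calling code, so it is
    ' intended to be preferred over the generic Enumerable.Distinct.
    <Extension()> _
    Public Function Distinct(ByVal source As IEnumerable(Of Customer)) As IEnumerable(Of Customer)
        Dim seen As New HashSet(Of String)()
        Dim result As New List(Of Customer)()
        For Each c As Customer In source
            ' HashSet.Add returns False when the name was already seen.
            If seen.Add(c.Name) Then
                result.Add(c)
            End If
        Next
        Return result
    End Function
End Module

Public Module DistinctDemo
    Public Sub Main()
        Dim myCustomers As Customer() = { _
            New Customer("Ann"), New Customer("Bob"), New Customer("Ann")}

        Dim unique As IEnumerable(Of Customer) = myCustomers.Distinct()

        ' 2 when the Customer-specific Distinct is bound; the default
        ' reference-equality comparison would report 3.
        Console.WriteLine(unique.Count())
    End Sub
End Module
```

Keeping the module in the same namespace as the queries is what places it closer to the calling code than the imported System.Linq methods.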
If you recall the rules around precedence, you might be wondering whether methods in your collection classes might Shadow LINQ statements. The answer to this question is that instance methods will shadow by name and signature any extension methods. In the March CTP, VB behaves differently (sidebar "Out of the Shadows").
Shadowing by name and signature in this case refers to methods that would resolve to the same method if both were instance methods. Remember that extensions are Shared methods whose first parameter is the instance. Hence, when determining whether an extension method is shadowed by an instance method, the first parameter of the extension method is removed from its signature.
For example, a Select extension would be shadowed by an instance method, as long as the instance method has the correct and matching signature.
' inside a module
<Extension()> _
Public Function [Select]( _
    Of TSource, TResult) _
    (ByVal source As _
    IEnumerable(Of TSource), _
    ByVal selector As Func( _
    Of TSource, TResult)) _
    As IEnumerable(Of TResult)
    ....
End Function

' instance method inside the collection class
Public Function [Select]( _
    Of TResult) _
    (ByVal selector As Func( _
    Of T, TResult)) _
    As IEnumerable( _
    Of TResult)
    ....
End Function
The source parameter is removed from the signature for the extension method, generic parameters are resolved, and return types are ignored to determine if the instance method shadows the extension. If they match, the instance method gets precedence.
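A runnable sketch of this matching rule follows. TrackedList and its SelectCalls counter are hypothetical; the counter exists only to show which Select was bound.

```vb
Imports System
Imports System.Collections.Generic
Imports System.Linq

' Hypothetical collection class with its own Select.
Public Class TrackedList(Of T)
    Inherits List(Of T)

    Public SelectCalls As Integer = 0

    ' Same shape as Enumerable.Select once the source parameter is removed.
    Public Function [Select](Of TResult)(ByVal selector As Func(Of T, TResult)) As IEnumerable(Of TResult)
        SelectCalls += 1
        Dim result As New List(Of TResult)()
        For Each item As T In Me
            result.Add(selector(item))
        Next
        Return result
    End Function
End Class

Public Module SelectShadowDemo
    Public Sub Main()
        Dim items As New TrackedList(Of String)()
        items.Add("a")
        items.Add("bb")

        ' Binds to the instance method even though System.Linq is imported.
        Dim lengths As IEnumerable(Of Integer) = items.Select(Function(s) s.Length)

        Console.WriteLine(items.SelectCalls)   ' 1
        Console.WriteLine(lengths.Count())     ' 2
    End Sub
End Module
```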
The difficulty in using instance methods rather than extension methods for LINQ queries is that every method in the chain must return a type that contains your instance methods. Taking the earlier example of selecting a distinct list, the Distinct method must be inside the type returned by the Select method. In that particular example, you can have Select return the same collection class and put your Distinct method in there. This becomes more complex when dealing with other projections and anonymous types. Rather than use instance methods, you'll find it generally easier to import a module with your extension methods in it.
Extension methods can make customized LINQ queries easier to create and easier to share between your different libraries or collection classes. They are powerful, yet incredibly simple in essence; used wisely, they'll make your coding significantly easier.
About the Author
Bill McCarthy is an independent consultant based in Australia and is one of the foremost .NET language experts specializing in Visual Basic. He has been a Microsoft MVP for VB for the last nine years and sat in on internal development reviews with the Visual Basic team for the last five years where he helped to steer the language's future direction. These days he writes his thoughts about language direction on his blog at http://msmvps.com/bill.