VSM Cover Story

Load Up With VB's Operators

Take an in-depth look at operators in Visual Basic; examine the rules and guidelines for operator overloading; and learn about ternary operators and operator lifting.

Technology Toolbox: VB.NET

I listened recently to a Channel 9 recording of Anders Hejlsberg, Brian Beckman, and Erik Meijer that caught me by surprise. Brian praised VB's dynamic features, yet said he used C# because it had operator overloading. You could be forgiven if you took that to mean VB doesn't have operator overloading; that's a common mistake. Even though operators are an integral part of coding, they are often overlooked or misunderstood. Some of the best, most brilliant programmers have made mistakes in choosing which operator to use to join two strings, how to combine Enum values, or even the right division operator to use. In this article, I'll give you the low-down on operators in VB, including a basic walkthrough of the operators that exist and how to use them, along with some general caveats that trip up even the biggest rock stars of VB.

One of the most common misunderstandings I hear is the claim that VB doesn't have operator overloading. In 2005, VB8 introduced full support for consuming and defining operator overloads (Table 1 has a complete list of overloadable operators). Operator overloading refers to defining how operators such as these behave for your own types:

+ - / * = <>

Table 1 Meet Your VB Overloads!
A common misperception is that VB doesn't have operator overloading. In fact, it does, although you need to be careful about how you use this feature (and operators in general). This list details which operators can be overloaded, directly or indirectly, and which can't be overloaded at all.

In practice, custom operator overloading is rarely used because it usually has more drawbacks than benefits. For example, operators are static methods (Shared), which means they are resolved at compile time not runtime. This limits the polymorphic behavior. Also, they have poor discoverability. It isn't obvious what operators, if any, a type might have. Compared to the "dot" syntax for method calls, operators require a more intimate knowledge of the type.

You can classify the overloadable operators into three basic categories: mathematical and logical operators such as \, /, +, -, *, And, and Or; equality and inequality operators such as = and <>; and the conversion operator CType.

Generally, you should avoid defining custom operators in the mathematical and logical category, unless your type is similar to an intrinsic type. For example, you might use these operators for matrices, points, DBTypes, or simple structures. Operators are Shared, so it's preferable that your class be NotInheritable or a Structure. You should also ensure that you define instance methods with clear, straightforward names that expose the same functionality as the operators. For example, if you overload the + operator, you should also include a method named Add. This provides better discoverability and support for other languages that might not have operator overloading.
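
The pairing of an operator with a named method can be sketched as follows. This is a minimal illustration, not code from the framework: the Vector2 structure and its members are hypothetical names chosen for the example.

```vb
Option Strict On
Imports System

' Hypothetical 2-D vector type, used only for illustration.
Public Structure Vector2
    Public ReadOnly X As Double
    Public ReadOnly Y As Double

    Public Sub New(ByVal xValue As Double, ByVal yValue As Double)
        X = xValue
        Y = yValue
    End Sub

    ' Named instance method: easy to discover and callable from
    ' languages that don't support operator overloading.
    Public Function Add(ByVal other As Vector2) As Vector2
        Return New Vector2(X + other.X, Y + other.Y)
    End Function

    ' The operator simply forwards to the named method.
    Public Shared Operator +(ByVal left As Vector2, _
        ByVal right As Vector2) As Vector2
        Return left.Add(right)
    End Operator
End Structure

Module AddDemo
    Sub Main()
        Dim a As New Vector2(1, 2)
        Dim b As New Vector2(3, 4)
        Dim viaOperator As Vector2 = a + b
        Dim viaMethod As Vector2 = a.Add(b)
        Console.WriteLine("{0}, {1}", viaOperator.X, viaOperator.Y) ' 4, 6
        Console.WriteLine("{0}, {1}", viaMethod.X, viaMethod.Y)     ' 4, 6
    End Sub
End Module
```

Because the type is a Structure, it can't be inherited from, which sidesteps the question of how a Shared operator should behave for derived types.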

The equality and inequality operators are more useful, but require special attention. VB has two equality operators: Is and =. You use the Is operator to test whether the references are equal; that is, the Is operator lets you determine whether two variables point to the same object in memory (same heap location). You use the = operator to test whether the values of two variables are equal; the variables can have value equality, yet be two different instances.

Some languages have only one equality operator. For example, C# has only the == operator for equality testing. In C#, code such as if (a == b) forces a reference equality test (equivalent to Is in VB), as long as no overloaded equality operator has been defined; otherwise, C# performs a value equality test. You can force C# to do a reference equality test by casting each operand to an object:

if ((object) a == (object) b)

When working with reference types in VB, developers don't expect to use the = equality operator. The reason for this: Only a handful of classes in the framework actually overload the = operator, so developers are unlikely to use the = operator even if you provide it. For C#, the behavior of an overloaded equality operator could also be unexpected, resulting in a value test, not a reference test. Thus, you should avoid overloading the = and <> operators for reference types. Instead you should override the Equals method, the GetHashCode method, and consider implementing IEquatable(Of T).

When dealing with value types (structures) you should also override the Equals and GetHashCode methods, consider implementing IEquatable(Of T), and consider adding custom operators for = and <>. These operators enable simple value testing of structures using code such as If a = b:

Shared Operator = (ByVal left As Foo, _
	ByVal right As Foo) As Boolean
	....
End Operator
	
Shared Operator <> (ByVal left As Foo, _
	ByVal right As Foo) As Boolean
	....
End Operator

You should include the IComparable(Of T) interface implementation if you decide you also want to define the < and > operators on your structure.
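
Putting the equality guidance for structures together might look like the sketch below. The GridCell structure is a hypothetical type invented for this example, and the hash combination is one simple choice among many.

```vb
Option Strict On
Imports System

' Hypothetical value type, used only for illustration.
Public Structure GridCell
    Implements IEquatable(Of GridCell)

    Public ReadOnly Row As Integer
    Public ReadOnly Column As Integer

    Public Sub New(ByVal rowIndex As Integer, ByVal columnIndex As Integer)
        Row = rowIndex
        Column = columnIndex
    End Sub

    ' Strongly typed equality; avoids boxing the argument.
    Public Overloads Function Equals(ByVal other As GridCell) As Boolean _
        Implements IEquatable(Of GridCell).Equals
        Return Row = other.Row AndAlso Column = other.Column
    End Function

    Public Overloads Overrides Function Equals(ByVal obj As Object) As Boolean
        Return TypeOf obj Is GridCell AndAlso Equals(DirectCast(obj, GridCell))
    End Function

    ' Equal values must produce equal hash codes.
    Public Overrides Function GetHashCode() As Integer
        Return Row.GetHashCode() Xor Column.GetHashCode()
    End Function

    Public Shared Operator =(ByVal left As GridCell, _
        ByVal right As GridCell) As Boolean
        Return left.Equals(right)
    End Operator

    Public Shared Operator <>(ByVal left As GridCell, _
        ByVal right As GridCell) As Boolean
        Return Not left.Equals(right)
    End Operator
End Structure

Module EqualityDemo
    Sub Main()
        Dim a As New GridCell(2, 3)
        Dim b As New GridCell(2, 3)
        Console.WriteLine(a = b)        ' True
        Console.WriteLine(a <> b)       ' False
        Console.WriteLine(a.Equals(b))  ' True
    End Sub
End Module
```

Both operators delegate to Equals, so there is a single place where equality is defined.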

Casting and Conversion
VB has multiple operators for performing conversion and casting. DirectCast and TryCast permit inheritance-based casting, including casting based on interface implementation. TryCast returns Nothing if the cast fails, whereas DirectCast throws an exception. The CType operator attempts inheritance-based casting where applicable, as well as conversions for intrinsic types and any type that defines custom CType operators. You cannot change the inheritance casting rules; however, you can define custom conversion rules.
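
As a rough illustration of how the three operators differ, consider this small console sketch. The variable names are arbitrary and the module exists only for the example.

```vb
Option Strict On
Imports System

Module CastDemo
    Sub Main()
        Dim boxed As Object = "hello"

        ' TryCast returns Nothing when the cast isn't possible.
        Dim asString As String = TryCast(boxed, String)
        Dim asException As Exception = TryCast(boxed, Exception)
        Console.WriteLine(asString IsNot Nothing)   ' True
        Console.WriteLine(asException Is Nothing)   ' True

        ' DirectCast throws InvalidCastException for the same failure.
        Try
            Dim failed As Exception = DirectCast(boxed, Exception)
        Catch ex As InvalidCastException
            Console.WriteLine("DirectCast failed")
        End Try

        ' CType also performs conversions, such as String to Integer.
        Dim number As Integer = CType("42", Integer)
        Console.WriteLine(number + 1)               ' 43
    End Sub
End Module
```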

There are two types of conversion: narrowing and widening. A narrowing conversion is one where there might be data loss or the conversion might fail. A widening conversion is one where the new type is wider and holds all the data without loss or failure. An Int32 to Int64 conversion is a widening conversion; an Int64 to Int32 operation is a narrowing conversion. Widening permits implicit conversion.

An example of a Widening operator is the conversion of a System.Drawing.Point variable to an instance of System.Drawing.PointF. The PointF structure stores the X and Y values as Singles (floats), whereas Point stores them as Integers. This means the conversion from a Point to a PointF is a safe, widening conversion:

Public Shared Widening Operator CType( _
	ByVal p As Point) As PointF
	Return New PointF(p.X, p.Y)
End Operator

Note that the operator in this case can be defined in either the Point or PointF type, but not both. It makes more sense to have the operator in the type being converted from, rather than in the type being converted to: the Point type in this case. Doing so makes it easier for the compiler to resolve the cases where you have implicit conversions, as the type being converted from is known, while the type being converted to needs to be inferred.

You should add constructors to the type that allow for the other data to be passed in, rather than forcing a conversion to be called. This provides the same functionality as a Widening operator, but in a more discoverable way. For example, IntPtr accepts an Int32 in its constructor. Narrowing operators should be avoided because they can fail at runtime or result in data loss that isn't necessarily obvious. Instead of providing a narrowing operator, you should add a ToXXX method such as ToVector.

For the most part, it's wise to avoid operator overloading. Adding equality and inequality operators to your Structures will help, and widening operators can also be useful, but generally these are the rare cases, not the mainstream practices. For more on operator overloading, I suggest you read Matthew Gertz's article online at http://tinyurl.com/39zjn2.

Inside Numeric Operators
The rest of this article will concentrate on how the normal operators work in VB.

All of VB's intrinsic numeric types rely on the language to provide the operators. For example, they don't have an operator method that performs addition, such as op_Addition. There are a couple of reasons for this. First, the number of overloads for op_Addition needed to support such an operator would be large and potentially confusing. For example, adding a double to an integer could be defined in the double or in the integer type, and both patterns, double + integer and integer + double, would need to be defined. Add to that the performance overheads, and it's a better design for these operations to have intrinsic support at the IL level and to make languages map directly to those IL constructs.

You might be tempted to think the standard operators—addition (+), subtraction (-), multiplication (*) and division (/)—are self explanatory, but even simple operations like these on numeric types can behave differently in certain conditions. If both operands are in the integer family, then the default behavior is for the operation to throw an error if an overflow occurs. For example, 2147483647 * 2 typically produces an error because it's too large to fit in an Int32. But if you turn off overflow checking in the advanced compile options for the project, the result is -2, not an error.

In hex notation, 2147483647 is &H7FFFFFFF. For an Int32, &H80000000 is the sign bit (bit 31), so the maximum value an Int32 can hold is any value before that bit is set, which is &H7FFFFFFF. If you add 1 to &H7FFFFFFF, the value is 2147483648, but because that also sets the sign bit, the value is interpreted as -2147483648. Adding 1 to the number can cause the value to jump from a large positive value to a large negative value. This is unexpected behavior for most people, so you shouldn't turn off overflow checking unless you are sure you have dealt with all possible results.
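
The following sketch shows the two possible outcomes. Which branch runs depends on the project's overflow-checking setting, so no single output is claimed here; the module name is arbitrary.

```vb
Option Strict On
Imports System

Module OverflowDemo
    Sub Main()
        ' A variable is used so the multiplication happens at run time
        ' rather than being evaluated by the compiler.
        Dim max As Integer = Integer.MaxValue   ' &H7FFFFFFF
        Try
            Dim doubled As Integer = max * 2
            ' Reached only when integer overflow checks are turned off;
            ' the wrapped result is then -2.
            Console.WriteLine(doubled)
        Catch ex As OverflowException
            ' Reached with the default setting (overflow checks on).
            Console.WriteLine("Overflow detected")
        End Try
    End Sub
End Module
```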

VB also has a few less well known operators for numeric types: ^, Mod, and \. ^ is the exponent operator, so 3 ^ 2 is three squared, which equals 9. 4 ^ 4 is four to the fourth power, which is 256. (Note that C# does not have an exponent operator; in C#, ^ is the XOr operator.)

Mod is the modulus operator, which returns the remainder from any division:

result = operand1 Mod operand2

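Operand2 is the divisor, so for positive operands the result will be in the range from 0 to less than Operand2. For example, 5 Mod 3 returns 2, and 5.2 Mod 2.5 returns approximately 0.2 (subject to floating-point rounding). When Operand1 is negative, the result is negative or zero: -5 Mod 3 returns -2.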

VB provides two operators for division: / and \. The normal division operator, /, returns the value as a double. For example, 5/2 returns 2.5. The \ operator performs integer division and returns the integer part of the result. This means that 5\2 returns 2. In C#, / is a division based on the operands, so 5/2 returns 2, but 5.0/2 returns 2.5. The VB way is more explicit, and hence, less surprising. If you are translating C# code to VB, where both operands are integers, use \ instead of /.
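
A quick way to check the behavior of these operators is a console sketch like this one. It only restates the examples from the text, plus one negative Mod case.

```vb
Option Strict On
Imports System

Module NumericDemo
    Sub Main()
        Console.WriteLine(5 / 2)     ' 2.5 (decimal separator follows the current culture)
        Console.WriteLine(5 \ 2)     ' 2
        Console.WriteLine(5 Mod 3)   ' 2
        Console.WriteLine(-5 Mod 3)  ' -2
        Console.WriteLine(3 ^ 2)     ' 9
        Console.WriteLine(4 ^ 4)     ' 256
    End Sub
End Module
```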

The basic operators— = (equality), <> (inequality), < (less than), and > (greater than)—behave as expected. When translating C# code to VB, == becomes =, and != becomes <>.

Work With Bits
When working with VB and bitwise operations, you'll probably notice a departure from the normal English usage. Let's say I have a bag of apples, and you ask me how many apples I have. I might answer, "1 Or 2." Would you expect that means "3"? Or, assume I peered into the bag and said, "4 And 8." You'd probably be thinking I have 12 apples right? Wrong!

In VB, And is a bitwise operation on integer types, so 4 And 8 returns 0. And is a bit masking operation; the value returned is the value of the bits where the coinciding bits in each operand are both set. 4 in binary looks a bit like 00000100, and 8 looks like 00001000. If you line them up with one under the other, you'll see there are no bits set in the same position, so the result is 0. The second operand works like a mask; only those bits can be returned.

The Or operator returns all the bits for both operands, so 1 Or 2 returns 3, 2 Or 2 returns 2, and 4 Or 8 returns 12:

4	00000100
8	00001000
12	00001100

One of the most common mistakes I see in code when working with integer values is the use of + where the developer meant to use Or. For example, 2 + 2 gives a value of 4, which actually excludes the original bits.
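
The And, Or, and + results discussed above can be verified with a short sketch. Convert.ToString with a base of 2 is used only to display the bit pattern.

```vb
Option Strict On
Imports System

Module BitwiseDemo
    Sub Main()
        Console.WriteLine(4 And 8)   ' 0
        Console.WriteLine(4 Or 8)    ' 12
        Console.WriteLine(2 Or 2)    ' 2
        Console.WriteLine(2 + 2)     ' 4

        ' Show the bits of 4 Or 8, padded to eight digits.
        Console.WriteLine(Convert.ToString(4 Or 8, 2).PadLeft(8, "0"c)) ' 00001100
    End Sub
End Module
```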

XOr is another bitwise operator, one you're not likely to use as regularly as the other operators. XOr returns the bit values where only one of the coinciding bits in both operands is set. Hence, 2 Xor 2 returns 0; 1 Xor 2 returns 3; and 1 Xor 3 returns 2. This exclusive Or behavior used to be one of the typical "how to tell if someone is a programmer" questions. The question often took a form like this: How do you swap the values in two variables without using a third variable? The answer is to do three XOr operations:

	i = j Xor i
	j = j Xor i
	i = j Xor i

There are other uses for XOr, apart from showing how devilishly smart you are, but they are few.

The bitwise negation operator in VB is Not. Each bit becomes a 1 if it was a 0, and vice versa (one's complement). For a signed integer, Not 0 returns -1 because -1 means all bits are set (&HFFFFFFFF). More typically, you use Not as part of bit masking to remove a particular value. For example, you can use this syntax to remove the sign bit from an integer value:

 j = i And Not &H80000000

You might have heard assertions to the contrary, but VB also has bit-shifting operators: left shift << and right shift >>. The shift operations are arithmetic. For left shift, this means the bits are moved to the left and the lower bits are filled with zeros. Each left shift is the equivalent of multiplying by 2 with overflow checking turned off. Right shift moves the bits to the right and fills the higher order bits with the same value as the sign bit. For positive integers, this is the equivalent of integer division by 2. For negative integers, shifting to the right by one bit is roughly the same as division by 2, although the result can vary by 1. If you continue to shift a negative value to the right, eventually all bits will be set as the higher order bits are filled with 1. The end result is -1.

Bit shifting is often useful to help mask or isolate particular values.
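
As a sketch of shifting combined with masking, the example below pulls the individual channels out of a packed 32-bit color value. The packed layout (alpha, red, green, blue from high byte to low byte) is an assumption made for the example.

```vb
Option Strict On
Imports System

Module ShiftDemo
    Sub Main()
        Console.WriteLine(1 << 3)    ' 8
        Console.WriteLine(-8 >> 1)   ' -4
        Console.WriteLine(-7 >> 1)   ' -4, whereas -7 \ 2 is -3
        Console.WriteLine(-1 >> 1)   ' -1

        ' Hypothetical packed color: alpha FF, red 33, green 66, blue 99.
        Dim argb As Integer = &HFF336699

        ' Shift the wanted byte into the low position, then mask off
        ' everything else (including any sign-extended bits).
        Dim alpha As Integer = (argb >> 24) And &HFF
        Dim red As Integer = (argb >> 16) And &HFF
        Dim green As Integer = (argb >> 8) And &HFF
        Dim blue As Integer = argb And &HFF

        Console.WriteLine("{0} {1} {2} {3}", alpha, red, green, blue) ' 255 51 102 153
    End Sub
End Module
```

The And &HFF mask matters for the alpha channel: because right shift is arithmetic, argb >> 24 on its own would give -1 rather than 255.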

Use the Right Operators
Enums deserve special mention. Operators for Enums are the same as the Enum's underlying type: Short, Byte, Integer, Long, and so on.

However, the fact that you can use the underlying type's operators doesn't mean you should. Family and friends kindly inform me that my receding hairline is indicative of the early stages of male pattern baldness, but I know better. Too often, I've found myself pulling my hair out when I see developers use + or - with Enums, when they really should have used bitwise operations (sidebar, "The Do's and Don'ts of VB Operators").

Overloadable?	Operators	Special Instructions
Yes; directly	+, -, *, /, \, &, ^, >>, <<, =, <>, >, >=, <, <=, And, Like, Mod, Or, Xor, IsFalse, IsTrue, Not, CType	
Yes; indirectly	AndAlso, OrElse	Overload IsFalse and IsTrue for use with AndAlso and OrElse
Yes; indirectly	CBool, CByte, CChar, CDate, CDec, CDbl, CInt, CLng, CObj, CSByte, CShort, CSng, CStr, CUInt, CULng, CUShort	Overload CType to change these
Yes; indirectly	&=, +=, -=, *=, \=, /=, ^=, <<=, >>=	Overload the corresponding operator without the assignment. For example, overload & to indirectly overload &=
No	AddressOf, DirectCast, Is, IsNot, TryCast, TypeOf	

The Do's and Don'ts of VB Operators

Most developers—even accomplished developers—are susceptible to making mistakes when working with operators, not least because the way operators behave can change depending on the context. So, for example, using + to concatenate strings can work for you most of the time, but every so often, it can lead to unintended effects. This short list of do's and don'ts should help you manage your operators more easily and keep you from experiencing the aforementioned unexpected behaviors that are common to many Visual Basic applications.

DO use & for concatenation of strings.
DO provide well named methods alongside your operator overloads.
DO use Or to combine Enum values.
DO use And Not to remove an Enum or Integer value.
DO use XOr to toggle a bitwise value (Enums or Integers).
DO consider overloading value equality, =, in Structures.
DO use ( ) to break up complex statements and force precedence.
DO use hex notation for bitwise Enums.
DO mark bitwise Enums with the Flags attribute.
DO learn about and use operators wisely.
DON'T use + for string concatenation.
DON'T use + for combining Enum values.
DON'T use - for removing an Enum value.
DON'T overload operators in non-sealed (inheritable) types.
DON'T go wild with operator overloading.

An Enum can be a single choice, or it can be multiple choice. For example, an Enum for Gender would be a single choice, whereas an Enum for FilePermissions might be made up of several bit fields. When you define an Enum that has multiple bit fields you should attribute it with the FlagsAttribute:

<Flags()> Enum FilePermissions
	Create = &H1
	Read = &H2
	Write = &H4
	Delete = &H8
	ReadWrite = Read Or Write
	All = -1
End Enum

The FlagsAttribute indicates that a valid value can consist of any combination of the Enum's values. If you wanted to declare a FilePermissions value that included Create, Read, and Write, you might be tempted to add them together. That approach would work, but it is incredibly fragile in this case. The 1 + 2 + 4 would give you 7, the desired result, but what if any of those values overlapped? Assume you wrote:

Dim permissions = _
	FilePermissions.Write + _
	FilePermissions.ReadWrite

Write is 4 and ReadWrite is 6, so the sum is 10 (&HA). The permissions granted would end up being Delete and Read, and you wouldn't be able to write! The correct operator to use when combining Enum values is Or:

Dim permissions = _
	FilePermissions.Write Or _
	FilePermissions.ReadWrite

Note that I used Hex notation for the values in the Enum definition. This makes it easier to ensure that only the individual bits are set. Hex notation uses 0-9 and A-F, providing 16 values for each character, which in turn means 4 bits per character. The bits when set are 1, 2, 4, and 8. So, with Hex notation, if you see any characters or values other than 0, 1, 2, 4, or 8, you know that the code is setting a combination of bits, and possibly should be rewritten. For example, the ReadWrite value could have been written as &H6, but it is safer to write that as Read Or Write, with both the Read and the Write values as single bits. One exception to my 0, 1, 2, 4, 8 only rule is when you want to add an All value. You could bitwise Or all the other values, but it's better to have the value allow for any future additions. To that end, you might set all bits with &HFFFFFFFF, which is -1 for an Int32.
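
The difference between + and Or when values overlap can be checked with a sketch like this one. It repeats the FilePermissions declaration (without the All member) so that it compiles on its own, and the explicit CInt conversions are there only to make the arithmetic visible.

```vb
Option Strict On
Imports System

<Flags()> Enum FilePermissions
    Create = &H1
    Read = &H2
    Write = &H4
    Delete = &H8
    ReadWrite = Read Or Write
End Enum

Module CombineDemo
    Sub Main()
        ' Adding overlapping values carries into a different bit.
        Dim added As Integer = _
            CInt(FilePermissions.Write) + CInt(FilePermissions.ReadWrite)

        ' Or simply keeps every bit that is set in either operand.
        Dim combined As FilePermissions = _
            FilePermissions.Write Or FilePermissions.ReadWrite

        Console.WriteLine(added)            ' 10 (&HA): Delete Or Read
        Console.WriteLine(CInt(combined))   ' 6 (&H6): Read Or Write
        Console.WriteLine(combined = FilePermissions.ReadWrite)  ' True
    End Sub
End Module
```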

When you need to determine if an Enum value includes a particular value, you use the And operator to bit mask the Enum value with that value in question. For example, assume you want to determine whether you have Write permissions. You bitwise And the value with FilePermissions.Write:

' Parentheses are required: = binds more tightly than And
If (myPermissions And _
	FilePermissions.Write) = _
	FilePermissions.Write Then

Note that you might be tempted to write that code by comparing the result to 0, either explicitly or implicitly:

' explicit comparison to 0
If (myPermissions And _
	FilePermissions.Write) <> 0 Then

' implied comparison to 0 
' requires Strict Off
If myPermissions And _
	FilePermissions.Write Then

This code will work in this case because Write is a single bit. In other situations it might give false positives: when the value you are testing for has more than one bit set (ReadWrite, for example), the bitmask gives a non-zero value as soon as any one of those bits is set, so the result might not be the entire value you were looking for. For this reason, you should avoid comparisons with 0 when bit masking.
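
The false positive is easy to reproduce with a multi-bit value. In this sketch, HasAll is a hypothetical helper written for the example (it is not a framework method), and the Enum is repeated so the code compiles on its own.

```vb
Option Strict On
Imports System

<Flags()> Enum FilePermissions
    Create = &H1
    Read = &H2
    Write = &H4
    Delete = &H8
    ReadWrite = Read Or Write
End Enum

Module FlagsDemo
    ' Hypothetical helper: True only when every bit in required is set.
    Function HasAll(ByVal value As FilePermissions, _
        ByVal required As FilePermissions) As Boolean
        Return (value And required) = required
    End Function

    Sub Main()
        Dim granted As FilePermissions = FilePermissions.Read

        ' Comparing with 0 reports a match even though Write is missing.
        Console.WriteLine((granted And FilePermissions.ReadWrite) <> 0) ' True
        Console.WriteLine(HasAll(granted, FilePermissions.ReadWrite))   ' False

        ' Or adds a value; And Not removes one.
        granted = granted Or FilePermissions.Write
        Console.WriteLine(HasAll(granted, FilePermissions.ReadWrite))   ' True
        granted = granted And Not FilePermissions.Read
        Console.WriteLine(granted.ToString())                           ' Write
    End Sub
End Module
```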

Drill Down on String Operators
VB provides concatenation operators and equality comparison operators for strings. For concatenation, you can choose between the + and & operators. The & operator concatenates strings and intrinsic types (all the numeric types, Booleans, and Dates), even when Option Strict is On. For example, 2 & 3 returns the string "23".

The + operator will only concatenate when both operands are strings. If Strict is Off, and one operand is numeric and the other a string, the + operator will try to convert the string to a double. This can result in a runtime error. 2 + "3" will return the number 5, not the string "23", but 2 + "January" results in an error. 2 & "January" always returns "2January".

So, for concatenation, use the & operator and avoid the + operator.
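
To see the two operators side by side, a sketch along these lines can help. It sets Option Strict Off deliberately, because the mixed-type + expressions don't compile with Strict On; the variable names are arbitrary.

```vb
Option Strict Off
Imports System

Module ConcatDemo
    Sub Main()
        Dim two As Integer = 2
        Dim three As String = "3"

        Console.WriteLine(two & three)      ' 23 (string concatenation)
        Console.WriteLine(two + three)      ' 5  (numeric addition)
        Console.WriteLine(two & "January")  ' 2January

        Try
            ' The string can't be converted to a number.
            Console.WriteLine(two + "January")
        Catch ex As InvalidCastException
            Console.WriteLine("Conversion failed")
        End Try
    End Sub
End Module
```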

The comparison operators for strings (=, <>, <, and >) typically result in a call to String.CompareOrdinal. I say "typically" because the code that gets called depends on your setting for Option Compare. By default, Option Compare is set to Binary. However, if you set Option Compare to Text, then the comparison operators use the current thread's culture to do a culture-sensitive comparison, which ignores case. When you set Option Compare to Text, the expression "a" = "A" evaluates to True; with Binary, it evaluates to False.

Option Compare can be useful, but it can only be set at the module or application level. Binary is faster than Text and is more suitable for the default behavior. If you need case-insensitive comparisons, consider using the String.Equals or String.Compare methods instead.

Select Case blocks can also use the comparison, as dictated by Option Compare. If you set Option Compare to Binary, yet you want to do a case insensitive match, you can convert each string to upper case (avoid using ToLower).
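
One way to keep Option Compare Binary and still match without regard to case is sketched below. The userName variable and its value are placeholders for the example.

```vb
Option Strict On
Option Compare Binary
Imports System

Module CompareDemo
    Sub Main()
        Console.WriteLine("a" = "A")   ' False under Option Compare Binary
        Console.WriteLine(String.Equals("a", "A", _
            StringComparison.OrdinalIgnoreCase))       ' True
        Console.WriteLine(String.Compare("fred", "FRED", _
            StringComparison.OrdinalIgnoreCase) = 0)   ' True

        ' Hypothetical input value; upper-case it once before the Select Case.
        Dim userName As String = "fred"
        Select Case userName.ToUpperInvariant()
            Case "FRED"
                Console.WriteLine("matched FRED")
            Case Else
                Console.WriteLine("no match")
        End Select
    End Sub
End Module
```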

Strings are reference types in .NET, so you can also compare strings' references to see if they point to the same instance. You do this using the Is operator. By itself, a reference comparison wouldn't be useful for strings because you would have to assign one variable to the other for them to point to the same memory location. But .NET includes a string intern table that typically holds all string literals declared in an application:

Dim a, b, c, d, e As String
a = "Monica"
b = "Mon"
c = "ica"
d = b & c
e = String.Intern(d)

From the code declaration, the intern table holds three strings: "Monica", "Mon", and "ica". The variable d holds the string "Monica", but because that was constructed at runtime, it's not in the intern table. Testing a Is d returns False. For the variable e, the call to String.Intern finds "Monica" in the intern table and returns a reference to that string. Hence, e Is a returns True.
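
A runnable version of the comparison, assuming string interning has not been turned off for the assembly, might look like this:

```vb
Option Strict On
Imports System

Module InternDemo
    Sub Main()
        Dim a As String = "Monica"
        Dim b As String = "Mon"
        Dim c As String = "ica"
        Dim d As String = b & c              ' built at run time
        Dim e As String = String.Intern(d)   ' looks up the intern table

        Console.WriteLine(a = d)    ' True:  same characters
        Console.WriteLine(a Is d)   ' False: different instances
        Console.WriteLine(a Is e)   ' True:  both refer to the interned literal
    End Sub
End Module
```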

String interning can be useful when you need to compare a string to multiple other strings. If the strings being compared are all constants, they will be interned by default. You need to intern the string that you want to compare with the constant strings for the Is operator to work. Rather than intern strings that aren't a match, you use the String.IsInterned method. The IsInterned method returns a null reference if no match is found:

myString = String.IsInterned(myString)
If myString Is Nothing Then
	Return ' early exit
ElseIf myString Is "Bill" Then
 	...
ElseIf myString Is "Fred" Then
	...
ElseIf myString Is "George" Then
	...
End If

As of .NET 2.0, you can turn off string interning for assemblies that are NGen'ed by specifying the compiler relaxation attribute:

<Assembly: CompilationRelaxations _
	(CompilationRelaxations.NoStringInterning)>

In .NET 2.0, the String class was also modified so String.Empty is no longer interned. This means, as of .NET 2.0, If String.Empty Is "" returns False, whereas it returned True on earlier versions of .NET. Note the VS help topic documentation on this is incorrect. It has its example the other way round, which is impossible because it wasn't until .NET 2.0 that you could turn off string interning with NGen'd assemblies. In .NET 1.0 and 1.1 String.Empty was interned, in .NET 2.0 it isn't. The only safe way to test for String.Empty is to test whether the string's .Length = 0.

VB9's Not So IIf'y
VB9—the version of Visual Basic that will be found in Visual Studio 2008, previously codenamed Orcas—adds a new ternary operator, null coalescing, and operator lifting. The ternary operator is the equivalent of the ?: operator in C#:

name = If(Customer Is Nothing, "none", Customer.Name)

It is referred to as "ternary" because it has three operands. It has this general form:

If(<Boolean expression>, true result, false result)

This looks similar to the already existing IIf function in VB, but it behaves quite differently. IIf is a function, not an operator, so all the arguments passed to it are evaluated. If you were to attempt the previous example using IIf, and Customer was Nothing, then the Customer.Name argument would throw a NullReferenceException. With the If operator, only the true part or only the false part gets evaluated, not both.
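
The difference in evaluation can be demonstrated with a small sketch. The Customer class here is a hypothetical stand-in with a single Name field, and the code assumes VB9 or later.

```vb
Option Strict On
Imports System
Imports Microsoft.VisualBasic

' Hypothetical class, used only for illustration.
Class Customer
    Public Name As String
End Class

Module IfDemo
    Sub Main()
        Dim current As Customer = Nothing

        ' The If operator evaluates only the branch it needs.
        Dim displayName As String = _
            If(current Is Nothing, "none", current.Name)
        Console.WriteLine(displayName)   ' none

        ' IIf is an ordinary function, so current.Name is evaluated
        ' before the call and fails on the null reference.
        Try
            Dim viaIIf As Object = _
                IIf(current Is Nothing, "none", current.Name)
            Console.WriteLine(viaIIf)
        Catch ex As NullReferenceException
            Console.WriteLine("IIf evaluated both arguments")
        End Try
    End Sub
End Module
```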

VB9 also adds greater support for nullable types. Instead of declaring a variable as Nullable(Of Integer), you can declare it as Integer? (with a trailing question mark). The ? on the end indicates that it's a nullable type. These two declarations are equivalent:

Dim x As Integer?
Dim x As Nullable(Of Integer)

When you are working with nullable types, you can use the If operator for null coalescing:

If(nullable type, value if null)

For example, this code assigns y's value to x, or 0 if y is null:

Dim y As Integer? = GetAValue()
Dim x As Integer = If(y, 0)

The If operator used for null coalescing is the VB equivalent of the ?? operator in C#.

Last but not least, VB9 will let you use the operators that are defined in the underlying type for nullable types. So, if you define x, y, and z as Integer?, you can write x = y + z and the + operator for integers will be used if both y and z are not null. If either y or z is null, then x is null. The good news: Working with nullable types in VB9 will be a lot easier than it is today.
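
Operator lifting and null coalescing can be tried together in a sketch like the following, which assumes VB9 or later. The variable names are arbitrary.

```vb
Option Strict On
Imports System

Module NullableDemo
    Sub Main()
        Dim y As Integer? = 5
        Dim z As Integer? = Nothing

        ' The lifted + operator yields null when either operand is null.
        Dim sum As Integer? = y + z
        Console.WriteLine(sum.HasValue)   ' False

        ' The two-argument If supplies a fallback for the null case.
        Console.WriteLine(If(sum, 0))     ' 0

        ' With both operands present, normal Integer addition is used.
        Dim total As Integer? = y + 10
        Console.WriteLine(total.Value)    ' 15
    End Sub
End Module
```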

Operators, including the new ternary If, can make writing code a lot easier. As someone once said, "With great power comes great responsibility." It is up to you to use operators properly and be vigilant about their proper use. And keep in mind that others might not have the same knowledge as you, so try to keep the code readable. That probably means avoiding operator overloading unless it is for an obvious use. Your code is going to be a lot more maintainable if you use a temporary Integer when swapping values, rather than the triple XOr trick. Even simple code like 2 + 3 * 4 is easy for people to mistake. Is that 20 or 14? Use parentheses to make your code and your objective clear. 2 + (3 * 4) is the same thing, but it's better if no one has to fuss over precedence rules when reading your code. Focus on knowing the right operators you should use, when to use &, when to use +, when to use Or, and so on. Correct usage of operators does improve code readability and maintenance, and that is the final goal. And the nice thing is, operators can make getting there quicker and easier.

About the Author

Bill McCarthy is an independent consultant based in Australia and is one of the foremost .NET language experts specializing in Visual Basic. He has been a Microsoft MVP for VB for the last nine years and sat in on internal development reviews with the Visual Basic team for the last five years where he helped to steer the language’s future direction. These days he writes his thoughts about language direction on his blog at http://msmvps.com/bill.
