Code Focused

It's All About Character in C# and Visual Basic

How C# and Visual Basic handle characters and single-character strings is a bit inconsistent. There's no tidy solution, but it's something you do need to know.

Characters are the size zero of the .NET text data universe. Even though they're built from the same basic DNA as strings, their ability to hold just one character at a time is a bit of a handicap. This might explain why Visual Basic and C# are inconsistent in how they treat characters and single-character strings.

I discovered these inconsistencies the hard way, when trying to migrate a Luhn validation routine from Visual Basic to C#. The Luhn algorithm is a checksum process that ensures the validity of a string of digits, such as the digits in a credit card account number. Part of that algorithm adds up alternate digits:

For position = accountNumber.Length - 1 To 2 Step -2
  sumDigits += CInt(Mid(accountNumber, position, 1))
Next position

The loop extracts one digit at a time from the original numeric string, and totals the numeric values of those digits. C# lacks an intrinsic equivalent for the Visual Basic Mid function. However, you can extract a single character from a source string by treating the string as an array:

char oneCharacter = originalString[position];

It seemed quite natural to use this character selection method in the converted C# Luhn logic, making the appropriate adjustments to account for the Mid function's 1-based positioning system:

for (int position = accountNumber.Length - 1;
     position >= 2; position -= 2)
  sumDigits += (int)accountNumber[position - 1];

Imagine my surprise when perfectly valid credit card account numbers were suddenly identified as problematic. The failure stems from differences between single-character strings and true character instances, and between Visual Basic conversions and C# casts. Whereas the Visual Basic code converts the string "1" to an actual numeric value of 1 (one), the cast of the '1' character in the C# code produces that character's code value, 49 (its ASCII and Unicode value).
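To see the difference side by side, here is a minimal sketch. The class name, method names, and the sample digits are made up for illustration; only the cast, the string indexer, Substring, and Convert.ToInt32 come from the discussion above.

```csharp
using System;

public static class CastVersusConvert
{
    // Casting a char to int exposes its character code.
    public static int CastCharacter(string source, int index)
    {
        return (int)source[index];
    }

    // Converting a one-character string parses the digit it contains.
    public static int ConvertSubstring(string source, int index)
    {
        return Convert.ToInt32(source.Substring(index, 1));
    }

    public static void Main()
    {
        string accountNumber = "1234"; // hypothetical sample digits
        Console.WriteLine(CastCharacter(accountNumber, 0));    // 49
        Console.WriteLine(ConvertSubstring(accountNumber, 0)); // 1
    }
}
```

Both calls look at the same first character, yet only the second one returns the digit a checksum needs.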

The anticipated solution is to use single-character strings in C#, just as was done in the Visual Basic code:

sumDigits += (int)accountNumber.Substring(position - 1, 1);

This, of course, doesn't work. C# has no cast from string to int, so the line won't even compile. Instead, the correct solution requires that you abandon native C# conversion tools in favor of .NET methods:

sumDigits += Convert.ToInt32(accountNumber.Substring(position - 1, 1));
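For context, here is one way the corrected conversion might sit inside a complete Luhn check. This is a sketch and not the routine I migrated: it walks the string in a single zero-based pass instead of the alternating loop shown earlier, the names LuhnSketch, IsValid, and doubleIt are invented for the example, and it assumes the input holds nothing but the digits 0 through 9.

```csharp
using System;

public static class LuhnSketch
{
    // Returns true when the digit string passes the Luhn checksum.
    // Assumes accountNumber contains only the characters '0' through '9'.
    public static bool IsValid(string accountNumber)
    {
        int sumDigits = 0;
        bool doubleIt = false; // the rightmost digit is not doubled

        for (int index = accountNumber.Length - 1; index >= 0; index--)
        {
            // Convert the one-character string, not the char, to get the digit value.
            int digit = Convert.ToInt32(accountNumber.Substring(index, 1));

            if (doubleIt)
            {
                digit *= 2;
                if (digit > 9) digit -= 9; // same as adding the two digits of the product
            }

            sumDigits += digit;
            doubleIt = !doubleIt;
        }

        return accountNumber.Length > 0 && sumDigits % 10 == 0;
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("4111111111111111")); // True (a widely used test number)
        Console.WriteLine(IsValid("4111111111111112")); // False
    }
}
```

Swap the Convert.ToInt32 line for the (int) cast on accountNumber[index] and the first test number stops validating, which is exactly the failure described above.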

The Visual Basic CInt function doesn't permit conversions from its Char data type to Integer for precisely this reason: It doesn't know if you prefer ASCII values or a literal translation of the digit in the original string. But C# allows this, expecting you to have an innate understanding that ASCII values will jump out of the cast. By all means, make sure you are born with this understanding. Or test your conversions -- that might work, as well.
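If you would rather stay with characters in C#, you can at least make your intent explicit. The sketch below uses hypothetical helper names (CharacterIntent, CodeOf, DigitOf) and assumes the character is one of '0' through '9'. The framework calls themselves (char.GetNumericValue, Convert.ToInt32, int.Parse) are standard .NET members.

```csharp
using System;

public static class CharacterIntent
{
    // Intent: the character code (49 for '1').
    public static int CodeOf(char c)
    {
        return (int)c;
    }

    // Intent: the digit's numeric value (1 for '1').
    // Assumes c is one of '0' through '9'.
    public static int DigitOf(char c)
    {
        return c - '0';
    }

    public static void Main()
    {
        char one = '1';
        Console.WriteLine(CodeOf(one));                 // 49
        Console.WriteLine(DigitOf(one));                // 1
        Console.WriteLine(char.GetNumericValue(one));   // 1 (returned as a double)
        Console.WriteLine(Convert.ToInt32(one));        // 49: the char overload returns the code, too
        Console.WriteLine(int.Parse(one.ToString()));   // 1
    }
}
```

Note the fourth line: Convert.ToInt32 yields the digit only when you hand it a string. Pass it a char and you are back to 49.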

About the Author

Tim Patrick has spent more than thirty years as a software architect and developer. His two most recent books on .NET development -- Start-to-Finish Visual C# 2015, and Start-to-Finish Visual Basic 2015 -- are available from http://owanipress.com. He blogs regularly at http://wellreadman.com.
