Code Focused

Overcoming Escape Sequence Envy in Visual Basic and C#

C# might be more elegant with escape sequences, but that doesn't mean Visual Basic is weaker in this area.

Sometimes you just need to escape. I'm speaking, of course, about escape sequences in strings. Standard C# strings have them in spades: \t for tabs, \r\n for line breaks, \u1234 for who-knows-what random Unicode character. Meanwhile, Visual Basic developers spend their days concatenating one special character after another onto their boring strings.

Consider something as simple as a notification that includes multi-line, tab-indented bullet items. C# allows the special bullet characters and the less-special line breaks and tabs to be incorporated into the strings themselves. But in Visual Basic, such treats must be attached by brute force:

// ----- Lean and mean C# version
MessageBox.Show(
  "To apply changes:\r\n" +
  "\u2022\tSave your work\r\n" +
  "\u2022\tRestart the program.");

' ----- Plodding Visual Basic version
MessageBox.Show(
  "To apply changes:" & vbCrLf &
  ChrW(8226) & vbTab & "Save your work" & vbCrLf &
  ChrW(8226) & vbTab & "Restart the program.")

Such Visual Basic code works just fine, of course. But it lacks the fluidity of the complex strings found in C#. Fortunately, there are ways to improve string processing in Visual Basic, up to and including the use of true escape sequences.

One way to reduce the level of concatenation is to use the String.Format method, integrating the special characters as if they were data values:

MessageBox.Show(String.Format(
  "To apply changes:{0}" &
  "{1}{2}Save your work{0}" &
  "{1}{2}Restart the program.",
  vbCrLf, ChrW(8226), vbTab))

Using String.Format in this way makes the core content much cleaner, but the special characters are still external to the content. Interpolated strings, new in Visual Basic 2015, allow these characters to be brought directly into the strings themselves:

' ----- Only in Visual Basic 2015 and beyond
Dim bulletPoint As Char = ChrW(8226)
MessageBox.Show(
  $"To apply changes:{vbCrLf}" &
  $"{bulletPoint}{vbTab}Save your work{vbCrLf}" &
  $"{bulletPoint}{vbTab}Restart the program.")

If readability is your goal, this is a great option. But there is still the issue of escape sequence envy, which doesn't go away easily. Fortunately, Visual Basic can handle strings with true escape sequences, albeit with the help of the .NET regular expression library. The Regex.Unescape method accepts a string with backslash-powered escape codes, and spits out string content with the sequences properly morphed into their intended characters. The conversion happens at run time, when the method is called, rather than at compile time as in C#:

' Assumes: "Imports System.Text.RegularExpressions"
MessageBox.Show(Regex.Unescape(
  "To apply changes:\r\n" &
  "\u2022\tSave your work\r\n" &
  "\u2022\tRestart the program."))

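To check that the unescaped text really matches the concatenated original, a small console sketch along the following lines can help. The module name and the sample file path are made up for illustration, and the listing is a minimal test harness rather than a recommended pattern:

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module UnescapeDemo   ' Hypothetical module name
    Sub Main()
        ' ----- Built the traditional way, with Visual Basic constants
        Dim concatenated As String =
            "To apply changes:" & vbCrLf &
            ChrW(8226) & vbTab & "Save your work"

        ' ----- Built from escape sequences, decoded at run time
        Dim unescaped As String = Regex.Unescape(
            "To apply changes:\r\n" &
            "\u2022\tSave your work")

        ' ----- Both approaches should yield identical content
        Console.WriteLine(concatenated = unescaped)   ' True

        ' ----- A literal backslash must be doubled; the path is sample data
        Console.WriteLine(Regex.Unescape("C:\\Temp\\notes.txt"))   ' C:\Temp\notes.txt
    End Sub
End Module
```

The last statement hints at the main cost of this approach: once a string passes through Regex.Unescape, every backslash in it is treated as the start of an escape sequence. Text such as file paths needs its backslashes doubled, and the method's documentation notes that an unrecognized sequence causes an ArgumentException.
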
Using Regex.Unescape in this way doesn't provide an exact match for what's available in C# strings, because the method follows regular-expression escape rules rather than C# ones. In one respect it provides more, because the Unescape method processes an escape code not found in C#: the \c sequence. When followed by a letter of the alphabet, this sequence inserts that "control code" into the string. For example, \cI inserts a Control-I, better known as the tab character, into the result:

MessageBox.Show(Regex.Unescape(
  "To apply changes:\r\n" &
  "\u2022\cISave your work\r\n" &
  "\u2022\cIRestart the program."))

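To confirm that \cI and the tab character are one and the same, a short sketch like this compares the two directly. The module and variable names are illustrative only:

```vbnet
Imports System
Imports Microsoft.VisualBasic
Imports System.Text.RegularExpressions

Module ControlCodeDemo   ' Hypothetical module name
    Sub Main()
        Dim viaControlCode As String = Regex.Unescape("\cI")
        Dim viaTabEscape As String = Regex.Unescape("\t")

        ' ----- Control-I is character 9, the same as vbTab and \t
        Console.WriteLine(viaControlCode = vbTab)          ' True
        Console.WriteLine(viaControlCode = viaTabEscape)   ' True
        Console.WriteLine(AscW(viaControlCode))            ' 9
    End Sub
End Module
```

The sequence consumes only the single letter after \c, which is why the "S" in "\cISave" survives intact in the earlier message box example.
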
Now which language has escape-sequence envy?

About the Author

Tim Patrick has spent more than thirty years as a software architect and developer. His two most recent books on .NET development -- Start-to-Finish Visual C# 2015, and Start-to-Finish Visual Basic 2015 -- are available from http://owanipress.com. He blogs regularly at http://wellreadman.com.
