Practical .NET

Creating Complex XML Documents with XML Literals

If you're creating an XML document and want to write code that you'll actually be able to maintain … well, it may be time to learn a little Visual Basic, just so you can use XML Literals. It's a good solution even for C# shops.

The Microsoft .NET Framework includes multiple tools for generating XML documents, but the tools suffer from one failing: If the XML document is at all complicated then it's difficult to write code that's easy to read, easy to understand, and easy to modify or extend as requirements change. And, by "difficult," I mean, "impossible." By the time you write the code to create elements and attributes and then append them to each other, it's hard to see the document in the code. As requirements change and you need to alter the document, you're often reduced to altering the code and then running it to find out what the resulting document looks like.

There's a tool for creating complex XML documents that does give you readable, easily maintainable code: XML Literals. Unfortunately, XML Literals are only available in Visual Basic. But if you're a C# programmer, don't stop reading yet. There are three good reasons for using XML Literals even if you're a purely C# shop. First, in my opinion, the benefits of using XML Literals are sufficiently worthwhile that they'll offset the costs of reading a language with which you're not completely familiar. Second, as you'll see, there isn't much Visual Basic code required: Most of what goes into your code looks like XML rather than Visual Basic. Finally, while XML Literals are Visual Basic-specific, the XElement that you'll create in this code is completely interoperable. So, even in a pure C# shop, creating a Visual Basic class library that exploits XML Literals and returns an XElement to your C# application is a perfectly reasonable option.

The closest you can get in C# to XML Literals is to pass a literal string to the XElement Parse method. However, concatenating strings is a poor substitute for XML Literals, which give you many of the benefits you associate with writing code: IntelliSense to prompt you for your closing tags and Visual Studio to flag validation errors as you type in your code, for example. XML Literals also provide a clean way to integrate variable data into your XML document.
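To make that comparison concrete, here's a minimal sketch that builds the same small document both ways. It's written in Visual Basic so the two approaches can sit side by side; the module and function names are invented for illustration, and the embedded expression syntax it uses is covered later in this article.

```vb
Imports System
Imports System.Xml.Linq

Module ParseVersusLiteral
    ' String approach: the document is assembled by concatenation, so
    ' the compiler can't check that the tags are balanced.
    Function BuildWithParse(custId As String) As XElement
        Return XElement.Parse(
            "<Customer>" &
            "<CustomerId>" & custId & "</CustomerId>" &
            "</Customer>")
    End Function

    ' XML Literal approach: the compiler checks the markup, and the
    ' value is inserted as text rather than spliced into the markup.
    Function BuildWithLiteral(custId As String) As XElement
        Return <Customer>
                   <CustomerId><%= custId %></CustomerId>
               </Customer>
    End Function

    Sub Main()
        Console.WriteLine(BuildWithParse("A123"))
        Console.WriteLine(BuildWithLiteral("A123"))
    End Sub
End Module
```

One practical difference: a value containing an ampersand (such as "A&B") makes the Parse version throw, because the concatenated string is no longer well-formed XML, while the literal version treats the value as plain text.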

The Beauty of XML Literals
The most obvious example of the benefits of XML Literals is the ability to generate a simple document: You just declare a variable as an XElement and paste in a sample of your XML document to initialize the variable (you'll need a reference to System.Xml.Linq to use XElement). This example uses an XML document with a default namespace and single root element containing some text (notice the complete absence of any quotation marks or any other delimiter, other than what XML requires):

Dim elm As XElement 
elm = <Customer xmlns='http://www.phvis.com/Customer'
                xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        A123
      </Customer>

The document is completely visible in your code: You can literally see the document you're generating. If requirements change and, for example, the data is now to be enclosed in a CustomerId element, you can just add the new element to the document embedded in your code:

elm = <Customer xmlns='http://www.phvis.com/Customer'
                xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        <CustomerId> 
          A123
        </CustomerId>
      </Customer>

One caveat: Much of the space you see inside the CustomerId element in this code will be included in the resulting XML document. You'll probably want to omit that whitespace by writing the document like this:

elm = <Customer xmlns='http://www.phvis.com/Customer'
                xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        <CustomerId>A123</CustomerId>
      </Customer>

Inserting Simple Data
Of course, a real document will have variable data in it. My data inside the CustomerId element will probably change from one document to another, for example. To insert a value into an XML Literal you just need to add an expression that will evaluate to the right value.

To flag an expression to be evaluated in a document created with XML Literals, you must wrap the expression in a set of delimiters familiar to ASP.NET developers: <%= … %>. Assuming the value to go inside my CustomerId element is held in a variable called CustId, the code that would insert that value would look like this:

elm = <Customer xmlns='http://www.phvis.com/Customer'
                xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        <CustomerId>
          <%= CustId %>
        </CustomerId>
      </Customer>

Unlike my previous hardcoded example, with this code, the whitespace inside the CustomerId element will be omitted in the XML document. As with the original documents, as requirements change, altering the expression is both obvious and easy to do.
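If you want to see the whitespace difference for yourself, a small sketch like the following should do it. The module and function names are invented for illustration; the brackets in the output are only there to make any leading or trailing whitespace visible.

```vb
Imports System
Imports System.Xml.Linq

Module WhitespaceSketch
    ' Hardcoded text: the line breaks and indentation around A123
    ' are part of the element's text.
    Function Hardcoded() As XElement
        Return <CustomerId>
                   A123
               </CustomerId>
    End Function

    ' Embedded expression: the whitespace around the expression
    ' is dropped, leaving only the inserted value.
    Function Inserted(custId As String) As XElement
        Return <CustomerId>
                   <%= custId %>
               </CustomerId>
    End Function

    Sub Main()
        Console.WriteLine("[" & Hardcoded().Value & "]")
        Console.WriteLine("[" & Inserted("A123").Value & "]")
    End Sub
End Module
```

Checking the Value property rather than the output of ToString keeps the comparison free of any indentation that ToString adds when it formats the document.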

There is, however, one significant limitation to the code you can insert into an XML Literals document. While ASP.NET supports multiple delimiters that go beyond just integrating expressions, XML Literals do not. The only thing you can include in your XML Literal is an expression: Visual Basic code that evaluates to a single value. Fortunately, expressions turn out to be all you need.

For example, if you need to do more complex processing to generate the value, you can call a method that returns a string to supply the value in the element (and that function could be in a C# library). This example assumes that I have a method called GetCustomerId in a class called CustomerManagement that, when passed a customer name, will return the value that's required in the document:

Dim cm As New CustomerManagement
elm = <Customer xmlns='http://www.phvis.com/Customer'
                xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'>
        <CustomerId>
          <%= cm.GetCustomerId(CustName) %>
        </CustomerId>
      </Customer>

The same pattern works with attributes. This example inserts a value into an attribute (XML Literals will supply the quotation marks around the value):

elm = <Address Type=<%= Address.AddressType %>/>

Handling Optional Elements and Attributes
Of course, real XML documents aren't as simple as my previous example. Typical schemas for XML documents include optional elements, flagged in the schema with a minOccurs attribute set to 0. In this example, the Addresses element can contain two elements, BillingAddress and ShippingAddress (in that order). The BillingAddress must be present, but the ShippingAddress is optional and can be omitted:

<xs:complexType name="Addresses">
  <xs:sequence>
    <xs:element ref="BillingAddress"/>
    <xs:element ref="ShippingAddress" minOccurs="0"/>
  </xs:sequence>
</xs:complexType>

XML Literals provide two ways to handle optional elements. One solution is to include an inline expression that generates the element with its content when the optional element is required and skips the element when it isn't. The tool to use in your inline expression is the IIf function ("Immediate If"); the closest C# equivalent is the ?: operator. The IIf function takes three arguments: a test, a value to return when the test is true and a value to return when the test is false. Unlike the C# operator, IIf is an ordinary function, so all three arguments are required and both values are evaluated before IIf picks one to return.

This IIf example returns an element with a hardcoded value when the property HasShippingAddress is true:

<%= IIf(cust.HasShippingAddress = True,
  <ShippingAddress>53 St. Patrick Street</ShippingAddress>, Nothing) %>

This wouldn't be much help except for another feature of XML Literals: You can nest expressions within other expressions. Upgrading the previous example to return the value of a ShippingAddress property just requires adding an expression that returns the ShippingAddress property inside the ShippingAddress element (I've also eliminated the redundant test against True):

<%= IIf(cust.HasShippingAddress,
  <ShippingAddress><%= cust.ShippingAddress %></ShippingAddress>, Nothing) %>

Integrating this code into an XML document would give code like this:

Dim cust As New Customer("A123")
elm = <Addresses>
        <BillingAddress>
          <%= cust.BillingAddress %>
        </BillingAddress>
        <%= IIf(cust.HasShippingAddress,
          <ShippingAddress><%= cust.ShippingAddress %></ShippingAddress>, Nothing) %>
      </Addresses>
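Because IIf is a regular function call, both of its value arguments are evaluated even when only one is used. Visual Basic also has an If operator that evaluates only the branch it needs, and it should drop into the same spot in the literal. Here's a minimal sketch, assuming a simplified Customer class (the class below is a hypothetical stand-in, not the Customer class used elsewhere in this article):

```vb
Imports System
Imports System.Xml.Linq

' Hypothetical stand-in for the article's Customer class.
Public Class Customer
    Public Property BillingAddress As String
    Public Property ShippingAddress As String

    Public ReadOnly Property HasShippingAddress As Boolean
        Get
            Return Not String.IsNullOrWhiteSpace(ShippingAddress)
        End Get
    End Property
End Class

Module OptionalElementSketch
    Function BuildAddresses(cust As Customer) As XElement
        ' The If operator only builds the ShippingAddress element when
        ' the test is true; a Nothing result is left out of the document.
        Return <Addresses>
                   <BillingAddress><%= cust.BillingAddress %></BillingAddress>
                   <%= If(cust.HasShippingAddress,
                          <ShippingAddress><%= cust.ShippingAddress %></ShippingAddress>,
                          Nothing) %>
               </Addresses>
    End Function

    Sub Main()
        Dim billingOnly As New Customer With {.BillingAddress = "163 Shore Drive"}
        Dim both As New Customer With {.BillingAddress = "163 Shore Drive",
                                       .ShippingAddress = "53 St. Patrick Street"}
        Console.WriteLine(BuildAddresses(billingOnly))
        Console.WriteLine(BuildAddresses(both))
    End Sub
End Module
```

For a simple property read, the difference between IIf and If is mostly academic; it matters more when the value for the optional element comes from a method call that's expensive or that fails when the data is missing.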

Personally, I like this solution because the code for including or omitting the element is right in the XML document. However, I can see that some of the readability that I value in XML Literals has been lost. Furthermore, if an optional element contains other optional elements, the resulting nested IIfs can create code that even I would admit is difficult to read.

An alternative solution that avoids that danger is to include the optional element in its own XElement as this code does:

Dim ShippingAddressElement As XElement
If cust.HasShippingAddress Then
  ShippingAddressElement = <ShippingAddress>
      <%= cust.ShippingAddress %>
     </ShippingAddress>
Else
  ShippingAddressElement = Nothing
End If

XML Literals automatically ignore XElements set to Nothing when generating the document. As a result, this example will omit the ShippingAddress from the document when ShippingAddressElement is set to Nothing:

elm = <Addresses>
        <BillingAddress>
          <%= cust.BillingAddress %>
        </BillingAddress>
        <%= ShippingAddressElement %>
      </Addresses>

In addition to optional elements, a document can have optional attributes. Here, I think that the most readable/maintainable solution is to use an XAttribute variable that holds a complete XML attribute (both the attribute name and its value). This example creates an XAttribute variable by passing a name and value to the XAttribute constructor when there's data present in the AddressType property of the CustomerAddress object pulled from a Customer object's Addresses collection:

Dim TypeAttribute As XAttribute
Dim Address As CustomerAddress
Address = cust.Addresses.First
If String.IsNullOrWhiteSpace(Address.AddressType) = False Then
  TypeAttribute = New XAttribute("Type", Address.AddressType)
Else
  TypeAttribute = Nothing
End If

Now I can include the XAttribute in the document. As with XElement, if the XAttribute variable is set to Nothing it will be omitted from the document:

elm = <Address <%= TypeAttribute %>/>

Repeating Elements, Dynamic XML Elements and XML Directives (Oh My!)
In addition to optional elements, XML documents usually contain repeated elements. In a schema, these are elements with a maxOccurs attribute greater than 1. This example specifies that an unlimited number of Address elements can appear within an Addresses element:

<xs:complexType name="Addresses">
  <xs:sequence>
    <xs:element ref="Address" maxOccurs="unbounded"/>
  </xs:sequence>
</xs:complexType>

To generate the equivalent document using XML Literals, you use a LINQ statement (which, you'll notice, will be almost identical in both C# and Visual Basic except for the uppercase letters). The LINQ statement in the following code adds to the document an Address element enclosing the value of the Address property for each CustomerAddress object in the Addresses collection:

elm = <Addresses>
        <%= From addr In cust.Addresses
            Select
        <Address Type=<%= addr.AddressType %>>
          <%= addr.Address %>
        </Address> %>
      </Addresses>

A typical result will look like this:

<Addresses>
  <Address Type="Shipping">53 St. Patrick Street</Address>
  <Address Type="Billing">163 Shore Drive</Address>
</Addresses>

Of course, LINQ will only work with homogeneous collections (collections guaranteed to contain only one kind of class). However, as I discussed in a previous tip, you can use the OfType and Cast methods to convert heterogeneous collections into collections with which you can use LINQ.
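As a rough sketch of how OfType fits into an XML Literal, the following code filters an ArrayList that holds a mix of objects down to just the CustomerAddress items before generating the elements. The CustomerAddress class here is a hypothetical stand-in with only the two properties the example needs.

```vb
Imports System
Imports System.Collections
Imports System.Linq
Imports System.Xml.Linq

' Hypothetical stand-in for the article's CustomerAddress class.
Public Class CustomerAddress
    Public Property AddressType As String
    Public Property Address As String
End Class

Module OfTypeSketch
    Function BuildAddresses(items As ArrayList) As XElement
        ' OfType skips anything in the list that isn't a CustomerAddress
        ' and gives LINQ a strongly typed sequence to query.
        Return <Addresses>
                   <%= From addr In items.OfType(Of CustomerAddress)()
                       Select <Address Type=<%= addr.AddressType %>><%= addr.Address %></Address> %>
               </Addresses>
    End Function

    Sub Main()
        Dim items As New ArrayList From {
            New CustomerAddress With {.AddressType = "Shipping", .Address = "53 St. Patrick Street"},
            "not an address",
            New CustomerAddress With {.AddressType = "Billing", .Address = "163 Shore Drive"}}
        Console.WriteLine(BuildAddresses(items))
    End Sub
End Module
```

Cast(Of CustomerAddress) would slot into the same place, but it throws an exception if the list contains anything that can't be converted, so it's the better choice only when you know every item is the right type.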

You're not limited to just adding data to your document's elements and attributes. Where it makes sense you can also use expressions to dynamically set the names of your elements and attributes. This example dynamically generates an element whose name is pulled from a variable called ElementName and which contains an attribute whose name is drawn from a variable called AttributeName:

TypeAttribute = New XAttribute(AttributeName, "Value")
elm = <<%= ElementName %> <%= TypeAttribute %> />

While this is a potentially useful feature, I can't help but feel that it throws away some of the clarity that XML Literals provide.

If you want to go one step higher than an element and include the XML declaration (the <?xml version="1.0"?> directive at the top of a document), you can do that also by using XDocument instead of XElement as this example does (again, note the absence of any delimiters):

Dim xdoc As XDocument
xdoc =  <?xml version="1.0"?>
        <Customer xmlns="http://www.phvis.com/Customer"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
          A123
        </Customer>

Using the Document
Now that you've created your XML document in your XElement, what can you do with it? At the start of this article, I suggested putting this code in a method in a class library that you can call from the language of your choice. The most obvious interoperable choice is to have that method return the XML document as a string. You can do that just by calling the XElement ToString method:

Public Function GenerateXML(cust As Customer) As String
  Dim elm As XElement
  '…create document 
  Return elm.ToString
End Function

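The string that ToString returns has ordinary double quotes around attribute values (that's what the XElement puts there). If you inspect that string in the C# debugger, the quotes will be displayed as escaped (\"), but the backslashes are not part of the string itself.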

However, while XML Literals are limited to Visual Basic, the XElement is not (it's just another .NET data type). That means there's nothing stopping you from returning your XElement from your method:

Public Function GenerateXML(cust As Customer) As XElement
  Dim elm As XElement
  '...create document 
  Return elm
End Function

In addition to using the XElement ToString method to retrieve the XML, the calling program can save the XML to disk using the XElement Save method:

elm.Save("c:\MyXML.XML")

So, here's my point: We all know that more time is spent extending and modifying existing applications than is ever spent creating them (in most shops, 75 percent of development time is spent on existing applications). Creating applications that are easy to read, extend and modify is critical to any development team's success. For my money, if you have to create an XML document in code, you should use XML Literals -- regardless of what language you prefer to program in.
