To Parse or Not To Parse -- Visual Studio Magazine

DevSmart

How to live a C++ life in an XML world.

By Stephane Raynaud
06/15/2008

Extensible Markup Language (XML) is everywhere, from configuration files for everyday software applications to advanced information-exchange technologies. Even car navigation systems use it. So how can native coders exist in this world and leverage the data in XML documents?

Parsers have allowed developers to read and write XML from their preferred programming languages. These access methods are a strength of XML: they spare developers from writing their own code to get serialized XML data into object-oriented applications. In my career, I've used them all: DOM, SAX and Pull parsers, among others.

Slow Process
These low-level XML access methods work to a limited extent, but parsers can be tedious and prone to error.

In my experience, I've always found it labor-intensive to teach a DOM parser how to process a document. I liken it to teaching kindergarteners how to read, except for the fact that every time you begin a new book you have to start over from scratch. The same applies to a SAX parser. I've avoided it by refraining from creating or using large XML documents.

As a result, XML parsers have always given me an uneasy feeling because of the work involved in putting them into operation in C++. On the proverbial Friday night, a few hours from weekend freedom, I might be tempted to give up on XML and a parser for my I/O work and find instant salvation in a few STL streaming operators or, worse, fprintf() calls.

As developers know, writing a program is like writing a murder mystery. When the book is finished, if the publisher wants to change who the murderer is, the author has to rewrite the whole thing to ensure story and plot lines logically lead the reader to the ending of the book.

Software development follows a similar pattern. If the demands of customers and requirements change, the C++ data structures must reflect those changes. In this demanding marketplace with cost and efficiency pressures, how can the C++ developer keep up?

Mapping Documents to Classes
The answer is XML binding, which maps XML documents to C++ objects created to represent the elements or data in the document, and also does the reverse: converting C++ objects to XML. XML binding is usually thought of as one more way to read or write XML. While true, that's a very limiting definition. For the C++ developer, XML binding can enable development efficiency and modernization. It allows the C++ developer to cope with constantly evolving requirements, create C++ classes quickly and expand the possibilities of what C++ can solve.

XML binding, based on XML schemas, marshals C++ classes to XML.

XML binding is based on XML schemas, a simple file specification that identifies "objects" and their contents. If XML is a prominent part of your C++ class development (that is, if you specify your classes using XML schemas), XML binding will allow you to convert your C++ class to XML, a process called marshalling. Unmarshalling is the reverse: mapping XML to C++ classes.
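
To make the two directions concrete, here is a minimal hand-written sketch of the kind of class a binding tool might produce. Everything in it is an assumption made for illustration: the `sketch` namespace, the `owner` and `cash` fields, the element names and the naive tag lookup are hypothetical and do not reflect any particular binder's generated code. A real binder would derive the class from a schema and handle escaping, attributes and nesting.

```cpp
#include <cassert>
#include <sstream>
#include <string>

namespace sketch {  // hypothetical namespace, not a real binder's output

// Stand-in for a generated class with two simple child elements.
class Portfolio {
public:
    Portfolio() : cash_(0) {}
    Portfolio(const std::string& owner, long cash) : owner_(owner), cash_(cash) {}

    const std::string& owner() const { return owner_; }
    long cash() const { return cash_; }

    // Marshalling: object -> XML text (no escaping of special characters).
    std::string marshal() const {
        std::ostringstream out;
        out << "<portfolio><owner>" << owner_ << "</owner><cash>"
            << cash_ << "</cash></portfolio>";
        return out.str();
    }

    // Unmarshalling: XML text -> object. Leaves the object unchanged on failure.
    bool unmarshal(const std::string& xml) {
        std::string owner, cashText;
        if (!extract(xml, "owner", owner) || !extract(xml, "cash", cashText))
            return false;
        std::istringstream in(cashText);
        long cash = 0;
        if (!(in >> cash)) return false;
        owner_ = owner;
        cash_ = cash;
        return true;
    }

private:
    // Naive lookup of <tag>...</tag>; only adequate for this flat example.
    static bool extract(const std::string& xml, const std::string& tag,
                        std::string& out) {
        const std::string open = "<" + tag + ">";
        const std::string close = "</" + tag + ">";
        const std::string::size_type begin = xml.find(open);
        if (begin == std::string::npos) return false;
        const std::string::size_type start = begin + open.size();
        const std::string::size_type end = xml.find(close, start);
        if (end == std::string::npos) return false;
        out = xml.substr(start, end - start);
        return true;
    }

    std::string owner_;
    long cash_;
};

// Round trip: marshal one object, unmarshal into another, compare the fields.
inline void roundTripDemo() {
    const Portfolio original("alice", 1200);
    Portfolio reloaded;
    const bool ok = reloaded.unmarshal(original.marshal());
    assert(ok && reloaded.owner() == "alice" && reloaded.cash() == 1200);
    (void)ok;  // silence unused-variable warnings when asserts are disabled
}

}  // namespace sketch
```

The point of the sketch is the shape of the interface: the caller sees only `marshal()` and `unmarshal()`, while the document-specific mapping lives inside the class, where a binder would generate it.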

For example, suppose your tech lead comes to you wanting to save or load data from XML. Regardless of when XML becomes a requirement, a single call marshals your C++ class to XML, and a few more lines write the result to a file:

// Write the marshalled XML to a file.
std::ofstream ostrm("C:\\temp\\portfolio.xml");
ostrm << myPortfolio.marshal();
ostrm.close();

If "read XML" is also a requirement, the code may look like this:

// Read the whole file into a string, then unmarshal it.
tns::Portfolio myPortfolioReloaded;
std::ifstream istrm("C:\\temp\\portfolio.xml");
std::stringstream buffer;
buffer << istrm.rdbuf();
std::string xmlContents(buffer.str());
myPortfolioReloaded.unmarshal(xmlContents);

With these very simple steps, C++ becomes XML-enabled and can now fit many programming patterns. It's an alternative to the traditional parsing methods; you can produce more powerful code with less-tedious bridging, and you have more flexibility and speed in what you can offer when app requirements change.
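
The write and read snippets above do not check whether the file actually opened. One way to tighten them, sketched here with standard library streams only, is to wrap the stream boilerplate in two small helpers. The names `writeTextFile`, `readTextFile` and `fileRoundTripDemo` are hypothetical and are not part of any binding product.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Write text to a file; returns false if the file cannot be opened or written.
inline bool writeTextFile(const std::string& path, const std::string& text) {
    std::ofstream out(path.c_str(), std::ios::binary);
    if (!out) return false;
    out << text;
    out.close();
    return !out.fail();
}

// Read a whole file into `text`; returns false if the file cannot be opened.
inline bool readTextFile(const std::string& path, std::string& text) {
    std::ifstream in(path.c_str(), std::ios::binary);
    if (!in) return false;
    std::stringstream buffer;
    buffer << in.rdbuf();  // same whole-file idiom as the snippet above
    text = buffer.str();
    return true;
}

// Example: save XML text, load it back, and confirm nothing changed.
inline void fileRoundTripDemo(const std::string& path, const std::string& xml) {
    std::string loaded;
    const bool ok = writeTextFile(path, xml) && readTextFile(path, loaded);
    assert(ok && loaded == xml);
    (void)ok;  // silence unused-variable warnings when asserts are disabled
}
```

With a bound class, the string passed to `writeTextFile` would be the result of `marshal()`, and the string filled in by `readTextFile` would be handed to `unmarshal()`.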

Automation
XML binders can help you automate code generation and develop C++ classes with a powerful programming interface. This type of tooling allows you to build complex business logic in C++ much more easily. However, some XML-processing solutions can get you lost in the weeds by generating C++ code with three levels of C pointer intricacy and no documentation. You should look for binders that provide documentation, STL-based interfaces, makefiles, Visual Studio projects and extremely readable code.
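
As an illustration of what an STL-based interface buys you, the sketch below shows business logic written against a class that exposes its repeated child elements as a `std::vector`. The `Position` type, its fields and the `totalValue` function are assumptions made up for this example; a real binder's generated names and accessors will differ.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

namespace sketch {  // hypothetical namespace, not a real binder's output

// Stand-in for a generated class describing one holding.
struct Position {
    std::string symbol;
    int shares;
    double price;
    Position(const std::string& s, int n, double p)
        : symbol(s), shares(n), price(p) {}
};

// Stand-in for a generated class whose repeated child element is exposed
// as a standard container instead of raw pointers.
class Portfolio {
public:
    std::vector<Position>& positions() { return positions_; }
    const std::vector<Position>& positions() const { return positions_; }

private:
    std::vector<Position> positions_;
};

// Business logic written against the STL interface: no manual memory
// management and no pointer arithmetic.
inline double totalValue(const Portfolio& portfolio) {
    double total = 0.0;
    const std::vector<Position>& items = portfolio.positions();
    for (std::size_t i = 0; i < items.size(); ++i)
        total += items[i].shares * items[i].price;
    return total;
}

// Example usage with two made-up holdings.
inline void totalValueDemo() {
    Portfolio p;
    p.positions().push_back(Position("AAA", 10, 2.5));
    p.positions().push_back(Position("BBB", 4, 10.0));
    assert(totalValue(p) == 65.0);
}

}  // namespace sketch
```

Because the container owns its elements, code like `totalValue` stays short and readable, which is the property to look for when evaluating a binder's generated interface.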

To be clear, I'm not suggesting automating a full development process, or specifying a whole project with XML schemas. C++ is cool, too. The focus is to use XML binding as another way to easily specify and develop C++ classes.

About the Author

Stephane Raynaud is a senior architect at Rogue Wave Software Inc.
