Q&A

Track Changes With XML DataSets

Track changes in files using XML datasets with the DiffGram format. This format lets you track what has changed, and what hasn't.

Technology Toolbox: C#, XML, ADO.NET

Q:
Serialize a DataSet
I'm working on a project that needs to store different versions of the same XML document. My approach has been to read the XML document into a DataSet, then let the user modify the DataSet. Once the user finishes modifying the file, I use the GetChanges method to retrieve the changes and drop them into another DataSet. I store the second DataSet as another XML file, so I can use the DataSet.Merge method to merge it later with the original and get the modified version.

However, I'm having trouble with the GetChanges method. It's not returning the deleted rows, even though it's working fine for the added or the updated rows. GetChanges returns an empty DataSet when I delete any row from the DataSet.

For reference, here's the code I'm using:

ds.ReadXml(
   "http://localhost/Proto/Props.xml");
ds.AcceptChanges();
ds.Tables["LineDetail"].Rows[2].
   Delete();
DataSet ds1;
ds1 = ds.GetChanges(
   DataRowState.Deleted);
ds1.WriteXmlSchema(
   "c:\\ChangedSchema.xml");
ds1.WriteXml("c:\\ChangedDoc.xml");
// ChangedDoc.xml is empty  :-(

A:
There are many ways to serialize a DataSet; sometimes it's simply a matter of determining the best one for what you're trying to do. The approach in your sample, WriteXml, writes the current version of each data element. The DataSet returned by GetChanges() contains only one deleted row, and a deleted row has no current version, so WriteXml has nothing to write and you get no output.
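To see this for yourself, you can ask the deleted row which versions it still holds. The following is a minimal sketch rather than the reader's actual data: the in-memory LineDetail table and its Id and Item columns are stand-ins assumed for illustration.

```csharp
using System;
using System.Data;

public static class DeletedRowVersions
{
    // Builds a small stand-in for the LineDetail table.
    // The Id and Item columns are hypothetical.
    public static DataSet BuildSample()
    {
        var ds = new DataSet("Props");
        DataTable table = ds.Tables.Add("LineDetail");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Item", typeof(string));
        table.Rows.Add(1, "Widget");
        table.Rows.Add(2, "Gadget");
        table.Rows.Add(3, "Sprocket");
        ds.AcceptChanges();
        return ds;
    }

    // Deletes the third row and returns its copy from GetChanges.
    public static DataRow GetDeletedRow()
    {
        DataSet ds = BuildSample();
        ds.Tables["LineDetail"].Rows[2].Delete();
        DataSet changes = ds.GetChanges(DataRowState.Deleted);
        return changes.Tables["LineDetail"].Rows[0];
    }

    public static void Main()
    {
        DataRow deleted = GetDeletedRow();

        Console.WriteLine(deleted.RowState);                            // Deleted
        Console.WriteLine(deleted.HasVersion(DataRowVersion.Current));  // False
        Console.WriteLine(deleted.HasVersion(DataRowVersion.Original)); // True

        // The old values are still reachable through the original version.
        Console.WriteLine(deleted["Item", DataRowVersion.Original]);    // Sprocket
    }
}
```

The row is still in the table and still carries its original values; it just has no current version for the default WriteXml mode to write.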

You've got two different options, depending on your needs. Let's assume you want to capture those rows that are about to be deleted. You could simply reject the changes before you write the file:

ds.ReadXml(
   "http://localhost/Proto/Props.xml");
ds.AcceptChanges();
ds.Tables["LineDetail"].Rows[2].
   Delete();
DataSet ds1;
ds1 = ds.GetChanges(
   DataRowState.Deleted);
// Undo the delete so the row has a
// current version to write.
ds1.RejectChanges();
ds1.WriteXmlSchema(
   "c:\\ChangedSchema.xml");
ds1.WriteXml("c:\\ChangedDoc.xml");
// ChangedDoc.xml now contains
// the restored row.

Rejecting the changes in ds1 restores the current version of the deleted row (Row 2 of the original LineDetail table). The row is current and it exists, so it gets written to the file. This is the preferred way to make a backup of the deleted rows before you delete them. A single call to DataSet.Merge restores the deleted rows.

Unfortunately, calling RejectChanges leaves you with other problems. Your changes have been lost if they include added rows. It's the inverse of the problem you're observing: After you reject changes, the deleted rows are back, but new rows don't exist yet. The same is true of rows you change. Rejecting changes reverts those rows to the previous values, losing all the work.
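A short sketch makes the tradeoff concrete. It assumes the same hypothetical in-memory LineDetail table (Id and Item columns invented for illustration) with one modified row, one deleted row, and one added row, and then rejects everything.

```csharp
using System;
using System.Data;

public static class RejectChangesDemo
{
    // Builds a sample DataSet and applies one edit of each kind.
    // Table contents are hypothetical.
    public static DataSet BuildEdited()
    {
        var ds = new DataSet("Props");
        DataTable t = ds.Tables.Add("LineDetail");
        t.Columns.Add("Id", typeof(int));
        t.Columns.Add("Item", typeof(string));
        t.Rows.Add(1, "Widget");
        t.Rows.Add(2, "Gadget");
        t.Rows.Add(3, "Sprocket");
        ds.AcceptChanges();

        t.Rows[0]["Item"] = "Widget v2"; // modified
        t.Rows[2].Delete();              // deleted
        t.Rows.Add(4, "Flange");         // added
        return ds;
    }

    // Returns the changed rows after every pending change is rejected.
    public static DataTable RejectAll()
    {
        DataSet changes = BuildEdited().GetChanges(); // all pending changes
        changes.RejectChanges();
        return changes.Tables["LineDetail"];
    }

    public static void Main()
    {
        foreach (DataRow row in RejectAll().Rows)
        {
            Console.WriteLine("{0}: {1} ({2})",
                row["Id"], row["Item"], row.RowState);
        }
    }
}
```

Under these assumptions the loop should list only rows 1 and 3, both Unchanged: the deleted row is back, the edit to "Widget v2" has reverted to "Widget", and the added row 4 is gone.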

A different approach lets you see the rows that have changed, regardless of what kind of changes the user has made. The DiffGram format stores two partial copies of the DataSet so you can track changes. One copy of the DataSet stores the current version of all records in each table, including any modified rows, newly added rows, and unmodified rows (see Figure 1).

The second copy of the DataSet stores the previous version of any rows that have changed. These rows are stored in the diffgr:before element of the serialized DataSet. The only copy of any deleted rows is also stored in this section. Deleted rows existed in the "before" picture but not in the "after" picture of the DataSet. Using this approach enables you to save the history of changes to the DataSet, not just a snapshot of its current or previous contents. By saving both versions, the DiffGram format lets you see exactly what changes have been made:

ds.ReadXml(
   "http://localhost/Proto/Props.xml");
ds.AcceptChanges();
ds.Tables["LineDetail"].Rows[2].
   Delete();
DataSet ds1;
ds1 = 
   ds.GetChanges(DataRowState.Deleted);
ds1.WriteXmlSchema(
   "c:\\ChangedSchema.xml");
ds1.WriteXml("c:\\ChangedDoc.xml", 
   XmlWriteMode.DiffGram);
// ChangedDoc.xml has a 
// diffgr:before record.

The DiffGram format preserves all the information you need to transfer the set of changes to another machine, merge them into another DataSet, undo them, or save them for later processing (see Listing 1).
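As a rough illustration of that round trip, the sketch below saves a set of changes as a schema plus a DiffGram, reloads them into a fresh DataSet, and merges them into an untouched copy of the original. It uses in-memory strings instead of files, and the LineDetail table, its Id and Item columns, and the Id primary key are assumptions made for the example (Merge matches rows by primary key, so the sketch defines one).

```csharp
using System;
using System.Data;
using System.IO;

public static class DiffGramRoundTrip
{
    // Hypothetical stand-in for the original document.
    public static DataSet BuildOriginal()
    {
        var ds = new DataSet("Props");
        DataTable t = ds.Tables.Add("LineDetail");
        DataColumn id = t.Columns.Add("Id", typeof(int));
        t.Columns.Add("Item", typeof(string));
        t.PrimaryKey = new[] { id }; // Merge matches rows by key
        t.Rows.Add(1, "Widget");
        t.Rows.Add(2, "Gadget");
        t.Rows.Add(3, "Sprocket");
        ds.AcceptChanges();
        return ds;
    }

    // Modifies one row, deletes one row, and adds one row.
    public static DataSet BuildEdited()
    {
        DataSet ds = BuildOriginal();
        DataTable t = ds.Tables["LineDetail"];
        t.Rows[0]["Item"] = "Widget v2";
        t.Rows[2].Delete();
        t.Rows.Add(4, "Flange");
        return ds;
    }

    // Writes the pending changes as schema + DiffGram, then reloads them.
    public static DataSet SaveAndReloadChanges(DataSet edited)
    {
        DataSet changes = edited.GetChanges();
        var schema = new StringWriter();
        var diffGram = new StringWriter();
        changes.WriteXmlSchema(schema);
        changes.WriteXml(diffGram, XmlWriteMode.DiffGram);

        // A DiffGram carries no schema, so load the schema first.
        var reloaded = new DataSet();
        reloaded.ReadXmlSchema(new StringReader(schema.ToString()));
        reloaded.ReadXml(new StringReader(diffGram.ToString()),
            XmlReadMode.DiffGram);
        return reloaded;
    }

    public static void Main()
    {
        DataSet reloaded = SaveAndReloadChanges(BuildEdited());
        foreach (DataRow row in reloaded.Tables["LineDetail"].Rows)
            Console.WriteLine(row.RowState); // states survive the round trip

        DataSet target = BuildOriginal();
        target.Merge(reloaded);
        target.AcceptChanges();
        foreach (DataRow row in target.Tables["LineDetail"].Rows)
            Console.WriteLine("{0}: {1}", row["Id"], row["Item"]);
    }
}
```

With this sample data, the reloaded rows should come back as Modified, Added, and Deleted, and the merged target should end up with rows 1 ("Widget v2"), 2, and 4.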

The DiffGram format has some extra attributes and elements in addition to the normal DataSet information. For example, note that a diffgr:id attribute is associated with each element. This attribute matches the current and previous version of any changed rows. The first record in the DataSet has been modified: I've changed the last name. You can see the previous last name in the diffgr:before section, in the record that has the matching diffgr:id, 1. You can also see that the first record has a diffgr:hasChanges attribute. The first record was modified, so this attribute has the "modified" value. Now look for the inserted record. It has a value of "inserted" for diffgr:hasChanges. The absence of this attribute indicates that the record wasn't changed.

Next, look at the bottom of the file in the diffgr:before section. Here you find a deleted record's original value, with a diffgr:id of Employees9. There is no diffgr:hasChanges attribute in the diffgr:before section. You can tell it's a deleted record because there's no record in the current version with the diffgr:id value of Employees9.
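You can apply the same reading rules in code. The sketch below is one possible approach using LINQ to XML, not part of the original listing: it reports each current row's diffgr:hasChanges value and treats any diffgr:id that appears only in diffgr:before as a deleted row. The sample DataSet and the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
using System.Xml.Linq;

public static class DiffGramInspector
{
    static readonly XNamespace Diffgr =
        "urn:schemas-microsoft-com:xml-diffgram-v1";

    // Current rows: every element outside diffgr:* that has a diffgr:id.
    static IEnumerable<XElement> CurrentRows(XElement root)
    {
        return root.Elements()
            .Where(e => e.Name.Namespace != Diffgr)
            .Descendants()
            .Where(e => e.Attribute(Diffgr + "id") != null);
    }

    // Maps diffgr:id to diffgr:hasChanges for rows that carry the attribute.
    public static Dictionary<string, string> FindChangedRows(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        return CurrentRows(root)
            .Where(e => e.Attribute(Diffgr + "hasChanges") != null)
            .ToDictionary(
                e => (string)e.Attribute(Diffgr + "id"),
                e => (string)e.Attribute(Diffgr + "hasChanges"));
    }

    // Ids present in diffgr:before but missing from the current rows.
    public static List<string> FindDeletedIds(string xml)
    {
        XElement root = XDocument.Parse(xml).Root;
        var currentIds = new HashSet<string>(
            CurrentRows(root).Select(e => (string)e.Attribute(Diffgr + "id")));

        XElement before = root.Element(Diffgr + "before");
        if (before == null) return new List<string>();

        return before.Elements()
            .Select(e => (string)e.Attribute(Diffgr + "id"))
            .Where(id => id != null && !currentIds.Contains(id))
            .ToList();
    }

    // Hypothetical data: one modified, one deleted, one added row.
    public static string BuildSampleDiffGram()
    {
        var ds = new DataSet("Props");
        DataTable t = ds.Tables.Add("LineDetail");
        t.Columns.Add("Id", typeof(int));
        t.Columns.Add("Item", typeof(string));
        t.Rows.Add(1, "Widget");
        t.Rows.Add(2, "Gadget");
        t.Rows.Add(3, "Sprocket");
        ds.AcceptChanges();
        t.Rows[0]["Item"] = "Widget v2";
        t.Rows[2].Delete();
        t.Rows.Add(4, "Flange");

        var writer = new StringWriter();
        ds.WriteXml(writer, XmlWriteMode.DiffGram);
        return writer.ToString();
    }

    public static void Main()
    {
        string xml = BuildSampleDiffGram();
        foreach (var pair in FindChangedRows(xml))
            Console.WriteLine("{0}: {1}", pair.Key, pair.Value);
        foreach (string id in FindDeletedIds(xml))
            Console.WriteLine("{0}: deleted", id);
    }
}
```

The unchanged row never shows up in either result, which matches the rule above: no diffgr:hasChanges attribute means no change.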

The ADO.NET DataSet is a powerful container. You can, and should, become familiar with the different ways you can use it to transfer information. The DiffGram format is especially useful to indicate all the changes that have been made to a set of records. It provides a convenient way to track any modifications made to your data.

About the Author

Bill Wagner, author of Effective C#, has been a commercial software developer for the past 20 years. He is a Microsoft Regional Director and a Visual C# MVP. His interests include the C# language, the .NET Framework and software design. Reach Bill at [email protected].
