Database Design

Use DataSets as Business Objects

Enhance the DataSet class with inheritance and extend its associated XML schema to hook validation checks and custom validation routines into DataSets declaratively.

P>Technology Toolbox: C#, XML

In a service-oriented application based on the application/domain architectural model, lightweight business objects are responsible for encapsulating their data to trigger proper field-level validation and validation of the whole set of data the business object manages (see the sidebar, "Understand Design Patterns"). Unfortunately, .NET's primary candidate for holding your business data—the ADO.NET DataSet—doesn't provide a natural way to encapsulate and protect the data it contains. The DataSet triggers events when data changes, but it delegates validation responsibilities to outer classes, thereby breaking the principles of encapsulation and self-containment. "Naked" DataSets might be acceptable for some simple applications, but larger applications require a better container.

The mainstream approach to this issue is to define your own business objects. This typically implies the definition of a layer super-type and a custom framework that provides low-level services, especially for Object-Relational (O-R) mapping. This solution has several benefits, but it has drawbacks too. First, it requires you to renounce the many tools and wizards in VS.NET that work on the DataSet data type—a high price to pay in terms of productivity. Second, you must develop your own O-R mapping routines instead of using the .NET Framework's out-of-the-box mapping tool—the DataAdapter class.

This article explores the easiest way to extend the DataSet class to make it protect its own data: augmenting the DataSet type with standard object-oriented programming and some runtime metadata-based techniques. This enables you to use both standard custom-validation code and a declarative approach to hook validation modules automatically into the DataSet object. Both types of code trigger automatically when the data is modified or saved.

You might wonder if extending typed DataSets is a viable solution for data encapsulation. Unfortunately, it isn't. The whole typed DataSet is regenerated after each schema change, so writing validation code within the typed field-setter methods isn't a realistic option. Also, even if you manage to protect typed setter methods, setter methods inherited from the DataRow base type aren't overridable, so you could bypass validation checks by setting values through indexers, as you do with untyped DataSets.

The approach I'll describe is based on the implementation of a DataSet-derived class that receives notifications of data changes from the DataSet it inherits from. The relevant notifications are DataTable's ColumnChanging event for single-field validation and DataTable's RowChanging event for row-wide validation. (The RowChanging event fires when row changes are committed with the EndEdit method.)

Create a New Class
You start by creating a new class called SmartDS, which inherits from System.Data.DataSet. This class hooks into the ColumnChanging and RowChanging events of all DataTables defined in DataSet. You hook into the DataTableCollection's CollectionChanged event in the SmartDS constructor (instead of iterating at some point through the collection of tables and hoping no tables will be added later):

public class SmartDS1 : DataSet {
	public SmartDS1() {
this.Tables.CollectionChanged += 	 new 
	CollectionChangeEventHandler(
	this.SchemaChangedHook);
}

private void SchemaChangedHook(
	object sender, 	CollectionChangeEventArgs e) {
	...
}

The preceding code guarantees a notification will occur when a table is added to DataSet, no matter how it's added. You hook the SchemaChangedHook method into the DataTable events you're interested in (and unhook it when tables are removed):

private void SchemaChangedHook(object 
	sender, CollectionChangeEventArgs e) 
{
if (e.Action == 
	CollectionChangeAction.Add ) {
if (e.Element is DataTable ) {

//hook into the ColumnChanging Event
((DataTable)e.Element).ColumnChanging 
	+=new DataColumnChangeEventHandler(
	DS_ColumnChanging); 
//hook into the RowChangin Event
((DataTable)e.Element).RowChanging +=new 
	DataRowChangeEventHandler(
	DS_RowChanging);

	}
}
else if (e.Action == 
	CollectionChangeAction.Remove ) {
//unsubscribe from row and column 
//changing events 
	}
}

DS_ColumnChanging and DS_RowChanging are the two functions that receive ColumnChanging and RowChanging events, respectively.

You must inherit your application's DataSet from SmartDS in order to augment your DataSets with robust validation capabilities. The SmartDS base class exposes two protected virtual methods—OnColumnChanging and OnRowChanging, which you call within the DS_ColumnChanging and DS_RowChanging methods, respectively:

private void DS_ColumnChanging(object 
	sender, DataColumnChangeEventArgs e) 
{
OnColumnChanging(sender, e);
}

private void DS_RowChanging(object 
	sender, DataRowChangeEventArgs e) {
OnRowChanging(sender, e);
}

virtual protected void 
	OnColumnChanging(object sender, 
	DataColumnChangeEventArgs e) {
//Inherited class will override this 
//method to apply 
//validation rules at field level
}

virtual protected void 
	OnRowChanging(object sender, 
	DataRowChangeEventArgs e) {
//Inherited class will override this 
//method to apply 
//validation rules at row level
}

DS_ColumnChanging and DS_RowChanging methods do nothing. They exist to let SmartDS subclasses override them. This technique pushes the events up to the actual business DataSet, where you can write your specific validation logic.

The SmartDS class also publishes a protected, overridable Validate method. You call it before the DataSet is persisted to the database in order to perform DataSet-wide validation. SmartDS subclasses override this method to provide specific validation logic (see Figure 1).

For example, this business DataSet inherits from SmartDS:

public class DSOrders1 : SmartDS1 {
	protected override void 
	OnColumnChanging(object sender,
	DataColumnChangeEventArgs e) {
if(e.Column.ColumnName == "OrderDate" && 
	(DateTime)(e.ProposedValue) < 
	DateTime.Now)
	throw new Exception(
	"Invalid value for OrderDate");
}

protected override void 
	OnRowChanging(object sender,
	System.Data.DataRowChangeEventArgs 
	e) {
	//row level validation code
	}
}

Hook Into Metadata
ADO.NET uses XML schema standards to describe the DataSet structure. I'll show you how you can use the DataSet schema to hook validation routines dynamically into a business DataSet at run time.

You can provide the schema to the DataSet from an external file before you load the data into the DataSet. Otherwise, ADO.NET infers the XML schema from the incoming data and generates the schema on the fly. The first option is faster, because you skip the infer process while loading the data. You can also generate the XML schema with the VS.NET built-in DataSet designer: You define the DataSet schema graphically by dropping tables from the Server Explorer and linking them with the appropriate relations.

A direct one-to-one mapping from database tables to DataSet tables is impossible in a complex scenario. You can't simply drag and drop; however, there's no constraint on how a DataSet can differ from the underlying database tables. The more they differ, the more you must develop ad hoc DataAdapter command objects.

XML's extensible nature makes it tempting to try to add custom information to the DataSet XML schema for hooking validation routines into the business DataSet at run time. Unfortunately, neither ADO.NET nor XML classes let you grab the XML schema element associated with a DataRow or DataTable instance easily. However, you have a way out. DataSet, DataTable, DataColumn, and DataRelation expose a property named ExtendedProperties, which returns a HashTable. You can set ExtendedProperties at run time and at design time. You must insert them into the DataSet XML schema to set them at design time. ExtendedProperties you set on the XML schema at design time are available at run time.

You can use this extensibility point to define a grammar within the XML schema to declare business-object-validation requirements at field, row, table, and DataSet level (see Listing 1). Field- and row-validation rules trigger automatically (as you've seen already), and table- and DataSet-validation routines execute when the SmartDS' Validate method is called.

You can hook both "hard-coded" validations, such as regular expressions, and generic validation components whose names are declared in the XML schema. You can load and invoke them dynamically at run time by using their class and assembly names, because they implement a specific interface that acts as a contract between the validation modules and the SmartDS class.

The schema in Listing 1 includes a couple of basic validation rules: a "Field can't be null" rule attached to the CompanyName field, and a regular-expression rule attached to the Email field. Also, a couple of validation classes are attached to the Customer table—one for row-level validation and one for table-level validation.

You can change SmartDS' OnColumnChanging method to hook validation modules according to the directives you define in the ExtendedProperties HashTable (see Listing 2). This article's sample code includes additional related implementations.

OnColumnChanging is a long method, but you need to write it only once—in the SmartDS base class. Remember that subclasses can override this method, typically to add some more validation code to the metadata-based validation code. Subclasses must call the base-class method when they're done to trigger metadata-based validations (unless they have a good reason not to).

This code shows how a business DataSet overrides one of the validation methods:

protected override void 
	OnColumnChanging(object sender, 
	DataColumnChangeEventArgs e) {
//do some validation logic here 
//then call metadata based validation 
//logic
	base.OnColumnChanging (sender, e);
}

You can cache validation modules into something like a static HashTable, because they're stateless. This improves performance, because they're created only once during application execution.

Weigh the Pros and Cons
Using augmented ADO.NET DataSets as business objects provides a number of benefits. First, you can take advantage of ADO.NET's built-in O-R mapping support, which lets you use the DataAdapter class to feed and persist DataSets from the database. Second, you retain full data-binding support. Although custom classes data binding has improved significantly in .NET, taking full advantage of all data-binding-related functionality—especially in rich-client data binding—requires extra effort you don't need to expend with the augmented DataSet. Third, you can take advantage of VS.NET's DataSet-design tools to generate XML-schema-based metadata for DataSets. You have no support for graphical mapping to table fields when you develop custom business objects. Finally, you can implement metadata-based business rules. Custom business objects can use a similar metadata-based approach. However, you must roll your own procedures for metadata generation and loading metadata into the business object, and the algorithm that extracts the metadata information from each node of the data structure. In contrast, DataSet's built-in support for XML schemas at both design time and run time lets you add custom metadata to the VS.NET-provided XML schema and extract the metadata from the ExtendedProperties property at the appropriate time.

The augmented-DataSet approach has its drawbacks too. First, it causes remoting overhead. Transmitting DataSets over process or network boundaries is inefficient. The DataSet memory footprint is large, even when serialized over a .NET Remoting binary channel (because the serialization uses an XML representation). Custom serializable business objects are significantly more efficient. Second, the technique involves untyped programming. You must access business-object fields by position or by keys, so your productivity diminishes because you can't use IntelliSense. The compiler can't help you find programming bugs and misspellings when it accesses data fields; you can trap these errors only at run time. Data returned from the DataSet is loosely typed (the most generic Object type), which can lead to subtle but fatal coding errors you can't detect at compile time. You can address these issues by defining enumerators and string constants with a proper naming standard. This lets you pick up the correct constant quickly to pass to the DataSet, related objects, and indexer accessors without the risk of misspellings. Of course, you must keep enumerators and constants in sync with the DataSet schema, so a tool that generates constants from the XML schema automatically is a requirement for developing a robust application.

A third drawback to the augmented-DataSet approach is its fixed-object structure. The business object's structure is tied to the DataSet's. Moreover, some crucial methods in the underlying objects, such as DataTable and DataRow, aren't virtual, which prevents you from placing further hooks into the DataSet model. As a result, you can't take advantage of some design patterns that require a different kind of object collaboration. However, this issue is mitigated by the fact that most of these patterns apply to the design and implementation of the behavioral part of the business layer, which you move to other classes.

The approach I've described tries to strike a balance between development speed and application robustness. You could improve productivity by developing custom tools—for example, one that inserts custom metadata into the XML schema, or one that generates the string constants defining field and table names from a DataSet schema. The DataSet-based approach might not fit well into highly sophisticated solutions, but small- to middle-sized enterprise applications can no doubt benefit from it.

=
comments powered by Disqus

Featured

  • Get Started Using .NET Aspire with SQL Server & Azure SQL Database

    Microsoft experts are making the rounds educating developers about the company's new, opinionated, cloud-ready stack for building observable, production ready, distributed, cloud-native applications with .NET.

  • Microsoft Revamps Fledgling AutoGen Framework for Agentic AI

    Only at v0.4, Microsoft's AutoGen framework for agentic AI -- the hottest new trend in AI development -- has already undergone a complete revamp, going to an asynchronous, event-driven architecture.

  • IDE Irony: Coding Errors Cause 'Critical' Vulnerability in Visual Studio

    In a larger-than-normal Patch Tuesday, Microsoft warned of a "critical" vulnerability in Visual Studio that should be fixed immediately if automatic patching isn't enabled, ironically caused by coding errors.

  • Building Blazor Applications

    A trio of Blazor experts will conduct a full-day workshop for devs to learn everything about the tech a a March developer conference in Las Vegas keynoted by Microsoft execs and featuring many Microsoft devs.

  • Gradient Boosting Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the gradient boosting regression technique, where the goal is to predict a single numeric value. Compared to existing library implementations of gradient boosting regression, a from-scratch implementation allows much easier customization and integration with other .NET systems.

Subscribe on YouTube