In-Depth

Store Large Lookup Tables in DataSets

When mobile users need to look up and edit data without direct network connections, persist DataSets of lookup information as local XML files to preserve offline updates.

Technology Toolbox: VB.NET, SQL Server 2000, XML, Visual Studio .NET 2003

A best practice when generating DataSet objects is to minimize the number of records returned by SqlDataAdapter objects. This works for VS.NET applications whose users have direct network connections to the underlying databases. However, this recommendation doesn't apply to seldom-connected mobile users who must look up records and edit data while they're in customers' offices, at pipeline pumping stations, on tankers at offshore terminals, or in other remote locations. Offline updates and inserts must be preserved between device reboots and survive updates that fail from concurrency conflicts.

One approach is to set up merge replication with the Microsoft SQL Server Desktop Engine (MSDE) 2000 on laptops or SQL Server CE on Pocket PCs. An alternative method is to persist large DataSets of lookup information, such as product specs, customer records, recent order data, and pending updates and inserts as local XML files. In this article, I'll describe the benefits and drawbacks of saving, loading, and refreshing locally persisted DataSets that range in size from 250K to more than 20 MB. You can run your own tests with the OakLeaf Consumer Electronics' (OCE) OCETestClient VB.NET project I've included with this article (download the project here). It generates resource consumption and performance metrics, which I'll discuss later.

Caching product data for reference or order-entry/order-editing operations is a common requirement of usually disconnected client apps. In most cases, users download an initial set of catalog information and refresh the data periodically when connected to the database by a VPN. Users without a LAN or VPN connection can consume a simple SQLXML 3.0 stored procedure Web service that returns individual DataSets for product categories and line items or a nested DataSet with categories and items tables. Secure Sockets Layer/Transport Layer Security (SSL/TLS) encryption protects confidential information over a direct client to Web service end-point Internet connection. Alternatively, you can use Web Service Extensions 1.0 or later to encrypt the Simple Object Access Protocol (SOAP) return messages.

Write code similar to this to create a new DataSet (dsProds) from a simple Web method (getProds) that delivers a single product DataSet and persists its schema and data as an XML file:

'Create the DataSet and XML file from an 
'SQLXML3 Web service
Dim prxGetProds As New getProds.ProdWS
dsProds = New DataSet
dsProds = prxGetProds.getProds(0)
dsProds.WriteXml(strFile, _
   XmlWriteMode.WriteSchema)

The OCE_Prods sample database's ProdWS Web service delivers detailed data for 695 consumer electronics products and generates a 195K dsProds.xml file (see Listing A). Invoking the getProdCats Web method creates a 3K dsProdCats.xml file with 16 category codes and descriptions (see Listing B). A dsProds.ReadXml(strFile, XmlReadMode.ReadSchema) statement loads the persisted file into a new DataSet when users start a session. Ordinary users don't update product catalogs, so you needn't be concerned with tag names, primary or foreign keys, and relations. Use integer table and column indexes to populate dropdown lists or DataGrids and avoid the resource consumption of strongly typed DataSets.

Create Editable DataSets
Creating editable DataSets from Web services or by using the SqlDataAdapter.Fill() method requires conforming Web service rowset tag names to the SqlDataAdapter's NewDataSet and removing a spurious Web service attribute (see Listing 1). The OCE_Cust sample database's Customers table uses an identity column as the primary key (see Listing C). These statements initialize the dsCusts DataSet with the updated data and set the PrimaryKey and AutoIncrement properties of the first (CustID) column:

dsCusts.AcceptChanges()
tblCusts = dsCusts.Tables(0)
Dim datCol(1) As DataColumn
datCol(0) = tblCusts.Columns(0)
tblCusts.PrimaryKey = datCol
datCol(0).AutoIncrement = True

Reading large local XML files into DataSets is a surprisingly quick process on clients with fast processors and disk drives. A 10 MB XML file that contains 25,000 customer records loads in less than five seconds under Windows XP SP-1 with a 2.26 GHz Pentium 4 with 512 MB RAM and an Ultra ATA/100 drive. With files of 1 MB or more, DataSet loading time from XML files is directly proportional to file size. Loading and saving a 10 MB local XML file with a SqlDataAdapter requires less than two seconds with a 100 Mbps network connection to the database server. You don't need Connection.Open and Connection.Close statements for data adapters; the DataAdapter.Fill() method opens and closes the associated database connection automatically. Creating the DataSet from a SQLXML 3.0 Web service takes about 40 seconds with a high-speed Internet connection; HTTPS transport increases the loading time to about a minute.

What's more surprising is the big-time DataSet resource consumption: The VS.NET 2003 app's RAM requirement increases by as much as 15 times the size of the XML file. Upgrading the client to the March 2004 VS.NET 2005 Community Technical Preview shows that the Visual Studio development team has made progress in reducing large DataSets' memory footprints; a 10 MB XML file expands to about half the size of the VS.NET 2003 version. Fortunately, the ratio of RAM requirements to XML file size decreases markedly as file sizes increase from 1 MB to 20 MB. You can compare the RAM requirements and local file-loading times for both VS.NET versions (see Table 1).

Offline editing operations require persisting all pending changes to the dsCusts DataSet as an XML file in DiffGram format by creating a temporary dsChanges DataSet and saving it with these statements:

Dim dsChanges As New DataSet
dsChanges = dsCusts.GetChanges()
Dim strFile As String = Application.StartupPath + _
   "\DiffGram.xml"
dsChanges.WriteXml(strFile, _
   xmlWriteMode.DiffGram)

Submitting updates and inserts to the database successfully also returns records updated after the creation date of the local XML file, accepts changes to the dsCusts DataSet, and erases the DiffGram.xml file (see Listing D). The user must persist the DataSet before closing the app if he isn't able to connect to the database and submit his changes (see Figure 1). This action overwrites the existing file with data that doesn't include pending changes. If the DiffGram file exists when the user starts a new session, these statements in the Form_Load event handler add the DiffGram changes to the dsCusts DataSet:

strFile = Application.StartupPath + _
   "\DiffGram.xml"
If File.Exists(strFile) Then
   'There are unsubmitted changes; 
   'add them to the DataSet
   dsCusts.ReadXml(strFile, _
      XmlReadMode.DiffGram)
End If

Persist Edited Data
Concurrency conflicts nullify database updates, so it's important to persist the edited data for subsequent conflict resolution. An OnRowUpdated handler for the daCusts data adapter's RowUpdated event adds rows to the dsEdits DataSet when submitting records that cause concurrency conflicts (see Listing 2). The retrieval of records that caused the concurrency errors lets users apply the saved edits selectively.

The OCETestClient.xsn solution demonstrates typical UIs for offline data display and editing. The default Products tab page has a Product Categories combo box that specifies the dsProds.Tables(0).Select string criterion to create a DataRow array that fills the "Products in Category" combo box. The same criterion generates a filtered DataView to populate the DataGrid with product data (Listing 3). Selecting a product selects the corresponding row, which could be used to insert or replace a line item during an order-processing operation (see Figure 2).

The Customers tab page lets the user search for customer records by an exact ID, any part of an e-mail address, and exact or partial telephone numbers and ZIP codes. Partial searches make use of a seldom-publicized feature of the DataSet.Select() method—you can use SQL's LIKE predicate in the Select string. Visual Studio .NET and most other DataSet code examples demonstrate an exact match. Use ColumnName LIKE '%Match%' to match columns containing Match (the XPath contains function) or ColumnName LIKE 'Match%' beginning with Match (the XPath starts-with function). Users select the record to view from a dropdown list (see Figure 3 and Listing 4). Searching 25,000 to 50,000 customer records with exact or partial values is quick. Exact searches on the primary key complete in less than 16 milliseconds (ms), which is the resolution of the Now.Ticks timer; partial-match searches take less than 48 ms.

Selecting a single customer record enables the Edit Row button, which users click on to set the ReadOnly property of the textboxes to False and enable the Save Row Changes button (see Figure 4). If the user changes the row's data and clicks on Save Row Changes, the Accept All Changes, Reject All Changes, and, if the user has a live LAN or VPN connection, Submit Changes buttons are enabled. Clicking on Submit Changes sends the updates and receives rows that have been updated since the creation date of the local dsCusts.xml file, and disables the three buttons. If an update causes a concurrency violation, a dropdown list of CustID, FirstName, and LastName values appears on the form to let the user retrieve the updated record and decide whether to reattempt the update.

Providing disconnected users with full-featured data display, searching, and editing capabilities requires a lot of handwritten code; preserving updates and failed edit data between sessions adds a few hundred more lines. Loading 10 MB XML files on a Pocket PC might not be practical, but moderate-performance laptops with 512 MB RAM probably can handle up to 10 MB DataSet files without driving users nuts with session startup delays. High-performance clients undoubtedly can accommodate 50 MB local XML files. Download the sample code for this article, attach the OCE_Prod_dat.mdf and OCE_Cust_dat.mdf databases to SQL Server or MSDE 2000, and give the 1,500-line OCETestClient.xsn project a test drive on your development machines.

About the Author

Roger Jennings is an independent XML Web services and database developer and writer. His latest books include "Special Edition Using Microsoft Office Access 2007" (QUE Books, 2007) and "Expert One-on-One Visual Basic 2005 Database Programming" (WROX/Wiley, 2005). He’s also a VSM contributing editor and online columnist and manages the OakLeaf Systems blog. Jennings’ Code of Federal Regulations Web services won Microsoft’s 2002 .NET Best Horizontal Solution Award. Reach him at [email protected].

comments powered by Disqus

Featured

  • Microsoft Revamps Fledgling AutoGen Framework for Agentic AI

    Only at v0.4, Microsoft's AutoGen framework for agentic AI -- the hottest new trend in AI development -- has already undergone a complete revamp, going to an asynchronous, event-driven architecture.

  • IDE Irony: Coding Errors Cause 'Critical' Vulnerability in Visual Studio

    In a larger-than-normal Patch Tuesday, Microsoft warned of a "critical" vulnerability in Visual Studio that should be fixed immediately if automatic patching isn't enabled, ironically caused by coding errors.

  • Building Blazor Applications

    A trio of Blazor experts will conduct a full-day workshop for devs to learn everything about the tech a a March developer conference in Las Vegas keynoted by Microsoft execs and featuring many Microsoft devs.

  • Gradient Boosting Regression Using C#

    Dr. James McCaffrey from Microsoft Research presents a complete end-to-end demonstration of the gradient boosting regression technique, where the goal is to predict a single numeric value. Compared to existing library implementations of gradient boosting regression, a from-scratch implementation allows much easier customization and integration with other .NET systems.

  • Microsoft Execs to Tackle AI and Cloud in Dev Conference Keynotes

    AI unsurprisingly is all over keynotes that Microsoft execs will helm to kick off the Visual Studio Live! developer conference in Las Vegas, March 10-14, which the company described as "a must-attend event."

Subscribe on YouTube