Streamline Mapping With Orcas and LINQ

Use a pair of LINQ Technology Preview add-ins that integrate with .NET 2.0 and VS 2005 to take a look into the future of VS and data.

Technology Toolbox: Visual Basic; C#; SQL Server 2005; XML; Cω compiler and plug-in for VS 2003 from Microsoft Research; Language Integrated Query (LINQ) Technology Previews for .NET 2.0 bits downloaded from Microsoft's LINQ Project site; VB Express, C# Express, VS 2005 Standard or higher (RTM version)

It's a good bet that 80 percent or more of today's business-oriented software projects combine general-purpose programming languages (GPPLs) and database programming languages (DBPLs).

But the ubiquity of database connectivity for Web and smart-client applications or components hasn't reduced the disconnect—often called an impedance mismatch—between GPPLs and DBPLs. Native programming languages that generate, inspect, transform, or otherwise manipulate XML documents, such as XPath, XQuery, and XSLT (XMLPLs), currently aren't as commonly used as DBPLs but exhibit a similar mismatch to GPPLs (see the sidebar, "Cure Data Type Impedance Mismatch With LINQ"). Integration of XML Infosets as the data interchange medium in service-oriented projects is accelerating, so XMLPLs ultimately might rival the popularity of DBPLs, such as ANSI SQL, Microsoft Transact-SQL, and Oracle PL/SQL.

One fact is abundantly clear: Orcas (the code name for the next version of Visual Studio) will significantly change some aspects of how you handle and manipulate data with your applications and components. I'll give you a preview of the .NET data-management enhancements and language extensions you can expect to ship with the forthcoming Orcas version of VS. Most of the early programming examples for these topics and related Professional Developers Conference (PDC) 2005 sessions use C#, so I'll emphasize VB syntax where applicable.

The most important disconnect between GPPLs—such as C#, VB7+, or Java—and DBPLs or XMLPLs is the lack of a single compatible type system. Such a type system must be sufficiently flexible to accommodate the sophisticated language constructs, object orientation, inheritance, polymorphism, and other features of today's GPPLs. The type system also must support DBPLs' relational tables, SQL queries, and nullable data types; and hierarchical XML documents or fragments, elements, and attributes for XMLPLs. .NET 2.0 has three basic type systems: SQL for relational table data types; XSD for XML, XPath, XSLT, and—prior to beta 2—XQuery; and CLR for GPPL requirements. The ADO.NET 2.0 API bridges SQL and CLR data types; SQLXML 4.0 ties SQL to XSD; and System.Xml spans XSD and CLR. Different APIs for each data type cause developers to spend an inordinate amount of time to learn, write, debug, and rewrite brittle plumbing code. The usual culprits that break the pipes are bad SQL query strings, or XML tags or content that don't get checked until run time.

One of the goals of the Orcas release's .NET 3.0, VB 9.0, and C# 3.0 is to simplify database and XML plumbing code dramatically. Related goals are to enable the VB or C# compiler to construct and validate SQL-style queries, XPath-style operators, and XML instances or literals as first-class, CLR-compatible objects. Achieving these goals requires extending the .NET CLR type system and customizing the VB and C# compilers. The payoff for .NET developers will be increased productivity with early binding, compile-time validation, and IntelliSense for SQL-style queries, resultsets, and XML literals.

Microsoft Research's Cω (C-Omega, formerly X# and Xen) extensions to the .NET type system and C# language were the first steps to a unified type system that treats SQL-style queries, query resultsets, and XML content as full-fledged members of the .NET 1.1 CLR type system. Cω introduced the stream type, which is analogous to the .NET Framework 2.0's System.Collections.Generic.IEnumerable<T> type and contains an ordered, homogeneous set of zero or more items. You use the C# foreach statement to access all the stream's items. Cω also defined constructors for typed tuples (called anonymous structs). Anonymous structs differ from conventional C# structs as follows: Anonymous structs don't have an explicit type name; fields are ordered so you can access them from an array index; fields need not be named (but must be typed); multiple fields with the same name are allowed and return a stream when accessed by name.

Generate CLR Objects
Streams that model relational tables contain sets of anonymous struct items to represent rows. You use a Cω tool (SQL2Comega.exe) to generate a managed assembly from SQL Server's Northwind sample database. A reference to the assembly provides the Northwind namespace and a global, strongly typed instance (DB) of the Database type. DB public members are Northwind tables and views; DB methods represent table-valued functions. This simple Cω console-application code generates and displays values from a stream (rows) of anonymous structs (row) that contain three named fields of the System.Data.SqlTypes.SqlString type from the DB.Customers table:

using System;
using Northwind;

public class Test {
   public static void Main( string[] args ) {
      Console.WriteLine("CustomerID   " +
         "CompanyName   ContactName");
      string cityname = "London";
      rows = select CustomerID, CompanyName, 
         ContactName from DB.Customers 
         where City == cityname
         orderby ContactName;
      foreach( row in rows ) { 
         Console.WriteLine("{0,-12} {1,-21} {2}", 
            row.CustomerID, row.CompanyName, 
            row.ContactName);
      }
      Console.Write("\nPress any key to exit");
   }
}

Notice that the rows variable doesn't have a type specifier, and it consists of SQL-keyword (select-from-where-orderby) operators with column-name and table-name arguments. The compiler infers the type (stream) of the rows variable from the right-hand side of the assignment statement and validates the rows (query) variable at compile time. The foreach statement's row variable is an instance of this anonymous struct:

struct (SqlString CustomerID; SqlString 
   CompanyName; SqlString ContactName)

Substituting row[0], row[1], row[2] for row.CustomerID, row.CompanyName, row.ContactName in the Console.WriteLine statement doesn't change the console output.

Iterating the stream produced from the SQL-style query with the foreach statement returns this query resultset (tuples) to the console:

CustomerID   CompanyName   ContactName
EASTC   Eastern Connection   Ann Devon
CONSH   Consolidated Holdings   Elizabeth Brown
SEVES   Seven Seas Imports   Hari Kumar
NORTS   North/South   Simon Crowther
AROUT   Around the Horn   Thomas Hardy
BSBEV   B's Beverages   Victoria Ashworth

Press any key to continue...

For XML instances, the stream type corresponds to the sequence particle that the W3C XML Schema specification and XQuery 1.0 and XPath 2.0 Data Model working draft define. In this case, the stream type contains an ordered, homogeneous set of zero or more XML nodes, atomic values, or a mixture of nodes and atomic values. Cω also defined constructors for discriminated unions. Discriminated unions support XML Schema's choice particle with a choice type that you implement in Cω like this:

public class choiceTest {
   string testName;
   choice (string testerName; int testerId;);
   choice (DateTime testDateTime; 
      string testDateString;);
   string testResults;
}

Cω also provided nullable types to support instances of value types (such as int, float, or DateTime) whose value is null. Cω nullable types differ from .NET 2.0 Nullable<T> generic types, which provide a wrapper for value types that has a Boolean HasValue property. If HasValue is false, accessing the Value property throws a System.InvalidOperationException. Cω nullable types return null when you access a field or property of a null instance.
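The .NET 2.0 side of that comparison is easy to try. The following is a minimal C# 2.0 sketch, not code from the LINQ preview; the class and method names (NullableDemo, ValueOrDefault, ThrowsOnEmptyValue) are made up for illustration. It checks HasValue before reading Value, and shows that reading Value from an empty Nullable<int> throws:

```csharp
using System;

public static class NullableDemo
{
    // Returns the wrapped value, or a fallback when no value is present.
    public static int ValueOrDefault(int? quantity, int fallback)
    {
        return quantity.HasValue ? quantity.Value : fallback;
    }

    // Returns true when reading Value from an empty Nullable<int> throws.
    public static bool ThrowsOnEmptyValue()
    {
        int? missing = null;
        try
        {
            Console.WriteLine(missing.Value); // never prints; Value throws first
            return false;
        }
        catch (InvalidOperationException)
        {
            return true;
        }
    }

    public static void Main()
    {
        int? unitsOnOrder = null;                           // int? is shorthand for Nullable<int>
        Console.WriteLine(unitsOnOrder.HasValue);           // False
        Console.WriteLine(ValueOrDefault(unitsOnOrder, 0)); // 0
        Console.WriteLine(ThrowsOnEmptyValue());            // True
    }
}
```

The guard in ValueOrDefault is the plumbing that Cω's null-propagating member access made unnecessary.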

Anders Hejlsberg confirmed to InfoWorld's Paul Krill in June 2005 that Cω wouldn't emerge as a separate product nor would Cω features be incorporated into VS 2005. The terminal status of Cω as a research language explains my use of past tense to describe Cω-specific features. However, Hejlsberg said that VS 2005 provides the "cornerstones for the next level of progress that we can make into this domain of better integrating data with programming languages." The new cross-language project, ".NET Language Integrated Query Foundation" (IQF), became "Language Integrated Query for .NET" (LINQ) after final publication of PDC 2005's session abstracts. You can download individual VB and C# LINQ technical previews for the VS 2005 release candidate from Microsoft; this article's sample code is also available for download.

LINQ is a standardized query-definition API for multiple .NET languages and data-source types. The LINQ preview bits are a .NET 2.0 add-in that supports the current VB 8.0 and C# 2.0 compilers, so you needn't wait for an Orcas beta drop to test-drive Cω's successor. PDC 2005's initial C# 3.0 LINQ technology preview supports relational data with the DLinq API for ADO.NET access to remote SQL databases. A subsequent release of the VB 9.0 LINQ technology preview will support DLinq with SQL-style query comprehension operators (see Table 1). The XLinq API provides a lightweight, high-performance surrogate for the XML DOM and supports XPath- and XQuery-like query operators (see Figure 1).

Important language additions to VB 9.0 will be lambda functions, anonymous types, anonymous arrays, object initializers, nullable types, and extension methods. Cω integrated the SQL2Comega.exe and XSD2Comega.exe command-line assembly generators with the VS 2003 IDE. Future DLinq drops undoubtedly will include a command-line code generator to create partial classes from database metadata (see Listing 1). The 2,150-line Northwind.cs file from the \Program Files\LINQ Preview\Samples\SampleQueries folder demonstrates the need to autogenerate the entity classes that define database tables and relationships as objects. It's also reasonable to expect XLinq to provide an integrated codegen tool for creating XML entity classes from XSD documents.

Anonymous methods in C# 2.0 are analogous to Scheme's lambda functions; you can pass unnamed lambda functions as arguments, assign them to variables, or store them as parts of a data structure. C# 2.0 uses anonymous methods to simplify code for event handlers by passing a code block as a delegate parameter. Anonymous methods consist of the delegate keyword, an optional parameter list, and a statement list enclosed within curly-brace delimiters. The VB 8.0 compiler doesn't support anonymous methods directly, but VB 9.0 and C# 3.0 will add lambda functions to support query-operator arguments. For example, the VB 9.0 compiler will translate the Where clause of this variable declaration for a SQL-style query's resultset:

Dim rows = Select c.CompanyName, c.City _
   From c In Customers Where c.Region = "CA"

to a call to this extension method:

Public Delegate Function Predicate(Of T) _
   (ByVal v As T) As Boolean

Shared Function Where(Of T) _
   (seq As IEnumerable(Of T), _
   p As Predicate(Of T)) _
   As IEnumerable(Of T)
End Function

This creates the inline (anonymous) function that's passed to the Where method:

Dim rows = Customers.Where((c) Return _
   c.Region = "CA").Select(...)

which the VB compiler translates to:

Function Lambda(c as Customer) As Boolean
   Return c.Region = "CA"
End Function

that's called by:

Dim rows = Customers.Where(AddressOf _
   Lambda).Select(...)

When connecting to remote SQL databases, DLinq sends the SQL statement to the server so the anonymous function returns only rows that meet the Where operator's criteria and columns specified by the Select operator's field list.
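You can approximate the same Where pattern today with C# 2.0 features alone. The sketch below is an in-memory illustration under stated assumptions, not the preview's implementation: the Customer and Sequence classes are hypothetical, the two customers are sample data, and nothing is sent to a database server. It pairs the generic Predicate<T> delegate with an iterator (yield return) and passes the filter as an anonymous method:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity class; a real one would be generated from the database.
public class Customer
{
    public string CompanyName;
    public string Region;

    public Customer(string companyName, string region)
    {
        CompanyName = companyName;
        Region = region;
    }
}

public static class Sequence
{
    // Stand-in for the Where query operator: yields only the items
    // for which the predicate returns true, in their original order.
    public static IEnumerable<T> Where<T>(IEnumerable<T> seq, Predicate<T> p)
    {
        foreach (T item in seq)
        {
            if (p(item))
                yield return item;
        }
    }
}

public static class WhereDemo
{
    public static void Main()
    {
        List<Customer> customers = new List<Customer>();
        customers.Add(new Customer("Let's Stop N Shop", "CA"));
        customers.Add(new Customer("Lazy K Kountry Store", "WA"));

        // The anonymous method plays the role of the lambda function.
        IEnumerable<Customer> rows = Sequence.Where<Customer>(customers,
            delegate(Customer c) { return c.Region == "CA"; });

        foreach (Customer c in rows)
            Console.WriteLine(c.CompanyName); // Let's Stop N Shop
    }
}
```

Because the iterator is lazy, the predicate doesn't run until the foreach loop pulls items from rows.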

VB 9.0's anonymous types will correspond to Cω's anonymous structs. The VB 9.0 compiler will translate the Select expression of the same SQL-style query:

Dim rows = Select c.CompanyName, c.City _
   From c In Customers Where c.Region = "CA"

to this function that generates the equivalent of a Cω stream by calling the Select extension method:

Public Delegate Function Func(Of T, U) _
   (ByVal v As T) As U

Shared Function Select(Of T, U) _
   (seq As IEnumerable(Of T), _
   m As Func(Of T, U)) _
   As IEnumerable(Of U)
End Function

which maps T to U and generates the new type (the New { ... } expression) in this dot-syntax method call:

Dim rows = Customers.Where((c) Return c.Region = _
   "CA").Select((c) New {.CompanyName _
   := c.CompanyName, .City := c.City })

that corresponds to the following nameless class (the name _Anonymous1 is used as an illustration only) and its variable declaration:

Class _Anonymous1
   Public CompanyName As String
   Public City As String
End Class

Dim rows = Customers.Where((c) Return c.Region = _
   "CA").Select((c) New _Anonymous1 { .CompanyName _
   := c.CompanyName, .City := c.City })
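For comparison, here's roughly what that projection costs in C# 2.0, which has no anonymous types. This is a hedged sketch rather than preview code: the Projection delegate, the Sequence.Select helper, and the CompanyCity class (a hand-written stand-in for the compiler-generated class) are all hypothetical names, and the customer is sample data:

```csharp
using System;
using System.Collections.Generic;

// Hypothetical entity class with only the fields the query needs.
public class Customer
{
    public string CompanyName;
    public string City;

    public Customer(string companyName, string city)
    {
        CompanyName = companyName;
        City = city;
    }
}

// Hand-written stand-in for the class the compiler would generate.
public class CompanyCity
{
    public string CompanyName;
    public string City;
}

// Hypothetical delegate that maps a T to a U.
public delegate U Projection<T, U>(T v);

public static class Sequence
{
    // Stand-in for the Select query operator: applies the mapping
    // to each item and yields the results in order.
    public static IEnumerable<U> Select<T, U>(IEnumerable<T> seq,
        Projection<T, U> m)
    {
        foreach (T item in seq)
            yield return m(item);
    }
}

public static class SelectDemo
{
    public static void Main()
    {
        List<Customer> customers = new List<Customer>();
        customers.Add(new Customer("Let's Stop N Shop", "San Francisco"));

        IEnumerable<CompanyCity> rows =
            Sequence.Select<Customer, CompanyCity>(customers,
                delegate(Customer c)
                {
                    CompanyCity projected = new CompanyCity();
                    projected.CompanyName = c.CompanyName;
                    projected.City = c.City;
                    return projected;
                });

        foreach (CompanyCity row in rows)
            Console.WriteLine("{0}, {1}", row.CompanyName, row.City);
    }
}
```

Declaring CompanyCity by hand and filling it field by field is the boilerplate that anonymous types and object initializers are meant to remove.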

The VB rows variable exposes the anonymous type that contains an instance of the unnamed class for each row. Running under VS 2005 enables databinding by casting and assigning the resultset to, for example, a DataGridView control's DataSource property with this statement:

Me.dgvResult.DataSource = CType(rows, DataTable)

DLinq will support table update operations with dynamic SQL or stored procedures (see Figure 2) and use the new .NET 2.0 System.Transactions model's TransactionScope methods for local and distributed transactions. Thus, DLinq is a candidate for middle-tier or smart-client components that update multiple tables.

According to Samuel Druker, development lead on the WinFS team, who presented "WinFS and ADO.NET: Future Directions for Data Access Scenarios" (DAT312) at PDC 2005, WinFS will use LINQ for data modeling and programming. If WinFS depends on ObjectSpaces' OPath or a newer OSQL as its filter language, LINQ might gain an API for another query language that could be named "OLinq."

Microsoft makes no guarantee at this point that the Orcas release version will include LINQ. It's possible that LINQ might suffer the same fate that befell ObjectSpaces and XQuery in VS 2005 and .NET 2.0. I believe that LINQ, DLinq, XLinq, and possibly "OLinq" will be the primary incentives for data-oriented .NET developers to upgrade to VS .Next; if so, there's a high probability that LINQ will make the RTM cut. If you have the latest LINQ preview bits, give them a test-drive with VS 2005 or its Express editions. You also can download and run the Cω plug-in for VS 2003 and C# 1.1 as a preview of the LINQ preview. I found comparing Cω code to that for LINQ, DLinq, and XLinq to be an interesting experience. I'm sure you'll agree that LINQ technology will correct the "impedance mismatch" and dramatically reduce the time and effort needed to implement .NET relational-to-object and XML-to-object mapping.
