VS 2008: The MIA Features

Visual Studio 2008 is an ambitious new release and it includes a slew of new language features and tools that were required to get LINQ up and running by itself and with SQL Server 200x. Of course, not everything planned made it into the product. Here's a description of the elements that were pared back and what their status is now.

Technology Toolbox: VB, C#, ASP.NET, XML, VS 2008, LINQ

Microsoft often releases ambitious new software products and services that are missing promised components or features. Windows NT "Cairo" and its Object File System (OFS) were announced at the 1992 Windows NT Developers Conference and never materialized into an operating system. The most dramatic of these disappearances was when Microsoft officially dropped the most important "Pillar of Longhorn," the Windows File System (WinFS), from Windows Vista and Windows Server 2008 on June 23, 2006. The most common reason for dropping product elements is to meet an ephemeral release-to-manufacturing (RTM) date that continues to slip even as the new features list shrinks. OFS and WinFS shared two common characteristics: They consumed a substantial part of Microsoft's development resources at the time and were technologically impractical or impossible at the time of their planned release.

Visual Studio (VS) 2008 met the 2007 RTM date that Microsoft's Scott Guthrie projected on Feb. 8, 2007, by RTMing more than a month before year-end. The VS and Data Programmability (ADO.NET) teams weren't about to postpone VS 2008's release for late-blooming features or components. In this article, I'll give you the details on those elements I had expected in VS 2008 that either didn't make the RTM cut-off date or were announced and then went into limbo: Entity Framework, multi-tier capabilities for LINQ to SQL, Project Jasper, LINQ to XSD, the VS 2005 Schema Designer, and a Plain Old XML wire format for Project Astoria, to name a few (see Table 1 for a summary list). I'll also cover where these technologies are now and when or if we're likely to see them in the future.

The Entity Framework
The Entity Framework (EF) and LINQ to Entities implementation will ship as a VS 2008 add-in either with or at about the same time as SQL Server 2008 in the first half of 2008. The ADO.NET team originally scheduled EF, which implements the Entity Data Model (EDM), to ship with the VS 2008 RTM bits. EF is the enterprise-grade successor to LINQ to SQL and the data access layer (DAL) for ADO.NET Data Services (see "VS 2008: The Road Ahead") and the Project Jasper "incubator project." EF is considered by most .NET developers to be the replacement for ObjectSpaces, which fell into the black hole of WinFS and went missing from VS 2005 during its beta period.

Like LINQ to SQL, EF is an object/relational mapping (O/RM) tool and persistence layer for relational databases. Unlike LINQ to SQL, which is limited to a 1:1 relationship between relational tables and CLR entity classes and supports only SQL Server 200x/Compact Edition as its persistence store, EF provides sophisticated mapping capabilities and has a provider model that ultimately will connect to multiple databases. EF uses three XML schema files, each known by its acronym, for relational-to-object mapping: SSDL for the physical layer that represents the database, MSL for the mapping layer, and CSDL for the conceptual layer that defines the EDM. The conceptual layer isolates the EDM from database schema changes and enables creating a viable EDM from legacy databases that don't observe relational niceties. The EDM Designer is a graphical mapping tool (similar to the LINQ to SQL designer) that simplifies the mapping process (see Figure 1). IBM Corp., MySQL AB, and Oracle Corp. have EF data providers in development, and IBM has demonstrated its data provider for DB2 and Informix. EF supports three types of inheritance: table per hierarchy, table per type, and table per concrete type. LINQ to SQL supports only table per hierarchy inheritance with a discriminator column.
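Table per hierarchy inheritance with a discriminator column is easier to picture with a small example. The following is a minimal, language-neutral sketch in Python, not LINQ to SQL or EF code; the Contact, Customer, and Employee classes, the column names, and the "C"/"E" discriminator values are all hypothetical. It shows one table holding every type in the hierarchy, with one column deciding which class each row becomes.

```python
from dataclasses import dataclass


@dataclass
class Contact:
    contact_id: int
    name: str


@dataclass
class Customer(Contact):
    credit_limit: float = 0.0


@dataclass
class Employee(Contact):
    salary: float = 0.0


# Hypothetical discriminator values mapped to entity classes.
DISCRIMINATOR_MAP = {"C": Customer, "E": Employee}


def materialize(row: dict) -> Contact:
    """Build the entity type named by the row's discriminator column."""
    cls = DISCRIMINATOR_MAP.get(row["ContactType"], Contact)
    if cls is Customer:
        return Customer(row["ContactID"], row["Name"], row["CreditLimit"])
    if cls is Employee:
        return Employee(row["ContactID"], row["Name"], row["Salary"])
    return Contact(row["ContactID"], row["Name"])


# All types share one table, so columns that belong to another type are NULL.
rows = [
    {"ContactID": 1, "Name": "Alice", "ContactType": "C",
     "CreditLimit": 5000.0, "Salary": None},
    {"ContactID": 2, "Name": "Bob", "ContactType": "E",
     "CreditLimit": None, "Salary": 72000.0},
]

entities = [materialize(row) for row in rows]
```

The other two strategies the text names differ in storage rather than in the resulting object model: they spread the same hierarchy across several tables instead of one.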

EF has its own SQL dialect, Entity SQL or eSQL, which includes proprietary extensions to support queries against a conceptual layer that substitutes associations between objects for relations between tables. Entity SQL is strictly a data retrieval query language; version 1.0 doesn't include Data Manipulation Language (DML) commands for INSERT, UPDATE, or DELETE operations. (Entity SQL has no relationship to ObjectSpaces' OPath query language, which was based on XPath.) A DataReader returns query resultsets in a firehose cursor unless you employ EF's Object Services to populate EDM entity collections. LINQ to Entities translates strongly typed, compiler-checked queries in source code to Entity SQL for execution and mapping to CLR object collections.

Entity Framework architect Mike Pizzo said in an April 28, 2007 blog post: "[W]e have decided to ship the ADO.NET Entity Framework and Tools during the first half of 2008 as an update to the Orcas release of the .NET Framework and Visual Studio." The shipping delay hasn't diminished early adopters' interest in EF; there's been at least one EF presentation at every major .NET developer-oriented conference in 2007.

Multi-Tier LINQ to SQL
According to an Oct. 15, 2007 blog post by LINQ to SQL Program Manager Dinesh Kulkarni, LINQ to SQL doesn't have an "out-of-the box multi-tier story." LINQ to SQL's top-level object, the DataContext, isn't serializable, and WCF 3.5's DataContractSerializer refuses to encode many:one entity associations (EntityRefs) because of a cyclic relationship problem that requires use of XML IDs and REFs. This means that LINQ to SQL doesn't support cross-process remoting with binary serialization or service-oriented architecture (SOA) that uses SOAP, JavaScript Object Notation (JSON), or Plain Old XML (POX) as the wire protocol. Substituting the NetDataContractSerializer for the DataContractSerializer enables bidirectional encoding of object graphs with cyclic relationships. However, inserting a custom SerializerOperationBehavior instance into the Windows Communication Foundation (WCF) message-processing chain requires adding semi-documented, hand-written code to the service and its client, as described here.
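The cyclic relationship problem is independent of .NET, so it can be illustrated without WCF. The sketch below is a language-neutral illustration in Python, not the DataContractSerializer's actual behavior or format; the entity shapes and the "$id"/"$ref" markers are assumptions chosen for the example. An order points to its customer, the customer points back to its orders, and a plain tree serializer can't finish. Assigning each object an ID and emitting references for repeat visits breaks the cycle, which is the role the XML IDs and REFs play.

```python
import json

# Hypothetical entities: an order refers to its customer (many:one),
# and the customer refers back to its orders (one:many).
customer = {"type": "Customer", "name": "ALFKI", "orders": []}
order = {"type": "Order", "order_id": 10643, "customer": customer}
customer["orders"].append(order)


def naive_serialize(obj):
    """Tree serialization: fails as soon as the graph contains a cycle."""
    try:
        return json.dumps(obj)
    except ValueError as exc:  # raised by json when it detects a cycle
        return f"error: {exc}"


def serialize_with_refs(root):
    """Flatten the graph; each object gets an id, repeat visits become refs."""
    ids = {}    # id(obj) -> assigned identifier
    nodes = []  # flattened objects in first-visit order

    def visit(value):
        if isinstance(value, dict):
            key = id(value)
            if key not in ids:
                ids[key] = len(ids) + 1
                node = {"$id": ids[key]}
                nodes.append(node)
                for name, child in value.items():
                    node[name] = visit(child)
            return {"$ref": ids[key]}
        if isinstance(value, list):
            return [visit(item) for item in value]
        return value

    visit(root)
    return json.dumps(nodes)


def deserialize_with_refs(text):
    """Rebuild the object graph, restoring shared and cyclic references."""
    nodes = json.loads(text)
    objects = {node["$id"]: {} for node in nodes}

    def resolve(value):
        if isinstance(value, dict) and "$ref" in value:
            return objects[value["$ref"]]
        if isinstance(value, list):
            return [resolve(item) for item in value]
        return value

    for node in nodes:
        target = objects[node["$id"]]
        for name, value in node.items():
            if name != "$id":
                target[name] = resolve(value)
    return objects[1]  # the root is always visited first


restored = deserialize_with_refs(serialize_with_refs(order))
```

After the round trip, restored["customer"]["orders"][0] is the same object as restored, so the association survives in both directions. The cost is a wire format that every client must understand, which is where interoperability becomes the sticking point.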

Lack of multi-tier capabilities doesn't mean that LINQ to SQL is a non-starter. LINQ to SQL is a great rapid application development (RAD) tool and lets relationally oriented .NET programmers get up to speed quickly with object-oriented business entities. Inability to cross physical machine boundaries doesn't prevent developers from implementing DALs that provide separation of concerns. For example, if you use ASP.NET's ObjectDataSource control, you can make the DataContext class private so that the presentation layer doesn't have access to sensitive property values such as DataContext.ConnectionString. Using the more versatile LinqDataSource control, which supports server-side paging and sorting, requires a direct connection to the DataContext instance.

In spring 2007, LINQ pilgrims were optimistic that LINQ to SQL would support WCF-based remoting. Said Matt Warren, LINQ to SQL's head technical guru, in an April 12, 2007, post in the LINQ Project General Forum:

There is more work coming that makes 3-tier scenarios easier. The re-attach and playback API's are included because they are the bare minimum required to make any reconnecting work at all, however cumbersome it may seem at this time. There is an upcoming feature that automatically databinds LINQ to SQL objects directly to webforms that use these API's to communicate changes back into the DataContext. There is also a technology in the works that automatically serializes LINQ to SQL objects to other tiers, change tracks and data binds the objects for you on that tier, and serializes the objects back, solving the cyclic reference serialization problem and the indescribable schema problem for objects with change sets; which seems to match what many of you are describing.

He went on to explain:

The mechanism that does the change tracking on the client is similar to a mini-connectionless DataContext. It is a type that packages up all the objects, lists of objects, and graphs that you want to send to the client. It serializes itself and everything you've given it automatically. (It implements IXmlSerializable.) On the client, the same class also manages change-tracking for all the objects that were serialized with it. When serialized again (on the trip back) it serializes both the objects and their change information that it logged. Back on the middle tier, you just make one call to re-attach the whole package to a new DataContext instance and then call SubmitChanges.

Of course, this is what everyone wanted, not only for crossing process boundaries, but also for mocking the connected DataContext for unit testing and Test-Driven Development, as mentioned here. Unfortunately, on June 19, 2007, Matt Warren's answer became: "No, this feature will not be in the first release of LINQ to SQL. … It is only missing because of lack of time. When I mentioned it, I was talking about our forward thinking and a sample I'm trying to put together." It's possible to subclass a DataContext on the client side without a connection but you can't avoid an exception when you execute a LINQ to SQL query to populate a DataContext.Table<TEntity> object. I'm still hoping that Matt Warren will finish his sample.
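Because the feature never shipped, its real design is unknown. The following Python sketch only illustrates the general idea Warren describes: a package that travels to the client with the objects, remembers their original values, and comes back carrying enough information to compute a change set. The ChangePackage class, its methods, and the JSON wire format are all hypothetical, and the entities are reduced to dictionaries keyed by primary key.

```python
import copy
import json


class ChangePackage:
    """Hypothetical stand-in for the 'mini-connectionless DataContext'."""

    def __init__(self, entities):
        # entities: dict of primary key -> dict of column values
        self.current = copy.deepcopy(entities)
        self.original = copy.deepcopy(entities)

    def changes(self):
        """Compare the current values with the original snapshot."""
        inserts = {k: v for k, v in self.current.items()
                   if k not in self.original}
        deletes = [k for k in self.original if k not in self.current]
        updates = {k: v for k, v in self.current.items()
                   if k in self.original and v != self.original[k]}
        return {"inserts": inserts, "updates": updates, "deletes": deletes}

    def to_wire(self):
        """Serialize both the objects and their original values."""
        return json.dumps({"current": self.current,
                           "original": self.original})

    @classmethod
    def from_wire(cls, text):
        data = json.loads(text)
        package = cls(data["original"])
        package.current = data["current"]
        return package


# Middle tier: package the entities and send them to the client.
package = ChangePackage({"ALFKI": {"city": "Berlin"},
                         "ANATR": {"city": "Mexico City"}})
wire = package.to_wire()

# Client: edit the entities; the package keeps the original snapshot.
client = ChangePackage.from_wire(wire)
client.current["ALFKI"]["city"] = "Munich"
del client.current["ANATR"]
client.current["BOLID"] = {"city": "Madrid"}

# Middle tier again: recover the change set from the returned package.
returned = ChangePackage.from_wire(client.to_wire())
change_set = returned.changes()
```

In the design Warren outlines, the middle tier would hand the returned package to a new DataContext in a single re-attach call and then call SubmitChanges; this sketch stops at producing the change set.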

LINQ to SQL had several other features go missing as well.

Not least among these items is the fact that prospective users of LINQ to SQL wanted a more granular approach to schema changes in the underlying database than regenerating all the partial classes and their methods.

Also, support for value objects would have been nice. Martin Fowler defines a value object as "a small, simple object like money or a date range, whose equality isn't based on identity." An equally common value object is an address. C# implements value types as structs and VB as structures, but LINQ to SQL version 1 doesn't support mapping them. EF is expected to support value objects as "complex types" in a future community technology preview (CTP).

Finally, LINQ to SQL lacks many-to-many associations without linking entities. EF supports hiding the entity that represents the linking table required to establish a many:many relationship in a relational database; LINQ to SQL doesn't. However, EF hides foreign key values as navigation properties, so it's difficult to reconstruct object associations from serialized entities with client-side code.
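To make the distinction concrete, here is a minimal Python sketch, not EF or LINQ to SQL code, of an object model in which the linking table never surfaces as an entity. The Employee and Territory classes and the sample rows are hypothetical. The rows of the linking table are consumed while the graph is built and appear only as collection-valued navigation properties.

```python
class Employee:
    def __init__(self, employee_id, name):
        self.employee_id = employee_id
        self.name = name
        self.territories = []  # navigation property; no linking entity


class Territory:
    def __init__(self, territory_id, description):
        self.territory_id = territory_id
        self.description = description
        self.employees = []    # the other end of the many:many association


def build_graph(employee_rows, territory_rows, link_rows):
    """Turn three tables into two entity sets; the link rows vanish."""
    employees = {eid: Employee(eid, name) for eid, name in employee_rows}
    territories = {tid: Territory(tid, desc) for tid, desc in territory_rows}
    for eid, tid in link_rows:
        employees[eid].territories.append(territories[tid])
        territories[tid].employees.append(employees[eid])
    return employees, territories


# Hypothetical table contents; link_rows is the linking table.
employee_rows = [(1, "Davolio"), (2, "Fuller")]
territory_rows = [("01581", "Westboro"), ("02116", "Boston")]
link_rows = [(1, "01581"), (1, "02116"), (2, "02116")]

employees, territories = build_graph(employee_rows, territory_rows, link_rows)
boston_staff = [e.name for e in territories["02116"].employees]
```

The entities here carry no foreign key values at all, which mirrors the drawback noted above: if the associations don't survive serialization, the client has nothing from which to rebuild them.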

Project Jasper
Project Jasper was an incubation project based on EF that was introduced along with Project Astoria (now ADO.NET Data Services) at the MIX07 conference. The Jasper May 2007 CTP let you use VB 9.0 or IronPython to generate a default EDM mapping automatically for the database in a connection string. (VB 9.0 and IronPython are the only languages supported because late binding is required by the Jasper engine.) Project Jasper with VB 9.0 also simplified automatic data binding by eliminating source-code generation and enabling LINQ to Entities queries. The goal of Jasper is to support rapid, iterative development of data-serving Web applications with an API designed specifically for dynamic languages.

The Project Jasper team didn't update the May 2007 CTP for Orcas beta 2. What's worse, installing the Project Jasper May 2007 CTP under Orcas beta 2 trashed the Orcas installation and usually required reinstalling the operating system or virtual server image. Microsoft's Andrew Conrad, Jasper's development lead, responded with this comment to a blog post entitled "Has Project Jasper Been Swallowed by a Black Hole?":

We have a beta 2 update to Jasper which we are using in house--but decided not to release this publicly due to the limited amount of feedback we got from folks wanting an update. … We could do a release for Orcas RTM, but need to hear that developers are interested in this.

You can read the entire post and comment here. On the basis of this, it seems to me that a productized version of Jasper isn't in the works.

Several other MIA technologies include LINQ to XSD for VS 2008 RTM, the XML Schema Designer, and Astoria's POX wire format.

LINQ to XSD is a Microsoft LINQ implementation that enables strongly typed LINQ to XML queries. Ralf Lämmel updated the November 2006 Preview Alpha 0.1 for the May 2006 LINQ CTP to preview alpha 0.2 for Orcas beta 1 on June 5, 2007, but there was no update for Orcas beta 2. Right before going to press, Microsoft's Shyam Pather announced at the XML 2007 conference in Boston that the XML Team intended to resuscitate the LINQ to XSD incubator project, as well as deliver LINQ to Stored XML for querying SQL Server 2005+'s XML data type. Read all about the two projects here.

What's Missing from XML
The XML Schema Designer of VS 2005 and earlier is missing from VS 2008 and no replacement is included in the RTM bits. It's doubtful whether many developers will miss the designer because it was difficult to use and offered few advanced XSD editing features.

The XML team released a CTP1 in August 2007 for a new XML Schema Editor that adds a Schema Explorer to the XML Editor. The Schema Explorer CTP provides a read-only tree view of XSD elements in a window adjacent to the XML Editor, but it doesn't include a graphic designer (see Figure 2). A graphical, read/write XSD designer is expected some time in 2008.

The initial serialization formats for ADO.NET Data Services (previously code-named "Astoria"--see "VS 2008: The Road Ahead") were POX, JSON, and a subset of the XML syntax for Resource Description Framework (RDF/XML). The Astoria team quickly dropped RDF/XML due to lack of developer interest, added the Atom Publishing Protocol (APP), and later substituted a proprietary Microsoft XML Infoset syntax called Web3S (Web Structured, Schema'd, and Searchable) for POX. In early November 2007, the team dropped Web3S but didn't restore POX as a supported format.

APP and JSON are reasonably well suited for updating entities, but it's likely that the overwhelming majority of Astoria clients will be retrieve-only. Astoria is built on WCF 3.5, which supports XML, Atom, and JSON serialization, so providing the original read-only XML payload format would deliver easy-to-parse XML to applications that need it without a significant additional developer investment.

That's about it for the list of what was discussed but fell out or never made it in. I would have appreciated it had Microsoft devoted more development resources to remoting, serialization, and WCF compatibility for LINQ to SQL and Entity Framework during their CTP and beta periods. Serious requests for user input on serialization requirements for EDM didn't begin until Daniel Simmons' Nov. 20, 2007, blog post on how deep EDM object graph serialization should go. The product teams eschewed bidirectional XML serialization for LINQ to SQL entities and serialization of any kind for EF EntitySets because of the lack of an "interoperable" solution. While I appreciate Microsoft's increasing support for widely recognized industry standards, the probability of an interoperable solution to XML serialization of object graphs with cyclic references that doesn't involve intellectual property licenses appears to me to be about zero. I'm not sanguine about the prospects for an option that exposes foreign key values in the serialized XML stream, which would enable creating many:1 associations on the service client. I'm also disappointed that ADO.NET Data Services probably won't support POX as a wire format in its ultimate incarnation; Atom and RSS seem to me to have far too much overhead for simple RESTful CRUD operations on "data in the cloud."

Despite my reservations about delayed and missing features in VS 2008, the product contains enough interesting new and improved technologies to provide article and book fodder until Visual Studio "Hawaii" (or whatever code name Microsoft assigns to VS 10) hits the streets, whenever that may be.
