When To Use SQLCLR, And When Not To
Use these guidelines when deciding if SQLCLR objects are right for your data-intensive .NET 2.0 project.
It would be surprising if there were any .NET developers or SQL Server DBAs who aren't aware by now that SQL Server 2005 and its Express edition support an in-process implementation of the .NET 2.0 CLR. Early Microsoft whitepapers and marketing materials promoted the capability to execute managed code in the server process as the most significant new feature of the database that was then known as Yukon. Many of the first Web and print articles about Yukon encouraged SQL-challenged .NET developers to create database objects with VB or C# as an alternative to gaining proficiency with writing and executing T-SQL queries. A surprising number of the Web posts included a rhetorical question: Is T[ransact]-SQL Dead?
The jury's still out on the question of the importance of CLR integration to the ultimate commercial success of SQL Server 2005, either for new installations or as upgrades to SQL Server 2000. But the verdict is unanimous on developers' need for T-SQL competency. Regardless of their skill set for writing procedural VB or C# code, developers of data-intensive .NET projects must be able to author, execute, optimize, and debug complex T-SQL set-based code for traditional create-retrieve-update-delete (CRUD) operations. Developers in organizations where DBAs have total control of production and development databasesas well as write the T-SQL code for and manage all stored procedures and other database objectsmight represent exceptions. However, such an authoritarian scenario is uncommon in today's IT milieu.
The important questions for most data architects and developers of data-intensive .NET business layers or presentation tiers are if and when to replace traditional T-SQL statements with managed code. If your employer's or client's recalcitrant DBAs won't enable the CLR in "their" databases with the Surface Area Configuration for Features dialog, these questions are moot. Otherwise you'll probably find at least some of this article's guidelines will aid in making appropriate decisions for adding SQLCLR objects to production SQL Server 2005 databases.
Use T-SQLnot SQLCLRstored procedures or table-valued user-defined functions (TVFs) for set-based, relational operations. This opinion qualifies as conventional wisdom because it appears to be held by all recognized SQL experts. A more universaland probably more controversialrecommendation is to use T-SQL unless you have a compelling reason to substitute a SQLCLR object and a full understanding of the potential effects of executing the managed code on SQL Server performance and scalability. For example, if you're executing procedural operations on cursors in a T-SQL stored procedure or TVF, a corresponding SQLCLR object is likely to deliver better performance. The Yukon development team has simplified the code for SQLCLR TVFs dramatically in later Community Technical Previews (CTPs).
Replace extended stored procedures (XProcs), including OLE Automation stored procedures (sp_OAs), with SQLCLR objects. This recommendation is another slam-dunk. XProcs have a deservedly bad reputation for failing to integrate their execution with what Microsoft senior developer Chris Brumme calls "the unusual threading environment and resource rules that exist inside SQL Server." Chris also observes that "it takes superhuman knowledge and skill to develop a bug-free XProc." The sp_OA stored procedures are XProcs that make late-bound calls to COM objects. You can use the CLR's COM interop feature to provide early binding and improved performance, if you need to use the COM object. However, C# or VB code to replace XProcs and sp_OAs should execute in a middle tier or business layer if possible.
Don't move middle-tier data management or business layer code into the SQL Server process with SQLCLR objects unless your component requires large amounts of data, such as all or most of a large table's rows, to be moved over the network. In many cases you can minimize network traffic by using T-SQL to roll up numerical data instead of aggregating all rows in the middle tier. For example, if you're generating a time-series crosstab DataTable in a business layer, generate aggregate values with T-SQL SUM functions and date-based GROUP BY clauses or use the new PIVOT operator on the server. The number of cells of data you send to the middle tier is equal to, or if you add row and column totals, less than the number of cells in your crosstab DataTable.
Understand the differential cost of scaling up database servers vs. scaling out Web and application servers before you consume expensive SQL Server 2005 resources with SQLCLR data management objects. Microsoft's early and somewhat myopic vision of autonomous computing proposed getting close to the data by scaling out database servers in self-contained bricks. A brick is a low-cost, commodity server that runs the operating system (Windows Server 2003) and the relational database management system, or RDBMS (SQL Server 2005), and has local disk storage, high-speed I/O, and lots of RAM. The idea in 2003 was to combine multiple bricks into Autonomous Computing Cells (ACCs), which made up a Service Center (SC). Obviously, the cost of per-processor licenses for Windows Server 2003 and SQL Server 2005 is several times that of the "low-cost commodity server." Scaling out the middle tier onto low-cost Web blade servers running Windows 2003 Server Web Edition is far less costly than scaling up a database server to achieve similar performance improvement.
Take full advantage of SQL Server 2005's new performance counters and, especially, dynamic management views (DMVs) to monitor resource usage by SQLCLR objects. You can use sys.dm_os_memory_objects to monitor total server and CLR memory usage, sys.dm_os_wait_stats to display CLR wait statistics, sys.dm_exec_requests to show pending requests, sys.dm_exec_query_stats to gather query performance data, and sys.dm_os_performance_counters to collect CLR performance counter values. "The Database Administrator's Guide to the SQL Server Database Engine .NET Common Language Runtime Environment" whitepaper has examples of the required DMV syntax.
Don't underestimate the management costs of SQLCLR objects. When you or a DBA deploy SQLCLR objects, the database stores the .NET assembly but not its source code. SQL Server project source code and version control contribute to DBAs' training costs and management workload. A minor change to a SQLCLR user-defined data type (UDT) requires dropping and re-creating all references to the UDT column, and then the column and its data. Some organizations are now foregoing T-SQL stored procedures in favor of parameterized or dynamic T-SQL queries because of increased management costs. Future relational-to-object mapping APIs for the business layer, such as Language Integrated Query (LINQ) for .NET, currently rely on dynamic T-SQL for SELECT queries; LINQ supports stored procedures for INSERT, UPDATE, and DELETE operations, but there's no indication so far that the release version will support stored procedures for data retrieval.
SQL Server maven Kimberly Tripp said at 23:00 into her .NET Rocks! podcast, "The two most misused features of SQL Server 2005 will be a combination of CLR and XML" (see Resources). She's undoubtedly correct with respect to CLR abuse and possibly right about improper use of the XML data type. Kimberly closed her interview with this advice to SQL Server 2005 DBAs: "Everyone needs to be a jack-of-all-trades and a master of some. Find out about all the features, know what they are in general and if something looks interesting, learn about it as best you can so you can implement it properly." Developers of data-intensive .NET projects also are well-advised to heed Kimberly's advice.