VSM Cover Story
Inside SQL Server 2008
SQL Server 2008 isn't revolutionary by any stretch, but it includes a slew of new features that probably make this a no-brainer upgrade for most developers who rely on previous versions of SQL Server.
TECHNOLOGY TOOLBOX: SQL Server 2008
Microsoft released to manufacturing SQL Server 2008 in August, a mere three years after the previous release. I know some say that three years is too short a period to release a major upgrade to server software, but many people, including me, are glad to have the enhancements sooner than later.
SQL Server 2008 is not a fundamental rewrite of its predecessors. Instead, Microsoft has built upon what's good and solid in SQL Server 2005 and added some exciting new productivity features and many general enhancements that make the software more useful for supporting a wider variety of applications. Some have dubbed this release the largest Service Pack Microsoft has ever released. I'm not sure if that's intended as a compliment or an insult, but I think it's a good thing. SQL Server is a maturing product, and neither I, nor my clients, have any desire to deal with brand-new server software that's critical to support business applications and processes every few years.
SQL Server is a server product and is mostly in the domain of system and database administrators. But it's also a repository for data used by applications, which also brings the product into the domain of developers. This new release has a lot of goodies that support application development directly.
I'll walk you through several of the new and improved features that I believe are most useful and interesting for developers, but you should also be sure to check out the blogs and other aspects of the larger online community that has built up around SQL Server (see "Learn More About SQL Server 2008").
Management Studio Gets Several Enhancements
Since SQL Server 2005, Management Studio has been an extended version of Visual Studio. In 2005, the implementation was useful but a bit half-baked. In SQL Server 2008, Microsoft has made Management Studio a worthy environment for both administrators and developers. (Members of both groups who prefer command-line interfaces can also use the extensible Windows PowerShell.) Many features will be familiar to users moving to SQL Server 2008, but there are also many new features that make working with SQL Server much easier.
For developers, probably the nicest new feature in Management Studio is IntelliSense. Long a staple in Visual Studio, IntelliSense lets you write code in a Query Editor Window and reduce the number of times you have to go to Books Online to look up syntax or spelunk Object Explorer to find the name of that stored procedure you need. You also get VS-style red squiggly lines that show syntax errors (see Figure 1). IntelliSense provides help with parameters for functions and stored procedures.
IntelliSense in SQL Server 2008 works largely as it does in VS, providing you with a list of objects and methods as you type code. But the implementation in SQL Server 2008 takes some getting used to, particularly if you've written a lot of LINQ code in VS 2008. For example, when you start typing a SELECT statement and hit the space after SELECT, you don't get a list of all the fields in every table in the database. But once you get to the FROM clause, you get some help both with the schema name and the table or view within that schema (see Figure 1). You need to get into the habit of using schemas explicitly, even if it's the DBO schema, to enable IntelliSense to help you write code.
The second most-coveted new feature in Management Studio is T-SQL debugging. You can now debug code directly from within Management Studio, which provides all of the features you expect, including the ability to step through code, view and change local variables, watch expressions, and set breakpoints. The Call Stack, Breakpoints, Command, and Output windows function much as they do in VS, providing good insight into what your code is doing and where it's going haywire.
Debugging in Management Studio is nowhere as deep as it is in VS, but it is functional and provides a lot of debugging tools. Of course, you can still enter a T-SQL debugging session from within VS, which means you can have the best of both worlds.
Object Explorer Window Becomes Useful
Management Studio's Object Explorer has long provided a nice view into the many persistent and virtual objects in a database and server. But the Object Explorer Details window, which by default appears to the right of Object Explorer when you first start Management Studio, was less than useless in SQL Server 2005. For the most part, it just displayed the same list of objects shown in Object Explorer. The tab took up space, and many users simply closed the window.
In SQL Server 2008, Object Explorer Details often provides useful information, such as when you select the Databases node in Object Explorer (see Figure 2). The views are highly customizable, letting you display exactly the information you find most useful. The information display is context sensitive for whatever you've selected in Object Explorer. For example, when you select the Tables node in any database, including system databases, you can see a quick overview of the characteristics of all tables, including the schema name, creation date, number of rows, and more. You can sort the data on any column and add or remove columns to display the information that interests you. Management Studio remembers your selections, so you'll always see the same information for that type of object.
You can also search for objects from the Objects Explorer Details window when you know only part of the name of an object or don't remember what database it's in. For example, you remember that an object has a name that includes the string EmployeeHireInfo, but you don't remember the database. You search using the % wildcard to get the list (see Figure 3). The search is context-sensitive, depending on what you have selected in Object Explorer. If you're within a database, the search is limited to that database; otherwise, it's across all databases.
Management Studio has a lot of other new features, and I discover more every day. Two more that I recently discovered include the ability to query multiple servers by defining a server group, and the ability to configure the number of rows returned when opening a table to select or edit its contents.
2008 Adds New T-SQL Data Types
Each new release of SQL Server always includes intriguing new T-SQL statements and functions to support new database server features that let developers do new things or improve on old ways of doing things.
One long-overdue addition is SQL Server 2008's new date and time types. Developers have long had to struggle with a couple of date and time types that tightly bind dates with times. That's fine when you need to record a specific time on a specific date, but it often means writing some gnarly T-SQL code to separate the two, such as when all you need to store is a time. The new DATE type is just a date, in the range of 0001-01-01 through 9999-12-31, which equates to Jan. 1, A.D. 1 through Dec. 31, A.D. 9999. That's not exactly a valid range of dates -- it straddles the changes that came with the adoption of the Gregorian calendar in 1582 -- but it provides a simple date type usable for most business applications. SQL Server stores this information in three bytes, a vast improvement over the DATETIME's eight bytes, which is a nice savings when you don't need the overhead of the time data.
The equivalent TIME type has a range of 00:00:00.0000000 through 23:59:59.9999999, stored in five bytes at the default 100 nanosecond precision. The new DATETIME2 type doesn't show much imagination in how it was named, but it's useful, providing a larger date range, a larger default fractional precision, and optional user-specified precision stored in six to eight bytes. A new DATETIMEOFFSET type includes time zone information embedded in the type. The date and time enhancements in SQL Server 2008 make it much easier to work with temporal data, and the database stores the data much more efficiently.
One of the sexiest new features in SQL Server 2008 is spatial data types. If you've ever worked with spatial data in a database, such as latitudes and longitudes or locations in a grid, you've probably developed your own types to support basic operations and conversions. It's not trivial code. But now SQL Server 2008 has built-in support for two kinds of spatial-data systems. The geometry types support planar or "flat-earth" coordinate data. The geography types store ellipsoidal data that stores locations on the earth's surface, a flattened sphere. Whether you're storing GPS data scattered around the globe or need to store the coordinates that define complex shapes on a rectangular surface, you'll find a lot of features in these data types, along with dozens of useful methods.
It's common to store hierarchical data in a database, even though relational databases don't support hierarchies easily. You can create hierarchies with self-joins, but you generally need to do all the work. SQL Server 2008 introduces the HierarchyID data type, which greatly simplifies working with hierarchical data, complete with functions that make it easy to navigate hierarchies. It doesn't make data hierarchies a substitute for the native structure of XML data, but it does simplify operations. SQL Server maintains the structure of the data, supports random insertions and deletions, and supports location-based comparisons. You can index the data either breadth-first or depth-first, depending on the nature of the data and how your applications access it.
T-SQL in SQL Server 2008 hasn't received any major changes, but the new version includes many features that make code simpler and more efficient. There are a few syntax enhancements that developers will like, including a couple that make T-SQL seem more like a "real" programming language. You can now declare and initialize variables in a single statement. As I alternate between .NET programming languages and T-SQL, my fingers have long tried to do this, rewarded with the inevitable syntax errors. But now this statement works:
DECLARE @id int = 5
Another nice new programming feature is compound operators. There's no rocket science here; this code prints "6":
DECLARE @id int = 5
SET @id += 1
T-SQL supports compound operators for addition, subtraction, multiplication, division, and modulus, as well as the bitwise operators.
Sparse columns provide an efficient storage mechanism for data that consists of many nulls. When defined using the SPARSE keyword on a column definition, any nulls in the column require no storage space at all. There are a few restrictions; for example, you can't use sparse columns with a few complex data types such as the spatial types, image and text fields, or user-defined types. But using sparse types requires substantially less disk storage. You can also use sparse columns with column sets, which is untyped XML data that combines all the sparse columns in the table. Column sets are useful when the table contains many sparse columns and when working with these columns individually could prove tedious.
One of my favorite new T-SQL features is table-valued parameters (TVPs). This one feature single-handedly will save you from a lot of ugly T-SQL code. Have you ever had to pass several pieces of data as a parameter to a stored procedure? Maybe it was a comma-delimited list or some other array-like structure. You'd have to write some nasty parsing code to split up the values, then probably use a loop to process the data. SQL Server 2005 introduced a table data type, but you couldn't pass it to a procedure.
TVPs solve these kinds of problems elegantly by letting you pass -- as the name suggests -- a table-valued parameter to the procedure or function. Then, in the body of the procedure, you can use the set-based features of SQL to process the data, such as by inserting it into a persistent table.
Here's a simple example of using a TVP. There are four simple steps. First, create a persistent TABLE type with the schema to hold the data:
CREATE TYPE MyTbl AS TABLE
(ID INT, String NVARCHAR(100))
Next, create a stored procedure that accepts a parameter of the TABLE type you just created. As with any stored procedures, you can have as many parameters as you like, and any or all of them can be TVPs:
CREATE PROCEDURE dkSelectFromTVP(@TVParam MyTbl READONLY)
SET NOCOUNT ON
SELECT * FROM @TVParam
Now declare a variable of your TABLE type and insert some data:
DECLARE @TVP AS MyTbl
INSERT INTO @TVP(ID, String)
VALUES (1, ‘Fairbanks')
INSERT INTO @TVP(ID, String)
VALUES (2, ‘Juneau')
INSERT INTO @TVP(ID, String)
VALUES (3, ‘Anchorage')
INSERT INTO @TVP(ID, String)
VALUES (4, ‘Denali')
Finally, run the stored procedure and pass in the TVP:
EXEC dkSelectFromTVP @TVP
ADO.NET 3.5 includes support for TVPs and many of the other enhancements in SQL Server. You can pass a TVP to a stored procedure from ADO.NET code as a DataTable, DbDataReader, or IList<SqlData Record> from within SQLCLR code.
TVPs have a few restrictions; the worst of these is that the parameter is read-only. If you leave off the READONLY keyword in the stored-procedure definition, you'll get a syntax error. The SQL community is exerting some pressure on Microsoft to ease this restriction, so perhaps a future version will make TVPs even more flexible. But now, whenever you need to pass multiple pieces of data in a single parameter, you won't have to write messy parsing code to break it up.
A database scenario that developers often need to write code for involves taking a set of source data, updating a table based on that source data, and then applying changes as necessary. You might have to update some existing rows, insert new rows, or delete rows that are no longer needed. This can be tortuous T-SQL code to write, often requiring a dreaded cursor, which is a guaranteed drain on performance.
SQL Server 2008 simplifies this kind of scenario with the MERGE statement. MERGE lets you define the source and target data, as well as the operations performed on the target data based on whether the source data matches the target data (using match definitions you specify). Assume you have a table with data about a set of working dogs, with fields such as Name, BirthDate, and HarnessSize. You gather a set of changes from a user and need to apply the changes to the Dogs table. You might do this by creating a TABLE type named typDogUpdates to store the source update data that the code will apply to the dbo.Dogs table (see Listing 1). You can then use the MERGE statement to create a TABLE variable of that type and populate it with data. In this case, the primary key dogID will be the basis for matching source and target data, but you can use any field or combination of fields you want. If SQL Server finds a matching dogID, it updates the row; otherwise, it inserts the record.
The MERGE statement consists of five main clauses. The MERGE clause specifies the target for the changes -- in this case, the persistent table Dogs. The USING clause specifies the source data that contains the changes you want to apply to the target. The ON clause specifies the join condition, how the source data is matched to the target data. One or more WHEN clauses let you specify what to do when the database finds matches and what to do when source data doesn't match anything in the target. In Listing 1, the code executes an UPDATE statement when there is a match. When the source data doesn't match a row in the target, it executes an INSERT statement. You can also delete data and apply various conditions to each action. Finally, the OUTPUT clause can return a result set that contains records for each row changed in the target data.
Now take a look at the Dogs table before running the MERGE statement on it, the output results, and the new state of Dogs (see Figure 4).
The INSERT statements in Listing 1 also demonstrate another great new T-SQL feature: row constructors, which define multiple rows in a single statement. Note how the first INSERT statement in the listing creates multiple rows in a single T-SQL statement with the VALUES keyword. You're limited to creating 1,000 rows this way, but this beats writing out a bunch of full INSERT statements. By the way, Books Online doesn't use the term "row constructors" (or anything else for these), but that's becoming the common name for this feature.
There are many other great enhancements to T-SQL in SQL Server 2008. Grouping Sets are an extension to the GROUP BY clause and provide functionality similar to ROLLUP and CUBE to define multiple groupings in a single query. You now have far more options for working with unstructured data, including enhancements to the XML data type and FILESTREAM, which lets you store blobs in the file system while letting SQL Server manage the files for you.
Administrator Features Help Developers
SQL Server 2008 is a server application, and most of its features are focused on making it robust no matter what kind of loads applications throw at it. It's chock-full of administrative-support features that make it incredibly easy to install, manage, and secure the database. Usually, there's a difference between the features that administrators and developers are interested in or use during the course of a typical day. But there are a handful of administrative features in SQL Server 2008 that are useful to developers.
One such feature is partition switching. Developers and admins have long used table partitions to store subsets of data in various tables, usually for performance or data storage reasons. A common scenario is to store each calendar year's worth of transactions in a separate table and put each of the tables in a different file group. You can create a UNION query to extract and summarize the data when you need to access all the data, such as to create a report that spans all time. This works, but it requires some work to set up and often requires modifying code when adding a new partitioned table when a new year begins. (There are lots of other ways to do this.) You can use this kind of scheme to archive old data while keeping it available for analysis.
With partition switching, you can add a table as a partition to another table that is already partitioned, remove partitioning to create a single aggregated table, and switch a partition from one partitioned table to another. You could always set up your own scheme to implement these features, but in SQL Server 2008 you can perform these tasks using the ALTER TABLE and ALTER
PARTITION statements. The data itself is not changed or moved. The only thing that changes is the metadata for where it's stored. There are a slew of requirements to make partition-switching work, but they basically boil down to the fact that all of the involved tables must be identical in nearly every way.
Full-text searching has long been a feature that held a lot of promise but never seemed to get traction. One of the reasons is that it always seemed like an add-in that wasn't fully implemented. But with SQL Server 2008, full-text search is completely integrated into the database instead of being stored externally. Portions of full-text indexing and querying are now integrated into the query optimizer, so performance is much better, and there are more tools to extract useful data from the database. You might want to consider dumping all that gnarly T-SQL code you wrote over the last decade to give users flexible searches into their data and implement full-text searches instead.
SQL Server 2008 supports Windows PowerShell, an enhanced, extensible scripting shell interface for developers and administrators who love the command line. SQL Server includes two PowerShell snap-ins that expose the hierarchy of database and server objects as paths (similar to file-system paths). On the surface, this sounds a bit like an abomination, but it can simplify getting around the database object model. Another snap-in implements a set of PowerShell cmdlets for performing a variety of actions such as running sqlcmd scripts. PowerShell is a powerful tool, but if you love your mice and GUIs, you can opt not to use it.
SQL Server 2008 offers a lot to love for a developer. It isn't a revolutionary release, but it has enough great features to make it a slam-dunk upgrade as soon as your neighborhood system and database administrator lets you.
Early reports suggest that Microsoft succeeded in making good performance enhancements along with improved scalability, so it should handle all your existing and new apps at least as well as SQL Server 2005 does. Throw in the productivity enhancements that make writing T-SQL code easier and faster, and this is probably a no-brainer upgrade.
There's also a ton of new stuff to learn in SQL Server 2008, though, so be careful to get up to speed on what's new and different. Some tools have disappeared, such as the
Surface Area Configuration tool, but that just means that the same functionality is located somewhere in some other tool (see "Missing in Action: Features Dropped from SQL Server 2008"). Be ready to do some hunting for these relocated features.