Big Data and SQL Server: Disruption or Harmony?
If Microsoft responds to potential threats with thoughtfulness and a zeal to add value, SQL and Big Data could be big business for Redmond.
The SQL Server relational engine matured a long time ago. There have been advances, of course, in performance, fault tolerance and high availability, not to mention encryption, compression and file-system integration. And, yes, there's been support for XML, geospatial data and even a service broker. But these have been improvements on the margins. All these features are new flavors of icing; the cake has stayed the same.
And yet, over the 20-year lifetime of SQL Server, Microsoft has continued to add value to the product, even as the core features have gone into maintenance mode. The big story here has been business intelligence: BI capabilities started in SQL Server 7 with OLAP Services, and have been expanded in meaningful ways with every subsequent release. It's been a great strategy and it still is. But will it keep working?
Core capabilities can't stay static forever; eventually disruption comes along. Today that disruption is here, coming from Big Data, Hadoop and its MapReduce distributed-computing approach. As strong as SQL Server BI capabilities are (and they're getting even stronger with SQL Server 2012), if Microsoft can't embrace Big Data technology, SQL Server could find itself in a position of desperation, going from underdog to contender to retiree. What's a poor enterprise software company to do?
Build Bridges, Don't Burn Them
Microsoft could ignore Hadoop, but that would be foolish. It could try to build a competitor, which it almost did with the Microsoft Research Dryad project, but I fear not too many people would have come to that party. Microsoft could just adopt Hadoop, plain vanilla, but that would most likely be a race to the bottom, and it wouldn't even win. Really, Microsoft must mix Hadoop into its bag of tricks and do what it has always done best: take a raw technology and make it approachable to the enterprise. I can't be sure yet, but I think that's what Microsoft has done, and it has enhanced the value of Windows Azure in the process.
With code name "Project Isotope," Microsoft has taken the step of implementing Hadoop on Windows. It's a no-slouch effort, too: Microsoft's distribution of Hadoop is being developed in concert with Hortonworks, a startup company founded and staffed by many former Hadoop team members at Yahoo! (where the open source project began). But what Microsoft has also done is integrate Hadoop into its BI stack, and that may be one of the smartest moves it's made in quite some time.
Microsoft has created an Excel add-in for Hive, which provides a SQL-like abstraction over Hadoop and MapReduce. The add-in is based on an ODBC driver, which in turn is compatible with PowerPivot, so business users can do meaningful analysis on Big Data, on their own terms. And because the same engine that drives PowerPivot has been implemented inside Analysis Services in SQL Server 2012, that product has access to Hadoop now, too. With that, Microsoft has joined the Big Data and Enterprise BI worlds. It has also tied together SQL Server and Hadoop.
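To make the "SQL-like abstraction" point concrete, here is a minimal sketch of the map, shuffle and reduce steps that a single grouped count would otherwise require. It is plain in-memory Python, not Hadoop or Hive code; the function names, the sales records and the region field are all hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def map_phase(records):
    """Emit one (key, 1) pair per record, keyed by region."""
    for record in records:
        yield record["region"], 1


def shuffle_phase(pairs):
    """Group emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_region(records):
    """Chain the three phases to count records per region."""
    return reduce_phase(shuffle_phase(map_phase(records)))


# Hypothetical sample data, standing in for a much larger data set.
sales = [
    {"region": "East", "amount": 120},
    {"region": "West", "amount": 75},
    {"region": "East", "amount": 40},
]

# A SQL-like layer lets an analyst express the same intent roughly as:
#   SELECT region, COUNT(*) FROM sales GROUP BY region
result = count_by_region(sales)
print(result)
```

The point of the comparison is the gap in effort: the hand-written version spells out three phases, while the query states only the result wanted, which is what makes the approach workable for business users in Excel.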
Simplify and Succeed
Andrew Brust is Research Director for Big Data and Analytics at Gigaom Research. Andrew is co-author of "Programming Microsoft SQL Server 2012" (Microsoft Press); an advisor to NYTECH, the New York Technology Council; co-moderator of Big On Data - New York's Data Intelligence Meetup; a Microsoft Regional Director and MVP; and conference co-chair of Visual Studio Live!