Redmond Review

Big Data and SQL Server: Disruption or Harmony?

If Microsoft responds to potential threats with thoughtfulness and a zeal to add value, SQL and Big Data could be big business for Redmond.

The SQL Server relational engine matured a long time ago. There have been advances, of course, in performance, fault tolerance and high availability, not to mention encryption, compression and file-system integration. And, yes, there's been support for XML, geospatial data and even a service broker. But these have been improvements on the margins. All these features are new flavors of icing; the cake has stayed the same.

And yet, over the 20-year lifetime of SQL Server, Microsoft has continued to add value to the product, even as the core features have gone into maintenance mode. The big story here has been business intelligence: BI capabilities started in SQL Server 7 with OLAP Services, and have been expanded in meaningful ways with every subsequent release. It's been a great strategy and it still is. But will it keep working?

Core capabilities can't stay static forever; eventually disruption comes along. Today that disruption is here, coming from Big Data, Hadoop and its MapReduce distributed-computing approach. As strong as SQL Server BI capabilities are (and they're getting even stronger with SQL Server 2012), if Microsoft can't embrace Big Data technology, SQL Server could find itself in a position of desperation, going from underdog to contender, to retiree. What's a poor enterprise software company to do?

Build Bridges, Don't Burn Them
Microsoft could ignore Hadoop, but that would be foolish. It could try to build a competitor, which it almost did with the Microsoft Research Dryad project, but I fear not too many people would have come to that party. Microsoft could just adopt Hadoop, plain vanilla, but that would most likely be a race to the bottom, and it wouldn't even win. Really, Microsoft must mix Hadoop into its bag of tricks and do what it has always done best: take a raw technology and make it approachable to the enterprise. I can't be sure yet, but I think that's what Microsoft has done, and it has enhanced the value of Windows Azure in the process.

With code name "Project Isotope," Microsoft has taken the step of implementing Hadoop on Windows. It's a no-slouch effort, too: Microsoft's distribution of Hadoop is being developed in concert with Hortonworks, a startup company founded and staffed by many former Hadoop team members at Yahoo! (where the open source project began). But what Microsoft has also done is integrate Hadoop into its BI stack, and that may be one of the smartest moves it's made in quite some time.

Microsoft has created an Excel add-in for Hive, which provides a SQL-like abstraction over Hadoop and MapReduce. The add-in is based on an ODBC driver, which in turn is compatible with PowerPivot, so business users can do meaningful analysis on Big Data, on their own terms. And because the same engine that drives PowerPivot has been implemented inside Analysis Services in SQL Server 2012, that product has access to Hadoop now, too. With that, Microsoft has joined the Big Data and Enterprise BI worlds. It has also tied together SQL Server and Hadoop.
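To make the "SQL-like abstraction" point concrete, here is a minimal sketch of the idea, not Hive itself. It computes the same per-region totals twice: once as hand-written map, shuffle and reduce steps, and once as a single declarative query. Everything in it is an assumption for illustration: the sample data, the function names, and the use of Python's built-in SQLite as a stand-in for Hive (which actually compiles queries into distributed jobs on a Hadoop cluster).

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, sale_amount) rows.
SALES = [("east", 10), ("west", 5), ("east", 7), ("west", 3), ("north", 4)]


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for region, amount in rows:
        yield region, amount


def shuffle(pairs):
    """Group values by key, the step a MapReduce framework performs between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def totals_via_mapreduce(rows):
    """Hand-wired pipeline: map, then shuffle, then reduce."""
    return reduce_phase(shuffle(map_phase(rows)))


def totals_via_sql(rows):
    """The same aggregation as one declarative query (SQLite stands in for Hive here)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    result = dict(cursor)  # each row is a (region, total) pair
    conn.close()
    return result


if __name__ == "__main__":
    print(totals_via_mapreduce(SALES))
    print(totals_via_sql(SALES))
```

Both functions return the same totals. The difference is who writes the plumbing: in the first case the programmer does, and in the second the query engine does, which is the kind of approachability a business user working in Excel or PowerPivot needs.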

Simplify and Succeed
Hadoop runs on Windows Server and on Windows Azure, with an installer that makes setup really easy. In addition to those configurations, Microsoft allows Windows Azure users to provision an entire Hadoop cluster from a Web portal, without any discrete installation steps at all. Once the cluster's up, customers can connect to its head node via Remote Desktop. The full Java-based command-line personality of Hadoop is there if you want it, but there are also Hive and JavaScript consoles in the browser. And then it only takes a minute or two to build out a connection from Microsoft BI tools or Excel, putting Hadoop to work for users fitting a number of profiles. It's classic Microsoft, mixed with equal parts open source and Java.

I've said before that Microsoft does some of its best work when it embraces standards from outside the company. That's what happened with jQuery and ASP.NET, and it may well be what happens with HTML5 and JavaScript in Windows 8. In the case of implementing Hadoop, Microsoft makes Windows Azure more valuable and more agnostic. It brings continued relevance to Windows Server and SQL Server. And it widens the reach and utility of Hadoop. By responding to a potential threat with thoughtfulness -- and a zeal to add value -- Microsoft could make Big Data big business for Redmond.

About the Author

Andrew Brust is Research Director for Big Data and Analytics at Gigaom Research. Andrew is co-author of "Programming Microsoft SQL Server 2012" (Microsoft Press); an advisor to NYTECH, the New York Technology Council; co-moderator of Big On Data - New York's Data Intelligence Meetup; a Microsoft Regional Director and MVP; and conference co-chair of Visual Studio Live!
