Data Driver

SQL Encroaches on Big Data Turf

Remember when SQL developers felt threatened by Big Data? Relational database management systems were old-school relics that couldn't cope with vast amounts of unstructured, disparate data. NoSQL was the future. You needed to get on board with Hadoop and MapReduce, running on Linux.

Well, not anymore.

Maybe not ever, really. The installed base of SQL developers and systems is simply too large for the two camps, Big Data and SQL, to have remained apart. Even four or five years ago the convergence was underway with Hive, a data warehouse system for Hadoop that uses "a SQL-like language called HiveQL."

That convergence seems to be accelerating rapidly. Microsoft has been helping out, of course, with PolyBase in its SQL Server 2012 Parallel Data Warehouse, which enables SQL queries of Big Data, and with initiatives such as HDInsight and the Hortonworks Data Platform, which bring Big Data into the Windows ecosystem.

But Redmond has plenty of company. Just this week I had the opportunity to interview Web coding pioneer Lloyd Tabb about the subject when his new company, Looker Data Sciences Inc., announced a query-based business intelligence (BI) platform called Looker. "SQL and relational querying is the best way to ask questions of large related data sets," Tabb told me.

He should know what he's talking about. He was a database and languages architect at Borland in the earlier days of RDBMS and went on to build LiveWire, the first application server for the World Wide Web. He was later a principal engineer at Netscape where he was architect of Netscape Navigator Gold (later named Composer), the first WYSIWYG HTML editor, and the engineering lead for Netscape Communicator. He helped found Mozilla.org and later became a pioneer in crowdsourcing, just to name a few of his accomplishments.

Looker, according to the company, "uses a new modeling language, LookML, which enhances SQL for analytics so end-users can perform powerful analytics without needing to know how a query is written."

I asked Tabb about the use of SQL instead of NoSQL, Hadoop or other Big Data technologies associated with BI analytics, and he gave me a little history lesson.

"Back in the day conventional wisdom was that if you were going to create an application for a PC you had to write it in Assembly language," Tabb said. "Higher-level languages generated code that was too big and too slow. Later, conventional wisdom was that you couldn't build a 'real application' in an agile language--it was too big and too slow.

"Hadoop was designed because at the time there were no SQL engines that could deal with data sets that large. Developers regressed to hand coding queries in MapReduce. Both SQL and C are still in use today because they are the best abstractions for the kinds of problems they solve."
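To make the abstraction gap concrete, here is a minimal sketch of my own (not Tabb's code, and not how Hadoop or Hive actually execute a job). It computes the same per-region totals twice: once with hand-written map, shuffle and reduce steps, and once with a single SQL statement. The sample data, the function names and the use of an in-memory SQLite database as a stand-in for a SQL engine are all assumptions made for illustration.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, sale_amount) rows.
SALES = [("east", 100), ("west", 250), ("east", 50), ("west", 25), ("north", 75)]


def map_phase(rows):
    """Emit one (key, value) pair per input row."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Group emitted values by key, as a MapReduce framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def totals_by_mapreduce(rows):
    """Hand-coded aggregation: the developer spells out every step."""
    return reduce_phase(shuffle_phase(map_phase(rows)))


def totals_by_sql(rows):
    """Declarative aggregation: the engine decides how to group and sum."""
    # SQLite is only a stand-in here; it is not a Big Data engine.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
        cursor = conn.execute(
            "SELECT region, SUM(amount) FROM sales GROUP BY region"
        )
        return dict(cursor.fetchall())
    finally:
        conn.close()


if __name__ == "__main__":
    print(totals_by_mapreduce(SALES))
    print(totals_by_sql(SALES))
```

Both functions return the same totals. The difference is that the first one has to describe how to group and sum, while the second only states what result is wanted, which is the kind of abstraction Tabb is talking about.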

Looking around, I see lots of other evidence pointing to the Borg-like assimilation of Big Data by SQL. A few weeks ago GigaOM explored the subject with an article titled "SQL is what's next for Hadoop: Here's who's doing it," and just yesterday a Pluralsight course on the topic was announced, described as "An investigation into the convergence of relational SQL database technologies from several vendors and Big Data technologies like Apache Hadoop."

And plenty of similar efforts are underway. So rest easy, SQL data developers: your future is still bright.

What do you think about the convergence of Big Data and SQL? Share your thoughts by commenting here or by e-mail.

Posted by David Ramel on 03/08/2013

