
HDInsight Gets Hadoop Upgrade

Microsoft today announced that its cloud-based Hadoop service, HDInsight, now supports Hadoop 2.4, the latest version of the Big Data software.

Unveiled in October 2012, HDInsight exemplifies Microsoft's embrace of the Big Data movement and -- more generally -- its increasing involvement in open source technologies of all kinds. Microsoft partners with Hadoop heavyweight Hortonworks Inc. to provide the 100 percent Hadoop-compatible service on its Microsoft Azure platform, based on the Hortonworks Hadoop distribution.

Apache Hadoop 2.4, the latest update of the open source framework that's synonymous with Big Data, was released in April with enhancements to the often-criticized Hadoop Distributed File System (HDFS). The latest release also includes improvements to YARN -- sometimes referred to as "yet another resource negotiator" -- which is also described as the successor to the even-more-criticized MapReduce technology, a key component of the original Hadoop ecosystem. Various industry efforts aim to overcome the constraints of the batch-oriented MapReduce with more modern analytics features such as interactive queries on streaming data. YARN offers more interaction patterns with HDFS data and provides a more generalized processing platform beyond the MapReduce technology.

"This update includes interactive querying with Hive using advancements based on SQL Server technology, which we are also contributing back to the Hadoop ecosystem through project Stinger," Microsoft said in an announcement on the SQL Server Blog. "With this update to HDInsight, customers can use the speed and scale of the cloud to gain a 100x response time improvement."

Hive is a Hadoop-based data warehousing project, also under the auspices of the Apache Software Foundation, that allows data queries with its own SQL-like language. Stinger is a community project shepherded by Hortonworks to improve upon Hive with faster performance, increased scale and broader SQL support.

As noted by Oliver Chiu on the Microsoft Azure Blog, HDInsight is also getting an easy-to-use Web UI, letting developers graphically query Hive data.

The SQL Server team used the HDInsight announcement to highlight Microsoft's growing interaction with the open source community.

[Figure: HDInsight Clusters and Azure Blob Storage (source: Microsoft)]

"We have fully embraced the Hadoop ecosystem and have prioritized contributing back to the community and Apache Hadoop-related projects, for example, Tez, Stinger and Hive," the post said. "All told, we've contributed 30,000 lines of code and put in 10,000-plus engineering hours to support these projects, including the porting of Hadoop to Windows. We've done this in partnership with Hortonworks, a relationship that ensures our Hadoop solutions are based on compatible implementations of Hadoop. One of the results of that partnership is the engineering work that has led to the Hortonworks Data Platform for Windows and Azure HDInsight."

The news came during the ongoing Hadoop Summit, at which T. K. Rengarajan, Microsoft corporate vice president of Data Platform, delivered the keynote address today.

About the Author

David Ramel is an editor and writer at Converge 360.

