HDInsight Gets Hadoop Upgrade

Microsoft today announced that its cloud-based Hadoop service, HDInsight, now supports Hadoop 2.4, the latest version of the Big Data software.

Unveiled in October 2012, HDInsight exemplifies Microsoft's embrace of the Big Data movement and -- more generally -- its increasing involvement in open source technologies of all kinds. Microsoft partners with Hadoop heavyweight Hortonworks Inc. to provide the 100 percent Hadoop-compatible service on its Microsoft Azure platform, based on the Hortonworks Hadoop distribution.

Apache Hadoop 2.4, the latest update of the open source framework that's synonymous with Big Data, was released in April with enhancements to the often-criticized Hadoop Distributed File System (HDFS). The latest release also includes improvements to YARN -- sometimes referred to as "yet another resource negotiator" -- which is also described as the successor to the even-more-criticized MapReduce technology, a key component of the original Hadoop ecosystem. Various industry efforts aim to improve upon the constraints of the batch-oriented MapReduce with more modern analytics features such as interactive queries on streaming data. YARN offers more interaction patterns with HDFS data and provides a more generalized processing platform beyond the MapReduce technology.

"This update includes interactive querying with Hive using advancements based on SQL Server technology, which we are also contributing back to the Hadoop ecosystem through project Stinger," Microsoft said in an announcement on the SQL Server Blog. "With this update to HDInsight, customers can use the speed and scale of the cloud to gain a 100x response time improvement."

Hive is a Hadoop-based data warehousing project, also under the auspices of the Apache Software Foundation, that allows data queries with its own SQL-like language. Stinger is a community project shepherded by Hortonworks to improve upon Hive with faster performance, increased scale and broader SQL support.

As noted by Oliver Chiu on the Microsoft Azure Blog, HDInsight is also getting an easy-to-use Web UI, letting developers graphically query Hive data.

The SQL Server team used the HDInsight announcement to highlight Microsoft's growing interaction with the open source community.

[Figure: HDInsight Clusters and Azure Blob Storage (source: Microsoft)]

"We have fully embraced the Hadoop ecosystem and have prioritized contributing back to the community and Apache Hadoop-related projects, for example, Tez, Stinger and Hive," the post said. "All told, we've contributed 30,000 lines of code and put in 10,000-plus engineering hours to support these projects, including the porting of Hadoop to Windows. We've done this in partnership with Hortonworks, a relationship that ensures our Hadoop solutions are based on compatible implementations of Hadoop. One of the results of that partnership is the engineering work that has led to the Hortonworks Data Platform for Windows and Azure HDInsight."

The news came during the ongoing Hadoop Summit, at which T. K. Rengarajan, Microsoft corporate vice president of Data Platform, delivered the keynote address today.

About the Author

David Ramel is an editor and writer at Converge 360.
