News

Microsoft Adds HBase Preview to HDInsight Big Data Cloud Service

The cloud-based Big Data service, Microsoft Azure HDInsight, now supports Apache HBase clusters.

Microsoft announced a technology preview of the Hadoop component just days after announcing HDInsight had been upgraded to the latest Hadoop release, version 2.4.

HBase is a non-relational distributed database technology running on top of the Hadoop Distributed File System (HDFS). The open source NoSQL (for "not only SQL") database technology is similar to the Bigtable project from Google Research.

"Use Apache HBase when you need random, real-time read/write access to your Big Data," the project's site says. "This project's goal is the hosting of very large tables -- billions of rows x millions of columns -- atop clusters of commodity hardware. Apache HBase is an open source, distributed, versioned, non-relational database modeled after Google's Bigtable: A Distributed Storage System for Structured Data by Chang et al. Just as Bigtable leverages the distributed data storage provided by the Google File System, Apache HBase provides Bigtable-like capabilities on top of Hadoop and HDFS."

The columnar, low-latency database can handle online transaction processing (OLTP) operations such as updates, inserts and deletes of data in Hadoop, Microsoft said in a Friday announcement. HBase organizes data into tables made up of rows and column families. Developers must define the column families ahead of time, Microsoft said, but new columns can be added to a family at any time, which gives HBase the schema flexibility to adapt quickly to changing requirements.
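To make that data model concrete, here is a minimal sketch in Python. It is a toy in-memory stand-in, not the HBase client API: the `ToyTable` class, its methods, and the sensor names are all hypothetical and exist only to illustrate the idea of fixed column families with columns added on the fly.

```python
from collections import defaultdict


class ToyTable:
    """Toy in-memory model of an HBase-style table (hypothetical, not the HBase API)."""

    def __init__(self, column_families):
        # Column families are fixed when the table is created.
        self.families = frozenset(column_families)
        # Maps row key -> {"family:qualifier": value}.
        self.rows = defaultdict(dict)

    def put(self, row_key, column, value):
        """Insert or update one cell; the column is written as 'family:qualifier'."""
        family, _, qualifier = column.partition(":")
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        if not qualifier:
            raise ValueError("column must look like 'family:qualifier'")
        # Any qualifier is accepted, so new columns need no schema change.
        self.rows[row_key][column] = value

    def get(self, row_key):
        """Return a copy of all cells stored for a row (empty if the row is absent)."""
        return dict(self.rows.get(row_key, {}))

    def delete(self, row_key, column):
        """Remove one cell if it exists."""
        self.rows.get(row_key, {}).pop(column, None)


# Families are declared up front; individual columns are not.
table = ToyTable(["reading", "meta"])
table.put("sensor-001", "reading:temp", 21.5)       # insert
table.put("sensor-001", "reading:temp", 22.0)       # update
table.put("sensor-001", "reading:humidity", 40)     # brand-new column, same family
table.delete("sensor-001", "reading:temp")          # delete
```

In this sketch, writing to a family that was never declared (for example `location:lat`) raises an error, while a new column inside an existing family is accepted without any schema change. A real HBase table also keeps multiple timestamped versions of each cell, which this toy omits.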

"This preview announcement will enable customers to run HBase as a managed cluster in the cloud (as an integrated feature of Azure HDInsight)," Microsoft said. "The HBase clusters are configured to store data directly in Azure Blob storage."

Use cases this enables, Microsoft said, include building interactive Web sites based on large Azure Blob datasets. Another example: "Building services that store sensor and telemetry data from millions of endpoints in Azure Blobs (which can then be analyzed using HDInsight (Hadoop))."

About the Author

David Ramel is an editor and writer at Converge 360.
