
Two Big Data Companies Form Partnership

Cloudera and MongoDB also announce the MongoDB Connector for Hadoop.

Developers are increasingly focusing on Big Data development, and Hadoop is one of the most popular technologies for the .NET crowd.

That tent just got a little bigger, as Cloudera Inc. and MongoDB Inc. announced a strategic partnership to integrate their Big Data technologies.

Cloudera is one of the main players in the Big Data space, with a popular Hadoop distribution and accompanying bundled packages for enterprises. MongoDB is billed as the most widely used NoSQL database and is often used in Big Data analytics.

Details of the partnership were scant, though the companies said their lofty goal was to "transform how organizations approach Big Data."

One immediate aspect of the partnership was the certification of the MongoDB Connector for Hadoop to be used with Cloudera Enterprise 5, the company's latest package that bundles the Cloudera Distribution Including Apache Hadoop (CDH) with a subscription to Cloudera Manager, a Hadoop administration tool, and technical support.

"Certifying MongoDB's Connector for Hadoop on Cloudera Enterprise 5 was our first step to help our joint customers," Kelly Stirman, director of product at MongoDB, said in an interview. "Resources are being invested even further to match our companies' joint vision, including building a tighter integration of the next version of the MongoDB Connector for Hadoop."

The companies said no further news on the partnership will be forthcoming until the June 24 kick-off of the MongoDB World conference in New York.

The MongoDB Connector for Hadoop plug-in lets developers more easily use MongoDB as an input source or output destination for processing Big Data with the open source Hadoop framework. The connector obviates the need to use custom code or cumbersome import/export scripts to move data around. It takes advantage of multi-core parallelism, has full integration with the Hadoop and Java Virtual Machine (JVM) ecosystems, is compatible with Amazon Elastic MapReduce, and can use local filesystems, Hadoop Distributed File System (HDFS), or S3 to read and write backup files, MongoDB said in a Webinar last year.

"The connector presents MongoDB as a Hadoop-compatible file system allowing a MapReduce job to read from MongoDB directly without first copying it to HDFS, thereby eliminating the need to move terabytes of data across the network," MongoDB said last year when the connector was updated. "MapReduce jobs can pass queries as filters, so avoiding the need to scan entire collections, and can also take advantage of MongoDB's rich indexing capabilities including geospatial, text-search, array, compound and sparse indexes."
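To make the "queries as filters" idea concrete, here is a minimal sketch of how a job's connector settings might be assembled before submission. It is an illustration, not MongoDB's or Cloudera's documented procedure: the property names are recalled from the open source mongo-hadoop project and should be checked against the connector version in use, while the helper functions, URIs and collection names are hypothetical.

```python
import json
import shlex

# Property names below are recalled from the open source mongo-hadoop project;
# treat them as assumptions and verify against the connector version you deploy.
INPUT_FORMAT = "com.mongodb.hadoop.MongoInputFormat"
OUTPUT_FORMAT = "com.mongodb.hadoop.MongoOutputFormat"


def build_connector_properties(input_uri, output_uri, query=None):
    """Return Hadoop job properties that point a job at MongoDB.

    `query` is an optional MongoDB query document (a dict). When present,
    it is serialized to JSON so the job reads only matching documents
    instead of scanning the whole collection.
    """
    props = {
        "mongo.job.input.format": INPUT_FORMAT,
        "mongo.job.output.format": OUTPUT_FORMAT,
        "mongo.input.uri": input_uri,
        "mongo.output.uri": output_uri,
    }
    if query is not None:
        props["mongo.input.query"] = json.dumps(query, sort_keys=True)
    return props


def to_hadoop_args(props):
    """Flatten properties into `-D key=value` pairs in a stable order."""
    args = []
    for key in sorted(props):
        args.extend(["-D", f"{key}={props[key]}"])
    return args


if __name__ == "__main__":
    # Hypothetical database and collection names, for illustration only.
    props = build_connector_properties(
        "mongodb://localhost:27017/demo.events",
        "mongodb://localhost:27017/demo.summary",
        query={"status": "active"},
    )
    print(" ".join(shlex.quote(arg) for arg in to_hadoop_args(props)))
```

The resulting `-D` pairs would be passed to a Hadoop job that has the connector on its classpath. Leaving out `query` yields a job that reads the entire input collection.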

Beyond the connector, the companies hinted at more to come regarding the collaboration. "More than a simple technology integration, the partnership brings the two companies' leadership to bear in enabling enterprises to fundamentally rethink how data can be shared and put to work across the enterprise," they said in a statement. "The combination of Cloudera Enterprise and MongoDB will enable customers to easily develop, operate and manage Big Data infrastructure that powers modern applications."

The announcement was the latest in a string regarding MongoDB, which earlier this month released a major upgrade of its database. Last week, Microsoft announced new high-memory MongoDB instances were available in its Microsoft Azure cloud platform.

About the Author

David Ramel is an editor and writer for Converge360.
