Data Driver

Blog archive

MapR Bolsters Hadoop Distro with Apache Drill for SQL-Based Big Data Analytics

After stewarding the open source project from incubation to its new 1.0 release, MapR Technologies Inc. added Apache Drill for SQL-based Big Data analytics to its Apache Hadoop distribution.

The company -- one of the "big three" Hadoop vendors along with Hortonworks Inc. and Coudera Inc. -- this week announced the general availability of the open source Apache Drill 1.0 and its inclusion in the MapR Hadoop distribution.

Drill is a low-latency query engine based on ANSI SQL standards that facilitates self-service, interactive analytics at Big Data scales, including up to petabyte scale (1 PB is equal to 1 million GB). One of its key features is that it doesn't depend on traditional database schemas that describe how data is categorized. Discovering such schemas on the fly makes for quicker analytics, the company said.

MapR engineers including Jacques Nadeau and Steven Phillips have taken the lead on the open source project, which was incubated at the Apache Sofwtare Foundation (ASF) in September 2012 with the goal of wedding the familiar workings of relational databases with the huge new scalability demanded by the Big Data era and the agility of Hadoop systems and their heavy use of NoSQL databases.

"The project has been on the fast track in the last nine months since the developer preview in August 2014, delivering seven significant iterative releases, each adding exciting new features and most importantly, improving on the stability, scale and performance required for broader enterprise deployments," MapR exec Neeraja Rentachintala said in a blog post Tuesday.

The Apache Drill Project Timeline
[Click on image for larger view.] The Apache Drill Project Timeline (source: MapR Technologies Inc.)

In addition to SQL queries, the tool can work with varying types of data, including files, NoSQL databases and more complex types of data such as JSON and Parquet.

"Drill enables interactivity with data from both legacy transactional systems and new data sources, such as Internet of Things (IOT) sensors, Web click-streams and other semi-structured data, along with support for popular business intelligence (BI) and data visualization tools," MapR said in a news release. "Drill provides reliability and performance at Hadoop scale with integrated granular security and governance capabilities required for multi-tenant data lakes or enterprise data hubs."

Upcoming features planned for future editions of Drill include more functionality centered on JSON, SQL, complex data functions and new file formats, Rentachintala said.

Posted by David Ramel on 05/22/2015 at 5:44 AM


comments powered by Disqus

Featured

  • .NET Core Ranks High Among Frameworks in New Dev Survey

    .NET Core placed high in a web-dominated ranking of development frameworks published by CodinGame, which provides a tech hiring platform.

  • Here's a One-Stop Shop for .NET 5 Improvements

    Culled from reams of Microsoft documentation, here's a high-level summary of what's new for performance, networking, diagnostics and more, along with links to the nitty-gritty details for those wanting to dig in more.

  • Azure SQL Database Ranked Among Top 3 Databases of 2020

    Microsoft touted the inclusion of Azure SQL Database among the top three databases of 2020 in a popularity ranking by DB-Engines, which collects and manages information about database management systems, updating its lists monthly.

  • Time Tracker Says VS Code Is No. 1 Editor for Devs, Some Working 15+ Hours Per Day

    WakaTime, which does time tracking for programmers, released data for 2020 showing that Visual Studio Code is by far the top editor/IDE used by its coders, some of whom are hacking away for more than 15 hours per day.

Upcoming Events