Data Driver

Red Hat Goes All In On Big Data (Whatever That Is)

I tuned in to a Webcast earlier this week where Red Hat announced it was contributing its Hadoop plug-in to the open source Apache Hadoop community and totally embracing Big Data with an "open hybrid cloud" strategy. More on that later.

What I found really interesting was the response to an audience member who asked, "How do you define Big Data?"

Hmmm. Good question. It's one of the most over-hyped terms in the tech world today, but exactly what is it? Red Hat executive Ranga Rangachari provided the following:

So ... what we think of ... analysts have different ways to talk about this. You've heard some analysts talk about the four Vs, which is the volume, the velocity and a few other attributes to it. And, yes, that is one way to look at it, but I think our view of Big Data is, fundamentally I think, the underlying type of data, either semi-structured or unstructured. That's one way, at least, from a technology standpoint, which contrasts very much from your typical structured databases that people are used to over the last 20 years or so.

Huh?

Obviously, it's not that easy to define Big Data.

John K. Waters addressed the question a year ago:

While there's lots of talk about big data these days (a lot of talk), there currently is no good, authoritative definition of big data, according to Microsoft Regional Director and Visual Studio Magazine columnist Andrew Brust.

"It's still working itself out," Brust says. "Like any product in a good hype cycle, the malleability of the term is being used by people to suit their agendas. And that's okay; there's a definition evolving."

Wikipedia defines it as "collection of data sets so large and complex that it becomes difficult to process using on-hand database management tools or traditional data processing applications."

In other words, no one knows.

Anyway, Red Hat will open source its Hadoop plug-in and jump on the Big Data bandwagon with its vision of an open hybrid cloud application platform and infrastructure. Rangachari said it was designed to give companies the ability to create Big Data workloads on a public cloud and move them back and forth between their own private clouds, "without having to reprogram those applications." Red Hat said in a news release that many companies use public clouds such as Amazon Web Services for developing software, proving concepts and pre-production phases of projects that use Big Data. "Workloads are then moved to their private clouds to scale up the analytics with the larger data set," the company said.

The Red Hat Hadoop plug-in is part of Red Hat Storage, which runs on Linux and is based on the GlusterFS distributed file system. It's provided as an alternative to the Hadoop Distributed File System, which is known for some technical limitations that Apache and other organizations have also addressed.

Rangachari said the path to the open hybrid cloud Big Data application platform will eventually incorporate an Apache Hive connector (now in preview), NoSQL/MongoDB Java interoperability and RESTful OData Web protocol access, in addition to its existing JBoss middleware.

He emphasized that the new cloud strategy will be woven throughout every Red Hat project, noting that "Big Data could be one of the killer apps for the open hybrid cloud."

When asked why Red Hat was contributing its Hadoop plug-in to Apache, Rangachari said the Apache Hadoop community was the "center of gravity" in the Hadoop world and that the move will provide developers with easier access to the plug-in from the same ecosystem. He also said the company expects that, rather than stopping innovation of the technology, the move to open source will actually contribute to more innovation.

So what exactly is Big Data? Please explain here in a comment or via e-mail. We'll all appreciate it.

Posted by David Ramel on 02/22/2013 at 1:15 PM

