Ecosystem

The Big Data ecosystem is evolving at a very rapid pace, and it's difficult to keep track of the changes. The ecosystem offers a lot of choices (open source vs proprietary, free vs commercial, batch vs streaming). For a newbie, it not only takes a good amount of time and effort to get familiar with a framework, but it's also perplexing to decide where to start.

Hadoop has received a lot of attention and many people start with it, but Hadoop is not the solution for everything. Take graph processing: Hama and Giraph (though still incubating) are better suited to it than Hadoop. This page attempts to give an idea of the ecosystem around Big Data.

Here are some of the useful articles/blogs to get started with the Hadoop ecosystem.
 
Sqoop

HBase

Giraph

Oozie

Flume

Pig
