
Hadoop Resources

"Cluster Computing and MapReduce Lecture" series on YouTube


http://code.google.com/edu/parallel/mapreduce-tutorial.html 

What is Hadoop?

http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html
http://gigaom.com/cloud/what-it-really-means-when-someone-says-hadoop
http://www.ibm.com/developerworks/data/library/techarticle/dm-1209hadoopbigdata/  

What is HDFS?
 
The paper covers most HDFS features, except HDFS Federation, which was introduced in the 0.23 release, and the HDFS High Availability feature, which at the time of writing was slated for the upcoming Hadoop 0.24 release.

HDFS explained as a comic for the young.

HDFS Federation was introduced in the 0.23 release to allow multiple NameNodes in a cluster.

About HDFS, from `The Architecture of Open Source Applications`.

MapReduce Algorithms


Hadoop HelloWorld

http://hadoop.apache.org/common/docs/r0.20.205.0/mapred_tutorial.html
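The tutorial linked above walks through WordCount, the usual "hello world" of MapReduce. As a rough illustration of the idea only, here is a minimal sketch in plain Python rather than the Hadoop Java API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are made up for this example, and lowercasing the words is an extra assumption; a real job would run the map and reduce steps in parallel across the cluster.

```python
from collections import defaultdict


def map_phase(line):
    """Emit a (word, 1) pair for every whitespace-separated token in a line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for Hadoop's shuffle-and-sort step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Sort by key so each reducer call sees keys in a stable order.
    return dict(sorted(grouped.items()))


def reduce_phase(key, values):
    """Sum the counts collected for a single word."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    sample = ["Hello World Bye World", "Hello Hadoop Goodbye Hadoop"]
    for word, count in word_count(sample).items():
        print(word, count)
```

Keeping the three phases as separate functions mirrors how a MapReduce job is split, and it also makes each step easy to test on its own before moving to a real cluster.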

Setting up a Hadoop Cluster (Ubuntu)

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-single-node-cluster/
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/  

Setting up Hadoop (Windows)

http://hortonworks.com/blog/hadoop-in-windows/
http://hortonworks.com/blog/installing-hadoop-on-windows/

http://v-lad.org/Tutorials/Hadoop/00%20-%20Intro.html
http://blogs.msdn.com/b/avkashchauhan/

Benchmarking and Stress Testing a Hadoop Cluster

http://www.michael-noll.com/blog/2011/04/09/benchmarking-and-stress-testing-an-hadoop-cluster-with-terasort-testdfsio-nnbench-mrbench/

http://web.ics.purdue.edu/~fahmad/benchmarks.htm 

Testing Hadoop Jobs

http://www.cloudera.com/blog/2009/07/advice-on-qa-testing-your-mapreduce-jobs/

Hadoop Tutorial

Books

Hadoop: The Definitive Guide (would recommend it; see my review)
Pro Hadoop (haven't had a chance to read it yet)

BSP vs MapReduce - http://arxiv.org/abs/1203.2081  

General (uncategorized)

Introduction to HDFS Erasure Coding in Apache Hadoop

Thanks to blog contributors from Cloudera. Erasure coding, a new feature in HDFS, can reduce storage overhead by approximately 50% compar...
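The storage figure in the excerpt above can be sanity-checked with simple arithmetic. The sketch below assumes the comparison is between default 3x replication and a Reed-Solomon layout with 6 data blocks and 3 parity blocks; neither is stated in the truncated excerpt, so treat both as assumptions. The helper names are hypothetical.

```python
def storage_overhead(data_units, redundant_units):
    """Extra storage needed, as a fraction of the original data size."""
    return redundant_units / data_units


def raw_footprint(data_units, redundant_units):
    """Total raw storage used per unit of user data."""
    return 1 + storage_overhead(data_units, redundant_units)


# Assumption: 3x replication keeps 1 original block plus 2 extra copies.
replication = raw_footprint(data_units=1, redundant_units=2)

# Assumption: Reed-Solomon with 6 data blocks and 3 parity blocks.
erasure_coded = raw_footprint(data_units=6, redundant_units=3)

# Fraction of raw storage saved by moving from replication to erasure coding.
savings = 1 - erasure_coded / replication

if __name__ == "__main__":
    print(f"replication footprint:   {replication:.1f}x")
    print(f"erasure-coded footprint: {erasure_coded:.1f}x")
    print(f"raw storage saved:       {savings:.0%}")
```

Under these assumptions the raw footprint drops from 3.0x to 1.5x of the user data, which is consistent with the roughly 50% reduction quoted in the excerpt.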