WELCOME TO BIGDATATRENDZ

Friday, 11 December 2015

New in Cloudera Labs: Apache HTrace (incubating)

Via a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for doing performance tracing of your HDFS-based applications.
HTrace is a new Apache Incubator project that provides a bird’s-eye view of the performance of a distributed system. While log files can provide a peek into important events on a specific node, and metrics can answer questions about aggregate performance, HTrace can follow specific requests all the way through the cluster.
HTrace breaks down requests into sets of trace spans. Each trace span represents a length of time. A single request, such as an HDFS copyToLocal command, will generate many different trace spans. Each trace span has a list of parents that lets you figure out why it was created and which larger operation it belongs to. Trace spans also have a “TracerId” that identifies which service and process they came from.
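To make the span model concrete, here is a minimal sketch in Python. It is not the HTrace API: the `Span` class, its field names, the `ancestry` helper, and the sample descriptions and tracer IDs are all hypothetical, chosen only to mirror the description above (a time interval, a list of parents, and a tracer ID).

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical stand-in for a trace span; not the real HTrace class."""
    span_id: str
    description: str
    begin_ms: int
    end_ms: int
    tracer_id: str                      # which service/process emitted the span
    parents: List[str] = field(default_factory=list)

    @property
    def duration_ms(self) -> int:
        return self.end_ms - self.begin_ms


def ancestry(span_id: str, spans: Dict[str, Span]) -> List[str]:
    """Follow first-parent links from a span up to its root span."""
    chain: List[str] = []
    seen = set()
    current = spans.get(span_id)
    while current is not None and current.span_id not in seen:
        seen.add(current.span_id)       # guard against malformed cycles
        chain.append(current.description)
        current = spans.get(current.parents[0]) if current.parents else None
    return chain


# Made-up spans for a single copyToLocal request.
spans = {s.span_id: s for s in [
    Span("a", "copyToLocal", 0, 120, "FsShell/client-host"),
    Span("b", "open", 5, 20, "FsShell/client-host", ["a"]),
    Span("c", "getBlockLocations", 8, 15, "NameNode/nn-host", ["b"]),
]}

print(ancestry("c", spans))
```

Walking the parent links from the NameNode-side span back to the root shows which client operation caused it, which is the kind of question the parent list is meant to answer.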
Processes like the NameNode, DataNode, and filesystem clients generate trace spans based on the work they’ve done. Periodically, they send these trace spans to a span receiver such as htraced for storage and indexing. The htraced daemon has a graphical user interface for examining trace spans sent by many services running on many different hosts.
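The periodic hand-off to a span receiver can be sketched as a small buffering sink. This is an illustration under stated assumptions, not how HTrace or htraced is implemented: `BufferingSpanSink`, `send_batch`, and `flush_interval_s` are hypothetical names, spans are plain dictionaries, and the flush is checked only when a span is added rather than on a background timer.

```python
import time
from typing import Callable, Dict, List


class BufferingSpanSink:
    """Hypothetical sketch: buffer spans locally, ship them in batches."""

    def __init__(self,
                 send_batch: Callable[[List[Dict]], None],
                 flush_interval_s: float = 1.0,
                 clock: Callable[[], float] = time.monotonic):
        self._send_batch = send_batch   # stand-in for a call to a span receiver
        self._interval = flush_interval_s
        self._clock = clock             # injectable so the demo is deterministic
        self._buffer: List[Dict] = []
        self._last_flush = clock()

    def add(self, span: Dict) -> None:
        self._buffer.append(span)
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send_batch(batch)
        self._last_flush = self._clock()


# Demo with a fake clock and an in-memory "receiver".
now = [0.0]
received: List[List[Dict]] = []
sink = BufferingSpanSink(received.append, flush_interval_s=1.0,
                         clock=lambda: now[0])

sink.add({"description": "open"})   # buffered, interval not yet elapsed
now[0] = 1.5
sink.add({"description": "read"})   # interval elapsed, both spans are sent
```

Batching keeps the traced process from making a network call per span; the cost is that spans reach the receiver with a delay of up to one flush interval.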
CDH includes HTrace starting with CDH 5.5. Currently, only HDFS tracing is enabled, but integration with other components is coming soon. (Note: While previous versions of CDH may include some HTrace jar files, they do not have all the trace hooks required to use HTrace.)

How-to: Use Parquet with Impala, Hive, Pig, and MapReduce

Source: Cloudera Blog
The CDH software stack lets you use your tool of choice with the Parquet file format, offering the benefits of ...