Showing posts from December, 2015

New in Cloudera Labs: Apache HTrace (incubating)

Through a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for performance tracing of your HDFS-based applications. HTrace is a new Apache Incubator project that provides a bird’s-eye view of the performance of a distributed system. Log files offer a peek into important events on a specific node, and metrics answer questions about aggregate performance, but HTrace can follow a specific request all the way through the cluster.

HTrace breaks each request down into a set of trace spans, each of which represents a length of time. A single request, such as an HDFS copyToLocal command, generates many different trace spans. Each trace span has a list of parents that lets you figure out why it was created and which larger operation it belongs to. Trace spans also have a “TracerId” that identifies the service and process they came from. Processes like the NameNode, DataNode, and filesystem clients generate tr…
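To make the span model described above concrete, here is a minimal sketch of how spans, parent lists, and tracer IDs fit together. It is not the HTrace API: the `Span` class, the `ancestry` helper, the field names, and every ID, description, and timestamp below are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical stand-in for a trace span (not the real HTrace class)."""
    span_id: str
    description: str
    tracer_id: str          # which service/process emitted the span
    begin_ms: int
    end_ms: int
    parents: List[str] = field(default_factory=list)  # IDs of parent spans

    @property
    def duration_ms(self) -> int:
        # A span represents a length of time: end minus begin.
        return self.end_ms - self.begin_ms


def ancestry(span_id: str, spans: Dict[str, Span]) -> List[Span]:
    """Walk from a span up to its root, following the first parent at each step."""
    chain: List[Span] = []
    seen = set()
    current = spans.get(span_id)
    while current is not None and current.span_id not in seen:
        seen.add(current.span_id)  # guard against accidental cycles
        chain.append(current)
        current = spans.get(current.parents[0]) if current.parents else None
    return chain


# Made-up spans for a single copyToLocal-style request.
SPANS = {s.span_id: s for s in [
    Span("s1", "copyToLocal", "FsShell/client-host", 0, 120),
    Span("s2", "getBlockLocations (client side)", "FsShell/client-host", 5, 25, ["s1"]),
    Span("s3", "getBlockLocations (server side)", "NameNode/nn-host", 8, 20, ["s2"]),
    Span("s4", "readBlock", "DataNode/dn-host", 30, 110, ["s1"]),
]}

if __name__ == "__main__":
    for span in ancestry("s3", SPANS):
        print(span.tracer_id, span.description, span.duration_ms, "ms")
```

Following the parent links from the NameNode-side span back to the root shows which client operation caused it, and the tracer ID on each span shows which process did the work.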

Big Data Trendz