New in Cloudera Labs: Apache HTrace (incubating)

Via a combination of beta functionality in CDH 5.5 and new Cloudera Labs packages, you now have access to Apache HTrace for doing performance tracing of your HDFS-based applications.
HTrace is a new Apache incubator project that provides a bird’s-eye view of the performance of a distributed system. While log files can provide a peek into important events on a specific node, and metrics can answer questions about aggregate performance, HTrace can follow specific requests all the way through the cluster.
HTrace breaks down requests into sets of trace spans. Each trace span represents a length of time. A single request, such as an HDFS copyToLocal command, will generate many different trace spans. Each trace span has a list of parents that allow you to figure out why it was created and in which larger operation it is involved. Trace spans also have a “TracerId” that identifies which service and process they came from.
[Figure: a single request broken down into a tree of trace spans, each with a parent and a TracerId.]
Processes like the NameNode, DataNode, and filesystem clients generate trace spans based on the work they’ve done. Periodically, they send these trace spans to a span receiver such as htraced for storage and indexing. The htraced daemon has a graphical user interface for examining trace spans sent by many services running on many different hosts.
CDH now includes HTrace, starting in CDH 5.5. Currently, only HDFS tracing is enabled, but integration with other components is coming soon. (Note: While previous versions of CDH may include some HTrace jar files, they do not have all the trace hooks required to use HTrace.) 

Installing htraced

The first thing you have to do to get started with HTrace is to install htraced, which Cloudera is making available today in the form of a Cloudera Labs package. (As with all Cloudera Labs projects, this functionality is not formally supported, but we do encourage you to try it out via the steps below.)

Installation using Parcels

If you are using parcels, you can install htraced from within Cloudera Manager. Click on the “Parcels” icon in the top bar, then click on the “Edit Settings” button. Add the Cloudera Labs HTrace repository URL to the list of URLs enumerated in the “Remote Parcel Repository URLs” setting.
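Based on the package archive URL given later in this post, the repository URL most likely takes this form (an assumption; verify it against the Cloudera Labs announcement):

    http://archive.cloudera.com/cloudera-labs/htrace/parcels/latest/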
Go to Hosts -> Parcels and hit the “Check for new parcels” button. You should see CLABS_HTRACE show up in the list. At this point, you can distribute and activate the parcel. (See these docs for more information about parcels.)
[Screenshot: the CLABS_HTRACE parcel listed in Cloudera Manager’s parcels page.]

Installation using Packages

If you are using packages, you can find the relevant .rpm or .deb for your particular operating system under http://archive.cloudera.com/cloudera-labs/htrace/. Then install the package using your operating system’s package tool, such as rpm on Red Hat or dpkg on Ubuntu. Since htraced uses native code, be sure to download the right package for your operating system. Red Hat Linux 6 and greater, Ubuntu 12.04 and greater, and other popular Linux distributions are supported.
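For example (the package file names here are hypothetical; substitute whatever you downloaded from the archive):

    # Red Hat / CentOS
    sudo rpm -ivh clabs-htrace-htraced-*.rpm
    # Ubuntu / Debian
    sudo dpkg -i clabs-htrace-htraced-*.deb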

Verifying the Installation

Once htraced is installed, you should be able to log into any host and print the installed version of htraced.
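A plausible check (the -version flag is an assumption; run htraced --help to find the exact option on your install):

    $ htraced -version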

Installing the htraced Cloudera Service Description

If you are using Cloudera Manager, you will want to install the HTraced CSD file.
On the host that is running Cloudera Manager’s server process, log in and download the CSD to /opt/cloudera/csd. You can tell which host runs the Cloudera Manager server process because it is the host that serves the Cloudera Manager web interface.
Then, restart the Cloudera Manager server process.
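On most installations, that is:

    sudo service cloudera-scm-server restart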
After the Cloudera Manager web interface finishes restarting, log in again.
From the menu at the far left marked “clusters,” select “Cloudera Management Service.” This is the service that monitors hosts for Cloudera Manager. Restart this service so that it will begin to monitor htraced.
[Screenshot: restarting the Cloudera Management Service from Cloudera Manager.]

Configuring htraced

htraced should usually run alongside a DataNode, rather than alongside one of the NameNodes. That way, it will not be competing with the NameNode for network bandwidth. Just like a DataNode, htraced can use additional hard disks for more bandwidth. It is usually a good practice to configure htraced with at least six storage directories. Each directory should be located on a distinct hard disk if possible. htraced can share hard disks with the local DataNode.
On the node where you want to run htraced, create several storage directories and make sure they are owned by the htrace user. For example, if you wanted to create /data/1/htraced, /data/2/htraced, and so forth as storage directories on example.com, you would run commands like the following.
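A minimal sketch (assuming root SSH access, and that the htraced package created an htrace user and group):

    ssh root@example.com
    # Create six storage directories, ideally one per hard disk.
    mkdir -p /data/1/htraced /data/2/htraced /data/3/htraced \
             /data/4/htraced /data/5/htraced /data/6/htraced
    # Assumption: the package created an "htrace" user and group.
    chown htrace:htrace /data/*/htraced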
If the CSD has been installed successfully, you should see htraced as an option when you select “Add a new service” for your cluster. Select this option. You will see a prompt asking which storage directories to use; enter the storage directories you created earlier.

htraced Cluster Configuration

Once htraced is running, the next step is to configure your HDFS daemons and clients to send trace spans to htraced. We’ll do this by adding some new configuration settings to Hadoop’s core-site.xml file. If you are using Cloudera Manager, add the following to the box marked “Cluster-wide Advanced Configuration Snippet (Safety Valve) for core-site.xml.” You should be able to find this under Configuration -> Advanced Configuration Snippets.
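A sketch of that snippet is below. The fs.client.htrace.sampler.classes key, the ProbabilitySampler, and the 0.001 fraction are described in this post; the span receiver class name, the receiver address key, and the fraction key follow HTrace 4 conventions and are assumptions here, so verify them against the HTrace documentation for your CDH release:

    <!-- Send trace spans from HDFS daemons and clients to htraced.
         Assumption: key and class names follow HTrace 4 conventions. -->
    <property>
      <name>hadoop.htrace.span.receiver.classes</name>
      <value>org.apache.htrace.impl.HTracedSpanReceiver</value>
    </property>
    <!-- Assumption: htraced's span-receiver address and default port.
         Replace example.com with the host running htraced. -->
    <property>
      <name>htraced.receiver.address</name>
      <value>example.com:9075</value>
    </property>
    <!-- Sample roughly 1 in 1,000 HDFS client requests. -->
    <property>
      <name>fs.client.htrace.sampler.classes</name>
      <value>ProbabilitySampler</value>
    </property>
    <property>
      <name>fs.client.htrace.sampler.fraction</name>
      <value>0.001</value>
    </property>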
In the configuration above, replace “example.com” with the hostname of the machine on which you are running htraced.
HTrace doesn’t necessarily create a trace span for every request. Instead, it samples a subset of all requests. This approach allows you to monitor the cluster without impacting performance too much. In the configuration above, ProbabilitySampler will create a trace span when a random number between 0 and 1 is less than 0.001, or for approximately 1 in 1000 requests.
The configuration above will trace a random sample of all the requests flowing through the HDFS client. But you can also trace requests at higher levels than that. For example, if you would like to trace only FsShell requests, remove the fs.client.htrace.sampler.classes line, and replace it with an fs.shell.htrace.sampler.classes line. (FsShell is the HDFS command-line interface.) Usually, you will only want to turn on one sampler at a time.
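For example (the property name is given above; the sampler value mirrors the client configuration):

    <property>
      <name>fs.shell.htrace.sampler.classes</name>
      <value>ProbabilitySampler</value>
    </property>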
You will want to redeploy this configuration to all HDFS clients. You can do that by hitting “deploy client configuration” in Cloudera Manager. After that is complete, restart your HDFS service so that it can get the new settings.
Once the configuration has been deployed, you can use the hadoop trace tool to verify that HTrace is enabled on the daemons. For example, if your NameNode is named my-namenode.example.com, you should see output like the following.
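A sketch of the expected output (8020 is the default NameNode RPC port; the listed class reflects the span receiver configured above):

    $ hadoop trace -list -host my-namenode.example.com:8020
    ID  CLASS
    1   org.apache.htrace.impl.HTracedSpanReceiver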

Troubleshooting htraced Configuration Issues

Sometimes, the hadoop trace command will show that you have no span receivers even though you have applied the configuration above.
In this case, check the log4j log of the NameNode (or other daemon).
If you see messages like “Cannot find SpanReceiver class,” you know that the htrace-htraced-cdh5.jar is not in the CLASSPATH of the daemon. Verify that the htrace-htraced-cdh5.jar is in the correct place under /opt/cloudera/parcels/CDH/lib/hadoop. Note that you must re-install HTrace when you upgrade CDH, since the CDH upgrade will remove the htraced jar. If you have installed the CDH parcel to a non-default location, you will have to create the symlink manually, as shown below.
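A sketch of that symlink (the source path is hypothetical; point it at wherever your install placed htrace-htraced-cdh5.jar):

    sudo ln -s /opt/cloudera/parcels/CLABS_HTRACE/lib/htrace-htraced-cdh5.jar \
        /opt/cloudera/parcels/CDH/lib/hadoop/htrace-htraced-cdh5.jar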
If your service log files do not mention “SpanReceiver,” then the core-site.xml configuration has probably not been applied. You can use the hdfs getconf command to verify that the client configuration has been deployed to all nodes. If your core-site.xml configuration is accessible to the shell, you should see output like the following.
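A sketch, assuming the ProbabilitySampler configuration shown earlier:

    $ hdfs getconf -confKey fs.client.htrace.sampler.classes
    ProbabilitySampler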
Check to make sure that your core-site.xml configuration files include the htrace configuration.

The htraced Web UI

Now that HTrace is installed, you should be able to see the htraced daemon running within Cloudera Manager. Click on the htraced role and follow Cloudera Manager’s “Web interface” link to get to the htraced graphical user interface. By default, the htraced web interface is on port 9096 on the host where htraced was installed.
The “ServerInfo” tab has information about the htraced server itself, including version information and metrics. Here you can check the total number of spans that were ingested, as well as how many were sent from each host in the cluster. This is a good way to make sure that all your hosts are configured to send spans to htraced. When clients send spans to htraced, the “Spans Ingested” count will increase.
[Screenshot: the htraced “ServerInfo” tab, showing server version information and span metrics.]

Finding Slow DataNodes with HTrace

Let’s see if we can use htraced’s “Search” page to find DataNodes in the cluster that are having problems.
We would like to find trace spans that happened on a DataNode. To do that, we set up a filter with “TracerId contains” and enter “DataNode”. We would also like to sort the results in descending order of duration; we can do that by entering a very large number for “Duration is at most”. This shows us a list of the longest spans that happened on a DataNode.
[Screenshot: the htraced “Search” page, filtering for spans whose TracerId contains “DataNode,” sorted by duration.]
Double-clicking on one of these spans, we can see that it is over 260 seconds in duration!
[Screenshot: a DataNode trace span over 260 seconds in duration.]
Taking a closer look at the really long request here, we can see that the DataNode at 172.28.209.62 took most of the time. The amount of time taken by the remaining two DataNodes in the pipeline was very small. We can also see that the name of the file which was being written was /temp3/1._COPYING_.
[Screenshot: the trace span detail view, showing most of the time spent in the DataNode at 172.28.209.62 while writing /temp3/1._COPYING_.]
This information from HTrace lets us know that the DataNode at 172.28.209.62 is having problems. It’s worth taking a closer look at that node.

Tuning htraced

htraced works best when it has all the resources it needs to do its job, and isn’t under load that is too heavy. The amount of load on htraced will depend on several things:
  1. The size of your cluster: a 10-node cluster will put a relatively small load on htraced; a 300-node cluster will put a relatively large one on it
  2. The sampling rate you have configured
  3. The resources htraced has available, such as storage directories and file descriptors
If you have a larger cluster, you will want to turn down the sampling rate to avoid sending too many spans to htraced.
On the “ServerInfo” tab of the htraced web UI, the “Average WriteSpans Time” should be relatively low, and the number of spans dropped by the server should be 0. You can also see how many total spans have been sent on this page, and use it to get a rough idea of how many spans per minute are being sent.
If the sampling rate is too high, clients may drop spans because they are unable to send them to htraced as fast as they are generated. If this happens, they will write messages to a file named /tmp/htracedDropped, as well as to their log4j logs. If you suspect that the sampling rate is too high, check whether your cluster nodes have the /tmp/htracedDropped file. Note that once the /tmp/htracedDropped file grows to a certain size, the clients will stop appending messages to it to avoid running out of space. It is always safe to delete this file; usually you should remove it once you have examined the contents, so that new messages can be seen if they occur.
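For example, on each cluster node:

    # Check whether HDFS clients on this node have been dropping spans.
    ls -l /tmp/htracedDropped 2>/dev/null && tail /tmp/htracedDropped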

Advanced htraced Tuning

htraced benefits from raising the file descriptor limit, which will allow it to keep more files open at once. If more memory is available, you can increase the leveldb.buffer.size configuration to 16MB or even 32MB to improve the efficiency of htraced’s leveldb on-disk data store. Be aware that leveldb.buffer.size will be multiplied by the number of storage directories. For example, if you have 10 storage directories and a leveldb.buffer.size of 16MB, the htraced heap will increase by 160MB. A higher leveldb.buffer.size may also increase the amount of time htraced takes to start up.
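For example, to raise the buffer to 16MB (a sketch: htraced configuration is assumed here to use Hadoop-style XML and a byte-denominated value; if you are using Cloudera Manager, set this through the htraced configuration safety valve):

    <property>
      <name>leveldb.buffer.size</name>
      <!-- 16MB, multiplied by each storage directory -->
      <value>16777216</value>
    </property>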

Conclusion

With the addition of HTrace to CDH 5.5 and the availability of htraced via Cloudera Labs, you now have a powerful tool for analyzing cluster-wide performance. Although the HTrace functionality is still in beta, in the future, we expect HTrace to integrate with many more systems—including other tracing systems as well as other services within the Apache Hadoop ecosystem. 
For more details about the HTrace architecture, check out the talks about it at ApacheCon, and the upstream developer mailing list. Give it a try in CDH 5.5, and keep in touch on the Cloudera Labs user forums.
Colin McCabe is a Software Engineer at Cloudera, and a member of the Apache Hadoop PMC.
