incubator-chukwa-user mailing list archives

From Eric Yang <>
Subject Re: about metric processing template for large cluster
Date Sun, 16 Jan 2011 20:23:35 GMT

Chukwa uses Pig scripts to analyze the data, so the analytics are
entirely up to the developer or researcher.  TopN and sorting can be
written easily in Pig Latin.  Take a look at  This script is used
to aggregate metrics from a large number of nodes into a cluster
summary number.  It should help you calculate histograms and load
distribution.

For RRD-style downsampling, we need to introduce a Pig UDF that
calculates d(metric)/dt.
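
As a rough illustration of the d(metric)/dt calculation, here is a
minimal sketch in plain Java. It is not an existing Chukwa or Pig
class: the name MetricRate and the method rates are hypothetical, and
the sketch assumes the samples for one metric on one host are already
sorted by time, with timestamps in seconds. Wrapping this logic in an
actual Pig UDF, and handling counter resets, is left out.

```java
// Hypothetical helper, not part of Chukwa or Pig: computes d(metric)/dt
// between consecutive samples of one metric on one host.
final class MetricRate {

    /**
     * @param timestampsSec sample times in seconds, in ascending order
     * @param values        metric value at each sample time
     * @return one rate per consecutive pair of samples:
     *         (values[i+1] - values[i]) / (timestampsSec[i+1] - timestampsSec[i]),
     *         or NaN where the time step is not positive
     */
    static double[] rates(long[] timestampsSec, double[] values) {
        if (timestampsSec.length != values.length) {
            throw new IllegalArgumentException(
                "timestamps and values must have the same length");
        }
        int n = values.length;
        // n samples give n - 1 intervals; zero or one sample gives none.
        double[] out = new double[Math.max(0, n - 1)];
        for (int i = 1; i < n; i++) {
            long dt = timestampsSec[i] - timestampsSec[i - 1];
            // Guard against duplicate or out-of-order timestamps.
            out[i - 1] = dt > 0 ? (values[i] - values[i - 1]) / dt : Double.NaN;
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data: a counter read every 10 seconds.
        long[] t = {0, 10, 20};
        double[] counter = {100, 150, 250};
        for (double r : rates(t, counter)) {
            System.out.println(r);
        }
    }
}
```

With the made-up samples in main, the two intervals give rates of 5
and 10 units per second. Returning NaN for a non-positive time step is
just one possible choice; a real UDF might skip or log such samples
instead.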


On Sun, Jan 16, 2011 at 7:17 AM, ZHOU Qi <> wrote:
> Hi guys,
> I am used to using Ganglia-like software for monitoring and
> troubleshooting a cluster of about 100 machines. But as the cluster
> has grown, I have found it more difficult to identify abnormal
> metrics, abnormal machines, or the bottleneck in the current system.
> So far, we have considered adding some features for RRD viewing, such as
> getting the topN, sorting the machines by their metrics, or grouping the
> metrics to find their distribution. We have no prior experience with
> Chukwa, and I am wondering whether there are any templates for metrics
> processing in Chukwa (such as sorting, histograms, machine/rack group
> distribution)?
> If you have a better idea for viewing these metrics, would you mind
> sharing it?
