incubator-chukwa-user mailing list archives

From Eric Yang
Subject Re: What's Chukwa for?
Date Thu, 03 Dec 2009 00:10:49 GMT
Chukwa is a generic distributed log processing system.  Its primary use
case is monitoring Hadoop clusters.  Several analytics are bundled for
displaying system state, Java VM resource usage, and Hadoop DFS and MapReduce
metrics.  However, anyone could add their own analytics system to run in

In general, a monitoring system is independent of the system being
monitored.  The Chukwa documentation might make it look like you need two
clusters for this to work.  However, it's actually possible for Chukwa to
run on the same cluster that it is monitoring.

If Chukwa is running on the same cluster, it's better to call it a
reporting system.  If Hadoop crashed in this type of deployment, Chukwa
could not be held responsible for failing to alert.


On 12/2/09 3:30 PM, "MJ Lai" wrote:

> Hi.
> It is another ``what for'' question.
> I went through the Chukwa web site and am still kind of confused about what
> the software is really for. Can I say the major purpose is to provide 1) a
> generic distributed log processing system, or 2) a log system that is only
> for Hadoop clusters? In case of 1), why do we want to make it tightly
> bound to Hadoop?
> Assume we have a 100-machine cluster (no Hadoop). If I deploy Chukwa to
> process the cluster logs, do I still need to create another Hadoop cluster
> to make it work?
> I think some practical use cases could reduce the confusion about
> this project.
> Thanks.
> MJ
