incubator-chukwa-user mailing list archives

From MJ Lai <>
Subject Re: What's Chukwa for?
Date Thu, 03 Dec 2009 23:42:51 GMT

Eric, Ari.

Thanks for your responses.

I have some follow-up questions. I'm trying to figure out the best way
to manage a production Hadoop cluster. The features I need include:
- deployment: install hadoop from one console;
- monitoring
- configuration
- log/system analytics
- software upgrade
- etc.

It seems Ganglia/Nagios are widely used to monitor Hadoop cluster
metrics, and Chukwa is for log analytics. But still, it is a pain to
manage/configure a Hadoop cluster.

Since Chukwa has an agent installed on each endpoint, do you have any
plans to build it into a universal platform for Hadoop management?


Eric Yang wrote:
> Chukwa is a generic distributed log processing system.  Its primary use
> case is to monitor Hadoop clusters.  There are several analytics bundled for
> displaying system state, Java VM resource usage, Hadoop DFS, and MapReduce
> metrics.  However, anyone could add their own analytics system to run in
> Chukwa.
> In general, the monitoring system is usually independent from the subject
> that is being monitored.  The Chukwa documentation might make it look like
> you need two clusters for this to work.  However, it's actually possible for
> Chukwa to run on the same cluster that it is monitoring.
> It's better to call Chukwa a reporting system if it is running on the
> same cluster.  If Hadoop crashed in this type of deployment, Chukwa could
> not be held responsible for failing to alert.
> Regards,
> Eric
> On 12/2/09 3:30 PM, "MJ Lai" <> wrote:
>> Hi.
>> This is another "what for" question.
>> I went through the Chukwa web site and am still kind of confused about what
>> the software is really for. Is its major purpose 1) to provide a generic
>> distributed log processing system, or 2) to provide a log system only for
>> Hadoop clusters? In the case of 1), why do we want to make it tightly
>> bound to Hadoop?
>> Assume we have a 100-machine cluster (no Hadoop). If I deploy Chukwa to
>> process the cluster logs, do I still need to create another Hadoop cluster
>> to make it work?
>> I think some practical use cases could reduce the confusion about
>> this project.
>> Thanks.
>> MJ
