hadoop-general mailing list archives

From Eric Yang <ey...@yahoo-inc.com>
Subject Re: Scribe vs. Chukwa
Date Mon, 30 Nov 2009 21:55:58 GMT
Hi Kim,

Scribe works well for simple deployments.  The complexity increases when the
"central scribe server" is a multi-machine deployment.  Basically, it
requires a reverse proxy to load balance the data collection.  (
http://www.cloudera.com/blog/2008/11/02/configuring-and-using-scribe-for-hadoop-log-collection/
)  I have not used Scribe personally, so someone else may be able to fill in
from experience.

Chukwa was designed to be a fault-tolerant log collection/analytics platform.
Each Chukwa agent automatically creates its own routing table to Chukwa
collectors.  Therefore, Chukwa does not require a reverse proxy.  However,
each Chukwa agent requires knowledge of all collector addresses, hence the
initial deployment complexity may be a little higher than Scribe's.  The
largest test of a Chukwa deployment was 50 Chukwa collectors running on top
of 100 dedicated Hadoop nodes to process log files from a data center.
(It was decommissioned due to a lack of log files.)  Based on my experience,
a single collector with 8GB of allocated RAM could handle all log files from
2000 Hadoop nodes + system metrics (top, df, sar, iostat, netstat, vmstat
output).
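
To illustrate the agent-side routing idea above, here is a minimal Python
sketch.  It is not Chukwa's actual code: the class name CollectorPool, the
post callback, and the example addresses are all hypothetical.  It only
assumes what is stated above, namely that each agent knows every collector
address and picks its own route, so no reverse proxy is needed.

```python
import random


class CollectorPool:
    """Hypothetical agent-side collector selection with failover."""

    def __init__(self, collectors, seed=None):
        if not collectors:
            raise ValueError("at least one collector address is required")
        # Each agent shuffles its own copy of the list, so the load is
        # spread across collectors without a central load balancer.
        self._order = list(collectors)
        random.Random(seed).shuffle(self._order)

    def send(self, chunk, post):
        """Try collectors in this agent's order; return the one that accepted.

        `post(address, chunk)` is a caller-supplied function (an assumption
        of this sketch) that raises OSError when a collector is unreachable.
        """
        errors = []
        for address in self._order:
            try:
                post(address, chunk)
                return address
            except OSError as exc:
                # Remember the failure and fall through to the next collector.
                errors.append((address, exc))
        raise ConnectionError(f"all collectors failed: {errors}")
```

An agent would build one pool from its full list of collector addresses and
call send() for each chunk of log data; a dead collector is simply skipped.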

Chukwa does not have a direct log file viewer; instead, it has an analytics
engine which computes various facts and provides reports.  There are frequent
requests for a log file viewer, but it hasn't been implemented.  We only have
a command line utility to dump the log files because it is difficult to view
terabytes of log files.  At some point in the future, when a full body index
engine is implemented, we will provide log file search.

In essence, it depends on what you are looking for.  If you are looking for
simple log collection and a viewer, Scribe is probably a good tool.  If you
are looking for a log collection and reporting platform, Chukwa is a good
solution.

Regards,
Eric

On 11/30/09 11:54 AM, "Kim Vogt" <vogt7@llnl.gov> wrote:

> Hi,
> 
> My team is looking into using Scribe or Chukwa for hadoop log collection.  I
> was wondering if anyone had any opinions about one vs. the other?  I
> apologize if this topic was covered before, but I don't see a link to the
> archives for this mailing list.
> 
> Thanks,
> 
> Kim

