hadoop-mapreduce-user mailing list archives

From "Smith, Joshua D." <Joshua.Sm...@gd-ais.com>
Subject RE: State of Art in Hadoop Log aggregation
Date Fri, 11 Oct 2013 13:38:25 GMT
I've used Splunk in the past for log aggregation. It's commercial/proprietary, but I think
there's a free version.
http://www.splunk.com/


From: Raymond Tay [mailto:raymondtay1974@gmail.com]
Sent: Friday, October 11, 2013 1:39 AM
To: user@hadoop.apache.org
Subject: Re: State of Art in Hadoop Log aggregation

You can try Chukwa, which is one of the incubating projects under Apache. I tried it before
and liked it for aggregating logs.

On 11 Oct, 2013, at 1:36 PM, Sagar Mehta <sagarmehta@gmail.com> wrote:


Hi Guys,

We have a fairly decent-sized Hadoop cluster of about 200 nodes, and I was wondering what the
state of the art is if I want to aggregate and visualize Hadoop ecosystem logs, particularly

  1.  Tasktracker logs
  2.  Datanode logs
  3.  HBase RegionServer logs
One way is to use something like Flume on each node to aggregate the logs, and then use something
like Kibana - http://www.elasticsearch.org/overview/kibana/ - to visualize the logs and make
them searchable.
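
For reference, a minimal sketch of what a per-node Flume agent for that kind of pipeline might
look like is below. This is only illustrative, not a tested config: the log path, Elasticsearch
host, index name, and cluster name are placeholders, and the Elasticsearch sink settings are the
ones documented for Flume 1.4's ElasticSearchSink.

  # Hypothetical Flume agent: tail a Hadoop log and ship events to Elasticsearch
  agent.sources = r1
  agent.channels = c1
  agent.sinks = k1

  # Exec source tailing one log file (path is a placeholder)
  agent.sources.r1.type = exec
  agent.sources.r1.command = tail -F /var/log/hadoop/hadoop-datanode.log
  agent.sources.r1.channels = c1

  # Simple in-memory channel
  agent.channels.c1.type = memory
  agent.channels.c1.capacity = 10000

  # Elasticsearch sink (host/index/cluster names are assumptions)
  agent.sinks.k1.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
  agent.sinks.k1.hostNames = es-host:9300
  agent.sinks.k1.indexName = hadoop_logs
  agent.sinks.k1.clusterName = elasticsearch
  agent.sinks.k1.channel = c1

One caveat with this sketch: the exec source with 'tail -F' can drop events if the agent
restarts, so a spooling-directory source over rolled log files is generally the more reliable
choice if you go this route.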

However, I don't want to write another ETL pipeline for the Hadoop/HBase logs themselves. We currently
log in to each machine individually to 'tail -F' the logs when there is a Hadoop problem on a
particular node.

We want a better way to look at the Hadoop logs in a centralized fashion when there
is an issue, without having to log in to 100 different machines, and I was wondering what the
state of the art is in this regard.

Suggestions/Pointers are very welcome!!

Sagar

