hadoop-common-user mailing list archives

From Allen Wittenauer <awittena...@linkedin.com>
Subject Re: Hadoop Node Monitoring
Date Thu, 12 Nov 2009 18:35:07 GMT
On 11/11/09 9:46 PM, "John Martyniak" <john@beforedawnsolutions.com> wrote:
> Is there a good solution for Hadoop node monitoring?  I know that
> Cacti and Ganglia are probably the two big ones, but are they the best
> ones to use?  Easiest to setup? Most thorough reporting, etc.
> I started to play with Ganglia, and the install is crazy, I am
> installing it on CentOS and having all sorts of troubles.  So any idea
> there would be very helpful.

We're working on getting our stats into Zenoss via the JMX connector and
SNMP because Ganglia seems to have some fundamental issues (for example,
grouping of hosts is a *client*-side config).  Note that Zenoss is available
in both open source and commercial forms.  We're using the commercial
version, but the open source version would probably be just as good.

But that aside:

We're approaching grid health by monitoring the dead/live node counts,
which we get by scraping the NN and JT web pages.  We also run daily
fsck's and lsr's, plus a cut-down version of gridmix.
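
As a rough illustration of the scraping idea (not the actual scripts
described above), here is a minimal Python sketch.  It assumes the status
page contains text of the form "Live Nodes : N" and "Dead Nodes : N" once
the HTML tags are stripped; the exact markup varies between Hadoop versions,
so the regex would need adjusting.  The function names, the 5% threshold,
and the example URL are all hypothetical.

```python
import re
import urllib.request


def fetch_page(url, timeout=10):
    """Download a status page and return it as text.

    The URL is supplied by the caller; nothing here is Hadoop-specific.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read().decode("utf-8", errors="replace")


def parse_node_counts(html):
    """Return (live, dead) node counts scraped from status-page HTML.

    Assumes the page shows "Live Nodes : N" and "Dead Nodes : N" after
    tags are removed.  Raises ValueError if either count is missing.
    """
    text = re.sub(r"<[^>]+>", " ", html)  # replace every tag with a space
    counts = {}
    for label in ("Live", "Dead"):
        match = re.search(label + r"\s+Nodes\s*:?\s*(\d+)", text, re.IGNORECASE)
        if match is None:
            raise ValueError("could not find %s node count" % label)
        counts[label.lower()] = int(match.group(1))
    return counts["live"], counts["dead"]


def grid_is_healthy(live, dead, max_dead_fraction=0.05):
    """Judge the grid as a whole: healthy if few enough nodes are dead.

    The 5% default is an arbitrary placeholder, not a recommendation.
    """
    total = live + dead
    return total > 0 and dead / total <= max_dead_fraction
```

Usage would look something like
`live, dead = parse_node_counts(fetch_page("http://namenode.example.com:50070/dfshealth.jsp"))`
followed by `grid_is_healthy(live, dead)`, with the host, port, and path
replaced by whatever your NN actually serves.  Alerting on the dead fraction
rather than on any single host matches the grid-level view taken here.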

While monitoring individual nodes is useful in a pro-active sense, the
bigger your grid gets, the less important it becomes.
