hadoop-zookeeper-dev mailing list archives

From "Savu Andrei (JIRA)" <j...@apache.org>
Subject [jira] Commented: (ZOOKEEPER-744) Add monitoring four-letter word
Date Sun, 06 Jun 2010 21:54:56 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-744?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12876091#action_12876091 ]

Savu Andrei commented on ZOOKEEPER-744:
---------------------------------------

@Patrick I have fixed 1-5. I will resubmit the patch after writing some tests to ensure that
the node watch count works as expected (I'm having some problems with this part). Right now
all tests are passing.

6. I believe the leader does not record the time of the last election. I will look more into
this and change the code as needed.

Should I also add JVM memory stats?


> Add monitoring four-letter word
> -------------------------------
>
>                 Key: ZOOKEEPER-744
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-744
>             Project: Zookeeper
>          Issue Type: New Feature
>          Components: server
>    Affects Versions: 3.4.0
>            Reporter: Travis Crawford
>            Assignee: Savu Andrei
>             Fix For: 3.4.0
>
>         Attachments: zk-ganglia.png, ZOOKEEPER-744.patch, ZOOKEEPER-744.patch
>
>
> Filing a feature request based on a zookeeper-user discussion.
> Zookeeper should have a new four-letter word that returns key-value pairs appropriate
> for importing to a monitoring system (such as Ganglia, which has a large installed base).
> This command should initially export the following:
> (a) Count of instances in the ensemble.
> (b) Count of up-to-date instances in the ensemble.
> It should be designed so that additional data can be added in the future. For example, the
> output could define each statistic in a comment, then print a key, space character, value line:
> """
> # Total number of instances in the ensemble
> zk_ensemble_instances_total 5
> # Number of instances currently participating in the quorum.
> zk_ensemble_instances_active 4
> """
> From the mailing list:
> """
> Date: Mon, 19 Apr 2010 12:10:44 -0700
> From: Patrick Hunt <phunt@apache.org>
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: Recovery issue - how to debug?
> On 04/19/2010 11:55 AM, Travis Crawford wrote:
> > It would be a lot easier from the operations perspective if the leader
> > explicitly published some health stats:
> >
> > (a) Count of instances in the ensemble.
> > (b) Count of up-to-date instances in the ensemble.
> >
> > This would greatly simplify monitoring & alerting - when an instance
> > falls behind one could configure their monitoring system to let
> > someone know and take a look at the logs.
> That's a great idea. Please enter a JIRA for this - a new 4 letter word 
> and JMX support. It would also be a great starter project for someone 
> interested in becoming more familiar with the server code.
> Patrick
> """
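
For illustration, here is a minimal sketch of how a monitoring script might consume the output format proposed in the issue description (comment lines starting with "#", then one key, space character, value line per statistic). The function name parse_monitoring_output is hypothetical, and the sample text is copied from the description above; the final command name and the set of keys were still under discussion at this point.

```python
def parse_monitoring_output(text):
    """Parse 'key value' lines, skipping blank lines and '#' comment lines.

    Hypothetical helper: assumes the format sketched in the issue
    description, not a finalized server output.
    """
    stats = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Split on the first space only, so values may contain spaces.
        key, _, value = line.partition(" ")
        value = value.strip()
        # Store integers as int; keep any other value as a string.
        try:
            stats[key] = int(value)
        except ValueError:
            stats[key] = value
    return stats


# Sample output taken from the issue description.
sample = """\
# Total number of instances in the ensemble
zk_ensemble_instances_total 5
# Number of instances currently participating in the quorum.
zk_ensemble_instances_active 4
"""

stats = parse_monitoring_output(sample)
# Instances that are in the ensemble but not participating in the quorum.
lagging = stats["zk_ensemble_instances_total"] - stats["zk_ensemble_instances_active"]
print(stats, lagging)
```

A monitoring system could alert when the difference between the two counts is non-zero, which matches the use case Travis describes in the quoted mailing list thread.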

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

