hadoop-common-dev mailing list archives

From "Chris Douglas (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3767) Brief, baseline namenode health check
Date Thu, 07 Aug 2008 17:56:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12620696#action_12620696 ]

Chris Douglas commented on HADOOP-3767:
---------------------------------------

bq. should this liveness test include a min #of live datanodes? Like 1?

That seems to be verifying a different property than an internal health check. The number
of live datanodes is also visible through the web interface, at least. The number of datanodes
could be added as a (usually not very interesting) metric, but it would probably fit better
in an SNMP (or similar) layer.

On failed pings: should a server failing a health check change its status, or would that just
invite race conditions?
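
To make the question concrete, here is a minimal sketch of a check that only reports and never changes server state, so a failed ping cannot race with anything else in the server. Everything in it is hypothetical: the class HealthCheckSketch, the Report type, and the idea of passing the expected threads in as a map are illustrative assumptions, not code from the attached 3767 patches or from the namenode.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names come from Hadoop or the 3767 patches.
public class HealthCheckSketch {

    /** Immutable result of one check. The caller decides how to react to it. */
    public static final class Report {
        public final boolean healthy;
        public final List<String> problems;

        Report(List<String> problems) {
            this.problems = Collections.unmodifiableList(new ArrayList<>(problems));
            this.healthy = this.problems.isEmpty();
        }
    }

    /**
     * Read-only check: lists every expected thread that is missing or not alive.
     * It does not modify the threads or any server status, so concurrent
     * callers cannot interfere with each other.
     */
    public static Report check(Map<String, Thread> expectedThreads) {
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, Thread> e : expectedThreads.entrySet()) {
            Thread t = e.getValue();
            if (t == null) {
                problems.add(e.getKey() + ": not registered");
            } else if (!t.isAlive()) {
                problems.add(e.getKey() + ": not alive");
            }
        }
        return new Report(problems);
    }

    public static void main(String[] args) {
        Map<String, Thread> expected = new LinkedHashMap<>();
        // The current thread is running, so it counts as alive.
        expected.put("main", Thread.currentThread());
        // A thread that was never started reports isAlive() == false.
        expected.put("never-started", new Thread(() -> { }));

        Report report = check(expected);
        System.out.println("healthy=" + report.healthy + " problems=" + report.problems);
    }
}
```

With a report-only check like this, deciding whether a failure should change the server's status is left to whatever consumes the report (a cron job, a monitoring layer), which keeps that policy out of the check itself.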

> Brief, baseline namenode health check
> -------------------------------------
>
>                 Key: HADOOP-3767
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3767
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Chris Douglas
>            Priority: Minor
>         Attachments: 3767-0.patch, 3767-1.patch
>
>
> It would be helpful if there were a way to query the namenode to verify that it is basically
> healthy. In particular, that all the expected threads are running, data structures appear
> sane, etc. Administrators could use this interface to verify that the namenode is both up
> and essentially functional, attaching cron jobs, notification, etc. as required.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

