hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-3767) Brief, baseline namenode health check
Date Sat, 19 Jul 2008 06:12:31 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-3767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614951#action_12614951 ]

Steve Loughran commented on HADOOP-3767:
----------------------------------------

+1 to a ping() operation on the namenode. One issue here is that a full health check à la fsck
is going to be slow, so ping() could be a quick "are you there, do you think you are live?"
kind of query. The real way to assess filesystem health is to perform operations on it and
check the results. This is what ant -diagnostics does: it creates a file in ${java.io.tmpdir}
and verifies that the file exists and that its timestamp is roughly aligned with the system clock.
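
For illustration, a check of the create-a-file-and-verify kind could look roughly like the sketch below. It runs against a local directory using only java.nio.file; it is not the Ant implementation and not a proposed namenode API. The class name TmpDirProbe, the method names, and the 60-second skew tolerance are all assumptions made for this example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: probes a directory by actually writing to it.
final class TmpDirProbe {
    // Assumed tolerance between file mtime and the system clock.
    static final long MAX_SKEW_MILLIS = 60_000L;

    // Creates a probe file, checks it exists, and returns the absolute
    // difference (ms) between its modification time and the system clock.
    static long probeSkewMillis(Path dir) throws IOException {
        Path probe = Files.createTempFile(dir, "probe", ".tmp");
        try {
            Files.write(probe, "ok".getBytes(StandardCharsets.UTF_8));
            if (!Files.exists(probe)) {
                throw new IOException("probe file missing: " + probe);
            }
            long mtime = Files.getLastModifiedTime(probe).toMillis();
            return Math.abs(System.currentTimeMillis() - mtime);
        } finally {
            // Always clean up so repeated checks do not litter the directory.
            Files.deleteIfExists(probe);
        }
    }

    // True if the directory accepts writes and its timestamps look sane.
    static boolean isHealthy(Path dir) {
        try {
            return probeSkewMillis(dir) <= MAX_SKEW_MILLIS;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Path tmp = Paths.get(System.getProperty("java.io.tmpdir"));
        System.out.println(isHealthy(tmp) ? "OK" : "FAILED");
    }
}
```

The same shape (write, read back, compare timestamps, clean up) would presumably carry over to a check against a distributed filesystem, at the cost of being slower than a plain ping().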

> Brief, baseline namenode health check
> -------------------------------------
>
>                 Key: HADOOP-3767
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3767
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: dfs
>            Reporter: Chris Douglas
>            Priority: Minor
>
> It would be helpful if there were a way to query the namenode to verify that it is basically
> healthy. In particular, that all the expected threads are running, data structures appear
> sane, etc. Administrators could use this interface to verify that the namenode is both up
> and essentially functional, attaching cron jobs, notification, etc. as required.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

