hadoop-common-dev mailing list archives

From "Steve Loughran (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-5478) Provide a node health check script and run it periodically to check the node health status
Date Wed, 13 May 2009 14:31:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-5478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12708943#action_12708943 ]

Steve Loughran commented on HADOOP-5478:


I could certainly merge this in with HADOOP-3628, but I'm busy dealing with svn merge issues
right now and don't want this held up. It would be handy for me if we could run this on
startup and then have ping query the latest state when checked.
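
To illustrate the "run on startup, then have ping query the latest state" idea, here is a rough Java sketch. It is not taken from either patch: the class name CachedHealthState, the ping() method, and the "UNKNOWN"/"ERROR" strings are all hypothetical, and the check itself is passed in as a plain Supplier so the sketch stays independent of how the script is actually run.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

/** Hypothetical sketch: caches the latest health check result so a ping only reads it. */
public class CachedHealthState {
    // Last known result; "UNKNOWN" until the first check has run.
    private final AtomicReference<String> latest = new AtomicReference<>("UNKNOWN");
    private final Supplier<String> check;
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "health-check");
                t.setDaemon(true); // do not keep the JVM alive just for health checks
                return t;
            });

    public CachedHealthState(Supplier<String> check) {
        this.check = check;
    }

    /** Runs the check once synchronously at startup, then repeats it with a fixed delay. */
    public void start(long intervalMillis) {
        refresh();
        scheduler.scheduleWithFixedDelay(
                this::refresh, intervalMillis, intervalMillis, TimeUnit.MILLISECONDS);
    }

    private void refresh() {
        try {
            latest.set(check.get());
        } catch (RuntimeException e) {
            // A failing check is recorded as an error state rather than killing the scheduler.
            latest.set("ERROR " + e.getMessage());
        }
    }

    /** Returns the most recently recorded state without running the check again. */
    public String ping() {
        return latest.get();
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```

The point of the design is that ping() never blocks on the script; it only reports whatever the last run recorded.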

>>     Also, could it be a bit of JavaScript instead of a shell script?

> Umm. Can we execute this from the TT directly? AFAIK, this is not possible, right? As
> of now, there is no plan to support anything other than a shell script.

You can run JS directly from a Java 6 JVM; the script engine is included in the box. I've been
wondering what it would take to support JS in MR jobs, but it could be handy for system health
checks too, though native scripts give you the edge for low-level system health checks.
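
As a minimal sketch of what running JS from the JVM looks like, the following uses the standard javax.script API (JSR 223) that ships with Java 6. The class name JsHealthCheck, the "NO_ENGINE" and "ERROR" return strings, and the sample script are made up for illustration and are not part of any patch. Note that the engine lookup can return null: later JDKs (15 and up) no longer bundle a JavaScript engine, so the sketch checks for that case.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

/** Hypothetical sketch: evaluates a JavaScript health check inside the JVM. */
public class JsHealthCheck {

    /** Evaluates the script and returns the value of its last expression as a string. */
    public static String run(String script) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("JavaScript");
        if (engine == null) {
            // No JavaScript engine is registered with this JVM.
            return "NO_ENGINE";
        }
        try {
            return String.valueOf(engine.eval(script));
        } catch (ScriptException e) {
            // Treat a script failure as an unhealthy result (an assumption of this sketch).
            return "ERROR " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Illustrative script only; a real check would inspect actual node state.
        System.out.println(run("var free = 10; free > 5 ? 'OK' : 'ERROR low disk'"));
    }
}
```

This runs in-process, so there is no fork per check, but it only sees what the JVM can see, which is where native scripts keep the edge.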

> Provide a node health check script and run it periodically to check the node health status
> ------------------------------------------------------------------------------------------
>                 Key: HADOOP-5478
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5478
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: mapred
>    Affects Versions: 0.20.0
>            Reporter: Aroop Maliakkal
>            Assignee: Vinod K V
> Hadoop must have some mechanism to find the health status of a node. It should run the
> health check script periodically and, if there are any errors, it should blacklist the node.
> This will be really helpful when we run static mapred clusters. Otherwise we may have to run
> some scripts/daemons periodically to find the node status and take it offline manually.

