hadoop-mapreduce-issues mailing list archives

From "Thomas Graves (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (MAPREDUCE-2451) Log the reason string of healthcheck script
Date Wed, 04 May 2011 13:43:03 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-2451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13028747#comment-13028747 ]

Thomas Graves commented on MAPREDUCE-2451:

For the 20.205 branch, here are the results of test-patch; they are based on branch-0.20-security.
 The -1s for javadoc and Eclipse both exist in branch-0.20-security without my change.

 I didn't include any tests because this is a trivial change to the log message.  Manual test
steps are below.

     [exec] -1 overall.
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec]     -1 tests included.  The patch doesn't appear to include any new or modified
     [exec]                         Please justify why no tests are needed for this patch.
     [exec]     -1 javadoc.  The javadoc tool appears to have generated 1 warning messages.
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec]     -1 Eclipse classpath. The patch causes the Eclipse classpath to differ from
the contents of the lib directories.

Manual test steps: 
- modify mapred-site.xml to set the mapred.healthChecker.script.path configuration.  Have
it point to a script like ~/health_check. 
- modify ~/health_check to contain just something like:
exit 0

- start the cluster and make sure everything is running fine.
- modify the ~/health_check script on a tasktracker and insert a line like: echo -n "ERROR
new string"  before the exit 0 line. 
- wait for tasktracker to send heartbeat back with updated health info
- look in the jobtracker log file and verify the log line looks similar to this. The fix for
this bug added the "Reason details : ERROR new string" bit.
2011-05-04 13:30:31,926 INFO org.apache.hadoop.mapred.JobTracker: Blacklisting tracker : yourhost.com
Reason for blacklisting is : NODE_UNHEALTHY Reason details : ERROR new string
- also verify the tasktracker got blacklisted.
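
The manual steps above can be sketched as a small shell script. This is only an illustration under assumptions: the temporary directory, the HEALTH_SCRIPT and JT_LOG paths, and the stand-in log file are hypothetical, and the script does not start a cluster or wait for a heartbeat. It uses printf rather than echo -n because echo -n is not portable across shells. As I understand it, the health checker treats script output beginning with ERROR as a sign that the node is unhealthy.

```shell
#!/bin/sh
# Sketch of the manual test; every path below is a hypothetical stand-in.
WORK_DIR=$(mktemp -d)
HEALTH_SCRIPT="$WORK_DIR/health_check"   # stands in for ~/health_check

# Healthy state: the script prints nothing and exits 0.
printf '%s\n' '#!/bin/sh' 'exit 0' > "$HEALTH_SCRIPT"
chmod +x "$HEALTH_SCRIPT"
healthy_output=$("$HEALTH_SCRIPT")

# Unhealthy state: print a string starting with ERROR before the exit 0 line.
printf '%s\n' '#!/bin/sh' 'printf "%s" "ERROR new string"' 'exit 0' > "$HEALTH_SCRIPT"
unhealthy_output=$("$HEALTH_SCRIPT")

# Verification: search a jobtracker log for the new "Reason details" text.
# JT_LOG is seeded here with the sample line from this comment; on a real
# cluster, point it at the actual jobtracker log file instead.
JT_LOG="$WORK_DIR/jobtracker.log"
cat > "$JT_LOG" <<'EOF'
2011-05-04 13:30:31,926 INFO org.apache.hadoop.mapred.JobTracker: Blacklisting tracker : yourhost.com Reason for blacklisting is : NODE_UNHEALTHY Reason details : ERROR new string
EOF
matches=$(grep -c 'Reason details : ERROR new string' "$JT_LOG")

echo "healthy='$healthy_output' unhealthy='$unhealthy_output' matches=$matches"
```

On a real cluster, the grep would be run against the jobtracker log only after the tasktracker has sent a heartbeat with the updated health info.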

> Log the reason string of healthcheck script
> -------------------------------------------
>                 Key: MAPREDUCE-2451
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2451
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: jobtracker
>    Affects Versions:
>            Reporter: Thomas Graves
>            Assignee: Thomas Graves
>            Priority: Trivial
>             Fix For:, 0.23.0
>         Attachments: MAPREDUCE-2451-20.205.patch, MAPREDUCE-2451-trunk.patch
>   Original Estimate: 24h
>  Remaining Estimate: 24h
> The information on why a specific TaskTracker got blacklisted is not stored anywhere.
> The jobtracker web ui will show the detailed reason string until the TT gets unblacklisted.
> After that it is lost.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
