hadoop-common-dev mailing list archives

From "Devaraj Das (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4305) repeatedly blacklisted tasktrackers should get declared dead
Date Tue, 02 Dec 2008 13:58:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12652368#action_12652368 ]

Devaraj Das commented on HADOOP-4305:

Some comments:
1. Format the brackets of the if condition properly in the incrementFaults method.
2. You should be able to use the same data structure for both potentiallyFaulty and blacklisted.
3. Add a comment for mapred.cluster.average.blacklist.threshold saying that it is there solely for
tuning purposes, and that once this feature has been tested on real clusters and an appropriate
value for the threshold has been found, this config might be taken out.
4. Check whether you can remove the initialContact flag and use only the restarted flag in the
heartbeat method. This is a more serious change, but it might be worthwhile for simplifying the
state machine.
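
To illustrate comment 2, here is a rough sketch of how a single map could cover both the
potentiallyFaulty and the blacklisted state. This is not code from the patch: the class
FaultTracker, the FaultInfo record, the isPotentiallyFaulty/isBlacklisted helpers and the fixed
fault limit are all hypothetical names and assumptions made for the example. Only
incrementFaults is a name that appears in the review, and its real signature may differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: one map holds both "potentially faulty" and "blacklisted" trackers. */
public class FaultTracker {

    /** Per-tracker record; the blacklisted flag replaces a second collection. */
    static final class FaultInfo {
        int numFaults;
        boolean blacklisted;
    }

    // Keyed by tracker host name (an assumption; the patch may key differently).
    private final Map<String, FaultInfo> faultyTrackers = new HashMap<String, FaultInfo>();

    // Assumed fixed limit, purely illustrative; the patch uses a configurable threshold.
    private final int blacklistAfterFaults;

    public FaultTracker(int blacklistAfterFaults) {
        this.blacklistAfterFaults = blacklistAfterFaults;
    }

    /** Records one fault and blacklists the tracker once the limit is reached. */
    public void incrementFaults(String hostName) {
        FaultInfo info = faultyTrackers.get(hostName);
        if (info == null) {
            info = new FaultInfo();
            faultyTrackers.put(hostName, info);
        }
        info.numFaults++;
        if (info.numFaults >= blacklistAfterFaults) {
            info.blacklisted = true;
        }
    }

    /** True if the tracker has recorded faults but is not blacklisted yet. */
    public boolean isPotentiallyFaulty(String hostName) {
        FaultInfo info = faultyTrackers.get(hostName);
        return info != null && !info.blacklisted;
    }

    /** True if the tracker has reached the fault limit. */
    public boolean isBlacklisted(String hostName) {
        FaultInfo info = faultyTrackers.get(hostName);
        return info != null && info.blacklisted;
    }
}
```

With one record per tracker, moving a tracker from potentially faulty to blacklisted is a flag
change rather than a transfer between two collections, which is the simplification the comment
is asking about.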

> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>         Attachments: patch-4305-0.18.txt, patch-4305-1.txt, patch-4305-2.txt
> When running a batch of jobs it often happens that the same tasktrackers are blacklisted
> again and again. This can slow job execution considerably, in particular, when tasks fail
> because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to declare
> them dead.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
