hadoop-common-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Updated: (HADOOP-4305) repeatedly blacklisted tasktrackers should get declared dead
Date Thu, 13 Nov 2008 11:49:44 GMT

     [ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel

Amareshwari Sriramadasu updated HADOOP-4305:

    Attachment: patch-4305-1.txt

Here is a patch with a proposed fix.
The patch does the following:
*  In JobTracker.finalizeJob(), adds the trackers blacklisted by the job to the potentially faulty list.
*  Moves a tracker from the potentially faulty list to the blacklisted trackers (across jobs) when the following conditions hold:
   ** its #blacklists exceeds mapred.max.tracker.blacklists (default value is 4),
   ** its #blacklists is 50% above the average #blacklists over the active and potentially faulty trackers,
   ** less than 50% of the cluster is blacklisted so far.
* Restarting a tracker makes it an active tracker again.
* After a day, the tracker is given another chance to run tasks.
* Adds #blacklisted_trackers to ClusterStatus.
* Updates the web UI to show the blacklisted trackers.
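
The blacklisting conditions listed above can be sketched roughly as follows. This is only an illustration of the rule as described in this message, not the code from patch-4305-1.txt: the class name, method name, parameters, and constants are all hypothetical, and the sketch assumes that all three conditions must hold together, that "exceed" means strictly greater than, and that "50% above the average" means more than 1.5 times the average.

```java
/**
 * Illustrative sketch only; NOT the code from patch-4305-1.txt.
 * All names below are hypothetical.
 */
final class TrackerBlacklistSketch {
    // Default value of mapred.max.tracker.blacklists, as stated in the message.
    static final int MAX_TRACKER_BLACKLISTS = 4;
    // "50% above the average" is read here as a factor of 1.5 (assumption).
    static final double ABOVE_AVERAGE_FACTOR = 1.5;
    // No further blacklisting once half of the cluster is blacklisted (assumption on the exact bound).
    static final double MAX_BLACKLISTED_FRACTION = 0.5;

    private TrackerBlacklistSketch() {}

    /**
     * @param trackerBlacklists   number of times this tracker was blacklisted by jobs
     * @param totalBlacklists     sum of blacklist counts over the active and potentially faulty trackers
     * @param consideredTrackers  number of active and potentially faulty trackers
     * @param blacklistedTrackers number of trackers already blacklisted across jobs
     * @param clusterSize         total number of trackers in the cluster
     * @return true if the tracker should move to the cross-job blacklist
     */
    static boolean shouldBlacklist(int trackerBlacklists,
                                   int totalBlacklists,
                                   int consideredTrackers,
                                   int blacklistedTrackers,
                                   int clusterSize) {
        if (consideredTrackers <= 0 || clusterSize <= 0) {
            return false; // nothing to compare against
        }
        double average = (double) totalBlacklists / consideredTrackers;
        boolean overLimit = trackerBlacklists > MAX_TRACKER_BLACKLISTS;
        boolean aboveAverage = trackerBlacklists > average * ABOVE_AVERAGE_FACTOR;
        boolean clusterHasRoom = blacklistedTrackers < clusterSize * MAX_BLACKLISTED_FRACTION;
        return overLimit && aboveAverage && clusterHasRoom;
    }
}
```

For example, under these assumptions a tracker with 5 blacklists in a 10-tracker cluster whose trackers have 10 blacklists in total (average 1.0) and no tracker blacklisted yet would be moved to the blacklist, while a tracker with exactly 4 blacklists would not.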

> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>         Attachments: patch-4305-1.txt
> When running a batch of jobs it often happens that the same tasktrackers are blacklisted again and again. This can slow job execution considerably, in particular, when tasks fail because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to declare them dead.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
