hadoop-common-dev mailing list archives

From "Amareshwari Sriramadasu (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4305) repeatedly blacklisted tasktrackers should get declared dead
Date Fri, 07 Nov 2008 08:34:45 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12645718#action_12645718 ]

Amareshwari Sriramadasu commented on HADOOP-4305:

bq. Runping's proposal can be encompassed into Amareshwari's proposal too.....reflect the
state of the TT (how many jobs was it running simultaneously) by incrementing the blacklist
counter with an appropriate weight.

I think you meant how many tasks the tracker was running simultaneously at the time of failure.
But in steady state, all the slots of the tracker will be occupied, so the blacklist weight
would be the same for all the trackers.

bq. The tracker is blacklisted across all jobs if #blacklists is X% above the average #blacklists,
over all the trackers.
Here, the average value may be very skewed, since very few trackers would be faulty. (In
my previous example, X would have to be 2500%.)
To avoid this skew, the tracker can be blacklisted across all jobs if
1. #blacklists is greater than *mapred.max.tasktracker.blacklists*, and
2. #blacklists is 50% above the average #blacklists, over all the trackers.
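
To make the two conditions concrete, here is a rough sketch in Python (not the actual
JobTracker code). The function name, the dictionary of per-tracker counts and the sample
cluster are all made up for illustration; max_blacklists stands in for the proposed
*mapred.max.tasktracker.blacklists* setting, and "50% above the average" is assumed to
mean count > 1.5 * average.

```python
def should_blacklist_across_jobs(tracker, blacklist_counts, max_blacklists,
                                 pct_above_avg=50.0):
    """Return True if `tracker` meets both proposed conditions.

    blacklist_counts: hypothetical dict mapping tracker name -> #blacklists.
    max_blacklists:   stand-in for mapred.max.tasktracker.blacklists.
    pct_above_avg:    assumed reading of "X% above the average".
    """
    count = blacklist_counts[tracker]
    average = sum(blacklist_counts.values()) / len(blacklist_counts)

    # Condition 1: absolute threshold, guards against a skewed (tiny) average.
    exceeds_absolute = count > max_blacklists
    # Condition 2: relative threshold, compares the tracker against its peers.
    exceeds_relative = count > average * (1 + pct_above_avg / 100.0)

    return exceeds_absolute and exceeds_relative


# Made-up cluster: 100 trackers, two of them repeatedly blacklisted,
# one healthy tracker with a single stray blacklisting.
counts = {"tt%d" % i: 0 for i in range(100)}
counts["tt98"] = 5
counts["tt99"] = 5
counts["tt0"] = 1

faulty = should_blacklist_across_jobs("tt98", counts, max_blacklists=4)
stray = should_blacklist_across_jobs("tt0", counts, max_blacklists=4)
```

In this made-up cluster the average is only 0.11, so the relative condition alone would
also flag tt0 with its single blacklisting; the absolute threshold is what keeps it in
service.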

> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
> When running a batch of jobs it often happens that the same tasktrackers are blacklisted
> again and again. This can slow job execution considerably, in particular, when tasks fail
> because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to declare
> them dead.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
