hadoop-common-dev mailing list archives

From "Vinod K V (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HADOOP-4305) repeatedly blacklisted tasktrackers should get declared dead
Date Wed, 22 Oct 2008 04:32:44 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-4305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12641714#action_12641714 ]

Vinod K V commented on HADOOP-4305:
-----------------------------------

Can we count this only from successful jobs?
bq. Even that wouldn't be enough. Blacklisting of a TT by a successful job would only mean
that this TT is not suitable for running this job. We can't generalize it to say that this
TT is not fit for running any job. The latter can be concluded only by monitoring TT health,
which should be done independently of job failures.

The proposal here doesn't seem to be the right fix. If we are concerned about batch jobs (similar
jobs), and about the same jobs being repeatedly submitted, we can address the issue by introducing
the concept of a batch and by linking batch jobs through something like a 'batch-id'. By default,
all jobs would belong to the default batch. We could then take this batch-id into account when
blacklisting TTs. Thoughts?
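
To make the batch-id idea concrete, here is a minimal sketch of how per-batch blacklist counts might be kept. Everything in it is an assumption for illustration only: the class BatchBlacklistTracker, its methods, the "default" batch name and the threshold parameter are hypothetical and are not existing Hadoop code or configuration.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical sketch only: none of these names exist in Hadoop.
 * Tracks, per batch-id, which jobs have blacklisted which tasktracker.
 */
class BatchBlacklistTracker {
    /** Assumed name of the batch used when a job specifies no batch-id. */
    static final String DEFAULT_BATCH = "default";

    /** Number of distinct jobs in one batch that must blacklist a TT. */
    private final int threshold;

    /** batch-id -> (tasktracker name -> ids of jobs that blacklisted it). */
    private final Map<String, Map<String, Set<String>>> blacklistings =
        new HashMap<>();

    BatchBlacklistTracker(int threshold) {
        this.threshold = threshold;
    }

    /** Records that a job of the given batch blacklisted a tasktracker. */
    void recordBlacklisting(String batchId, String jobId, String tracker) {
        String batch = (batchId == null) ? DEFAULT_BATCH : batchId;
        blacklistings
            .computeIfAbsent(batch, b -> new HashMap<>())
            .computeIfAbsent(tracker, t -> new HashSet<>())
            .add(jobId);
    }

    /**
     * True if enough distinct jobs of this batch blacklisted the tracker.
     * Blacklistings recorded for other batches are deliberately ignored.
     */
    boolean isBlacklistedForBatch(String batchId, String tracker) {
        String batch = (batchId == null) ? DEFAULT_BATCH : batchId;
        Map<String, Set<String>> perTracker = blacklistings.get(batch);
        if (perTracker == null) {
            return false;
        }
        Set<String> jobs = perTracker.get(tracker);
        return jobs != null && jobs.size() >= threshold;
    }
}
```

With a threshold of 2, a TT blacklisted by two jobs of batch "A" would be skipped for further jobs of batch "A" but would still be eligible for jobs of any other batch, which keeps the decision scoped to similar jobs rather than generalizing it to all jobs.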

> repeatedly blacklisted tasktrackers should get declared dead
> ------------------------------------------------------------
>
>                 Key: HADOOP-4305
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4305
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Christian Kunz
>            Assignee: Amareshwari Sriramadasu
>             Fix For: 0.20.0
>
>
> When running a batch of jobs it often happens that the same tasktrackers are blacklisted
> again and again. This can slow job execution considerably, in particular, when tasks fail
> because of timeout.
> It would make sense to no longer assign any tasks to such tasktrackers and to declare
> them dead.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

