tez-issues mailing list archives

From "Siddharth Seth (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (TEZ-1567) Avoid blacklisting nodes when the disable blacklisting threshold is about to be hit
Date Wed, 22 Oct 2014 20:05:34 GMT

    [ https://issues.apache.org/jira/browse/TEZ-1567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14180442#comment-14180442 ]

Siddharth Seth commented on TEZ-1567:
-------------------------------------

Uploading another patch, which makes the following changes:
- Removes shouldBlacklistNode.
- Instead, registerAsBlacklisted (renamed to registerBadNodeAndShouldBlacklist) returns a boolean
indicating whether the node should be blacklisted.
- Computes the ignore threshold once per node update.
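
A rough sketch of the second change, not the code in the patch. Only the method name
registerBadNodeAndShouldBlacklist comes from this comment; the class name, fields, percentage-based
threshold, and the exact "would hit the threshold" check are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical tracker; everything except the method name is illustrative. */
class BadNodeRegistrySketch {
  private final int totalNodes;              // assumed: cluster size known to the tracker
  private final int ignoreThresholdPercent;  // assumed: threshold expressed as a percentage
  private final Set<String> blacklisted = new HashSet<>();

  BadNodeRegistrySketch(int totalNodes, int ignoreThresholdPercent) {
    this.totalNodes = totalNodes;
    this.ignoreThresholdPercent = ignoreThresholdPercent;
  }

  /**
   * Registers a bad node and tells the caller whether it should be blacklisted.
   * Returns false when blacklisting this node would reach the ignore threshold.
   */
  boolean registerBadNodeAndShouldBlacklist(String nodeId) {
    if (blacklisted.contains(nodeId)) {
      return true; // already blacklisted, nothing changes
    }
    int wouldBe = blacklisted.size() + 1;
    boolean hitsThreshold = wouldBe * 100 >= ignoreThresholdPercent * totalNodes;
    if (hitsThreshold) {
      return false; // avoid blacklisting when the threshold is about to be hit
    }
    blacklisted.add(nodeId);
    return true;
  }

  int blacklistedCount() {
    return blacklisted.size();
  }
}
```

With a single boolean-returning call, the node does not need a separate shouldBlacklistNode query
before registering itself.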

bq. To maintain existing behavior of unblacklisting all nodes when the threshold is hit, if
2) fails to blacklist then the node can send an event to AMNodeImpl that triggers sendIngoreBlacklistingStateToNodes()
or executes the sendIngoreBlacklistingStateToNodes() code inside AMNodeImpl itself.

I think sending the ignoreBlacklisting event belongs in the AMNodeTracker, not in individual
nodes.
We shouldn't send the ignoreBlacklisting message to all nodes each time a node fails,
irrespective of the current state of this flag. computeIgnoreBlacklisting takes care of
that: it sends the event only when ignore blacklisting is enabled or disabled
(which also needs to be dealt with).
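
To illustrate the point about sending the event only on a state change, here is a minimal sketch.
It is not the Tez implementation: the class name, the percentage-based check, the notifyAllNodes
helper, and the sentEvents list are hypothetical stand-ins. Only computeIgnoreBlacklisting is a
name taken from this comment.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical tracker-side logic; only computeIgnoreBlacklisting is named in the comment. */
class IgnoreBlacklistingSketch {
  private final int ignoreThresholdPercent;   // assumed: threshold expressed as a percentage
  private boolean ignoreBlacklisting = false; // current state of the flag
  final List<Boolean> sentEvents = new ArrayList<>(); // records each broadcast, for illustration

  IgnoreBlacklistingSketch(int ignoreThresholdPercent) {
    this.ignoreThresholdPercent = ignoreThresholdPercent;
  }

  /** Intended to be called once per node update. */
  void computeIgnoreBlacklisting(int blacklistedNodes, int totalNodes) {
    boolean shouldIgnore = totalNodes > 0
        && blacklistedNodes * 100 >= ignoreThresholdPercent * totalNodes;
    // Broadcast only when the flag flips, not on every node failure.
    if (shouldIgnore != ignoreBlacklisting) {
      ignoreBlacklisting = shouldIgnore;
      notifyAllNodes(shouldIgnore);
    }
  }

  /** Stand-in for sending the ignore-blacklisting state to every node. */
  private void notifyAllNodes(boolean ignore) {
    sentEvents.add(ignore);
  }
}
```

Keeping the flag in the tracker means repeated failures while the flag is unchanged produce no
additional messages to the nodes.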

> Avoid blacklisting nodes when the disable blacklisting threshold is about to be hit
> -----------------------------------------------------------------------------------
>
>                 Key: TEZ-1567
>                 URL: https://issues.apache.org/jira/browse/TEZ-1567
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Siddharth Seth
>            Assignee: Siddharth Seth
>         Attachments: TEZ-1567.1.txt, TEZ-1567.2.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)