hadoop-yarn-issues mailing list archives

From "Zhijie Shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3025) Provide API for retrieving blacklisted nodes
Date Wed, 28 Jan 2015 00:18:35 GMT

    [ https://issues.apache.org/jira/browse/YARN-3025?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14294461#comment-14294461 ]

Zhijie Shen commented on YARN-3025:
-----------------------------------

IMHO, we've mixed two things together in the prior discussion:

1. First, the RM should provide an API that lets the AM retrieve the blacklisted nodes. That way,
when an AM crashes and restarts, it can sync with the RM to get the last state of the blacklisted
nodes from before the restart. This is feasible whether or not the RM persists the blacklisted
nodes into the state store, given that they're kept in memory in the scheduler.

2. Second, writing the blacklisted nodes into the state store is necessary only if we also
want the blacklisted nodes to be recoverable across an RM restart. This can be further
divided into two cases: 1) If we want the blacklisted nodes to be recoverable after an
ordinary RM restart, we can just write the latest blacklisted nodes of running apps to
the state store once, when the RM stops. 2) If we want the blacklisted nodes to be
recoverable after an RM crash, we can update the latest blacklisted nodes in the state
store upon each change, as suggested by Tsuyoshi.
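
To make the split concrete, here is a rough sketch of how the two concerns could be separated. It is only an illustration under assumptions, not the actual RM or scheduler code: BlacklistStore, InMemoryBlacklistStore, PersistPolicy and AppBlacklist are hypothetical names, and the store interface is a stand-in, not the real RM state store API. Only updateBlacklist and getBlacklistedNodes come from the issue description.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the RM state store (not the real API). */
interface BlacklistStore {
  void save(String appId, List<String> nodes);
  List<String> load(String appId);
}

/** In-memory store, present only so that the sketch is runnable. */
class InMemoryBlacklistStore implements BlacklistStore {
  private final Map<String, List<String>> data = new HashMap<>();

  public synchronized void save(String appId, List<String> nodes) {
    data.put(appId, new ArrayList<>(nodes));
  }

  public synchronized List<String> load(String appId) {
    return new ArrayList<>(data.getOrDefault(appId, new ArrayList<>()));
  }
}

/** The two persistence cases from point 2 (hypothetical enum). */
enum PersistPolicy { ON_STOP, ON_CHANGE }

/** Hypothetical per-application blacklist kept in memory on the RM side. */
class AppBlacklist {
  private final String appId;
  private final BlacklistStore store;
  private final PersistPolicy policy;
  // Sorted set, so the returned list has a stable order.
  private final Set<String> blacklist = new TreeSet<>();

  AppBlacklist(String appId, BlacklistStore store, PersistPolicy policy) {
    this.appId = appId;
    this.store = store;
    this.policy = policy;
    // On (re)start, recover whatever was persisted earlier, if anything.
    blacklist.addAll(store.load(appId));
  }

  public synchronized void updateBlacklist(List<String> blacklistAdditions,
      List<String> blacklistRemovals) {
    boolean changed = false;
    if (blacklistAdditions != null) {
      changed |= blacklist.addAll(blacklistAdditions);
    }
    if (blacklistRemovals != null) {
      changed |= blacklist.removeAll(blacklistRemovals);
    }
    // Case 2): write on every change, so the state survives a crash.
    if (changed && policy == PersistPolicy.ON_CHANGE) {
      store.save(appId, new ArrayList<>(blacklist));
    }
  }

  /** Point 1: served from memory, independent of any persistence. */
  public synchronized List<String> getBlacklistedNodes() {
    return new ArrayList<>(blacklist);
  }

  /** Case 1): a single write on a graceful stop. */
  public synchronized void stop() {
    store.save(appId, new ArrayList<>(blacklist));
  }
}
```

In this sketch a restarted AM would only need getBlacklistedNodes(), which works under either policy. The policy only decides whether the list also survives an RM restart (ON_STOP) or an RM crash (ON_CHANGE), at the cost of one store write per change in the latter case.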



> Provide API for retrieving blacklisted nodes
> --------------------------------------------
>
>                 Key: YARN-3025
>                 URL: https://issues.apache.org/jira/browse/YARN-3025
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Ted Yu
>
> We have the following method which updates blacklist:
> {code}
>   public synchronized void updateBlacklist(List<String> blacklistAdditions,
>       List<String> blacklistRemovals) {
> {code}
> Upon AM failover, there should be an API which returns the blacklisted nodes so that
> the new AM can make consistent decisions.
> The new API can be:
> {code}
>   public synchronized List<String> getBlacklistedNodes()
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
