hadoop-yarn-issues mailing list archives

From "Sunil G (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4386) refreshNodesGracefully() looks at active RMNode list for recommissioning decommissioned nodes
Date Tue, 08 Dec 2015 15:17:11 GMT

    [ https://issues.apache.org/jira/browse/YARN-4386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15046954#comment-15046954 ]

Sunil G commented on YARN-4386:
-------------------------------

Hi [~kshukla],
Sorry for the late reply.
bq. Unless there are 2 refreshNodes done in parallel such that the first deactivateNodeTransition has not finished and the other refreshNodes is also trying to do the same transition
Since the transitions happen under a write lock, this should not happen.
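
To illustrate the write lock point, here is a minimal standalone sketch. It is not the actual RMNodeImpl code: the class {{SerializedTransitionSketch}}, its fields, and the method names are hypothetical, and the real state machine is reduced to a single guarded assignment. It only shows that two concurrent requests for the same transition are serialized by the write lock, so the deactivation step runs once.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for a node whose state transitions are guarded by a write lock.
class SerializedTransitionSketch {
    enum State { DECOMMISSIONING, DECOMMISSIONED }

    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private State state = State.DECOMMISSIONING;
    private int deactivations = 0; // how many times the deactivation step actually ran

    // Assumed shape of an event handler: take the write lock, then transition.
    void handleDecommission() {
        lock.writeLock().lock();
        try {
            // The second caller sees the state left by the first and does nothing.
            if (state != State.DECOMMISSIONED) {
                state = State.DECOMMISSIONED;
                deactivations++;
            }
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Simulates two refreshNodes calls racing on the same node.
    static int runTwoConcurrentRefreshes() {
        SerializedTransitionSketch node = new SerializedTransitionSketch();
        Thread first = new Thread(node::handleDecommission);
        Thread second = new Thread(node::handleDecommission);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return node.deactivations;
    }

    public static void main(String[] args) {
        System.out.println("deactivations = " + runTwoConcurrentRefreshes());
    }
}
```

Whichever thread acquires the lock second finds the node already DECOMMISSIONED, so the deactivation is not repeated.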

I have one suggestion here.
I feel you could mark a node for GRACEFUL DECOMMISSION and ensure that the node is in the
DECOMMISSIONING state (you can try firing an event to RMNodeImpl directly to do this). Then
invoke {{refreshNodesGracefully}} and verify that a RECOMMISSION event is raised to the
dispatcher. Similarly, mark a node as DECOMMISSIONED, invoke {{refreshNodesGracefully}}, and
verify that the RECOMMISSION event is *NOT* raised. In the second case, it will not enter the
*for* loop, but I feel this will clearly cover our case here, though it's not direct.
Please correct me if I am wrong.
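
A rough standalone sketch of the two checks suggested above, using a simplified model rather than the real ResourceManager classes. Everything here is an assumption made for illustration: {{RecommissionSketch}}, the two maps standing in for getRMNodes() and getInactiveRMNodes(), the string node ids, and the list that records RECOMMISSION events instead of a dispatcher. The include/exclude host checks that the quoted loop elides are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model of the behaviour under discussion; not Hadoop code.
class RecommissionSketch {
    enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // Hypothetical stand-ins for getRMNodes() and getInactiveRMNodes().
    final Map<String, NodeState> activeNodes = new LinkedHashMap<>();
    final Map<String, NodeState> inactiveNodes = new LinkedHashMap<>();
    // Node ids for which a RECOMMISSION event would have been dispatched.
    final List<String> recommissionEvents = new ArrayList<>();

    // Mirrors the loop quoted in the issue: only the active map is scanned.
    void refreshNodesGracefully() {
        for (Map.Entry<String, NodeState> entry : activeNodes.entrySet()) {
            NodeState state = entry.getValue();
            if (state == NodeState.DECOMMISSIONING || state == NodeState.DECOMMISSIONED) {
                recommissionEvents.add(entry.getKey());
            }
        }
    }

    // Case 1: a DECOMMISSIONING node is still active, so RECOMMISSION is raised.
    static List<String> decommissioningCase() {
        RecommissionSketch rm = new RecommissionSketch();
        rm.activeNodes.put("host1:1234", NodeState.DECOMMISSIONING);
        rm.refreshNodesGracefully();
        return rm.recommissionEvents;
    }

    // Case 2: a DECOMMISSIONED node lives only in the inactive map,
    // so the loop never sees it and no event is raised.
    static List<String> decommissionedCase() {
        RecommissionSketch rm = new RecommissionSketch();
        rm.inactiveNodes.put("host2:1234", NodeState.DECOMMISSIONED);
        rm.refreshNodesGracefully();
        return rm.recommissionEvents;
    }

    public static void main(String[] args) {
        System.out.println("DECOMMISSIONING node -> " + decommissioningCase());
        System.out.println("DECOMMISSIONED node  -> " + decommissionedCase());
    }
}
```

In a real test the event list would be replaced by a check on the dispatcher's event handler, but the two assertions would have the same shape.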

> refreshNodesGracefully() looks at active RMNode list for recommissioning decommissioned nodes
> ---------------------------------------------------------------------------------------------
>
>                 Key: YARN-4386
>                 URL: https://issues.apache.org/jira/browse/YARN-4386
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: graceful
>    Affects Versions: 3.0.0
>            Reporter: Kuhu Shukla
>            Assignee: Kuhu Shukla
>            Priority: Minor
>         Attachments: YARN-4386-v1.patch
>
>
> In refreshNodesGracefully(), during recommissioning, the entry set from getRMNodes(), which has only active nodes (RUNNING, DECOMMISSIONING etc.), is used for checking 'decommissioned' nodes, which are present only in the getInactiveRMNodes() map.
> {code}
> for (Entry<NodeId, RMNode> entry:rmContext.getRMNodes().entrySet()) { .........................
>  // Recommissioning the nodes
>         if (entry.getValue().getState() == NodeState.DECOMMISSIONING
>             || entry.getValue().getState() == NodeState.DECOMMISSIONED) {
>           this.rmContext.getDispatcher().getEventHandler()
>               .handle(new RMNodeEvent(nodeId, RMNodeEventType.RECOMMISSION));
>         }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
