hadoop-yarn-issues mailing list archives

From "Daniel Zhi (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
Date Wed, 27 Apr 2016 19:11:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15260765#comment-15260765 ]

Daniel Zhi commented on YARN-4676:
----------------------------------

To restate my understanding of the current behavior (without this patch), in case I misread
the code: it appears to me that, regardless of whether RM work-preserving restart is enabled,
upon RM restart NodesListManager creates a pseudo RMNodeImpl for each excluded node and
DECOMMISSIONs the node right away. There may have been an intention to resume DECOMMISSIONING,
but I don't see the current code actually doing that.
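
A minimal sketch of the restart behavior described above, for illustration only. It does not use the actual YARN classes: `RestartExcludeSketch`, `NodeState`, and `statesAfterRestart` are hypothetical names, and the sketch assumes that "DECOMMISSION the node right away" means the node ends up in the DECOMMISSIONED state without passing through DECOMMISSIONING.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RestartExcludeSketch {
    // Simplified stand-in for the node states named in the comment (hypothetical).
    public enum NodeState { RUNNING, DECOMMISSIONING, DECOMMISSIONED }

    // Models the described restart path: every host on the exclude list gets a
    // placeholder entry that is marked DECOMMISSIONED immediately. Whatever state
    // the node had before the restart is not consulted, so a graceful
    // decommission that was in progress is not resumed.
    public static Map<String, NodeState> statesAfterRestart(List<String> excludedHosts) {
        Map<String, NodeState> states = new LinkedHashMap<>();
        for (String host : excludedHosts) {
            states.put(host, NodeState.DECOMMISSIONED);
        }
        return states;
    }

    public static void main(String[] args) {
        // Both hosts are reported as DECOMMISSIONED after the simulated restart.
        System.out.println(statesAfterRestart(Arrays.asList("host1", "host2")));
    }
}
```

Resuming the graceful path would instead require knowing which excluded nodes were still DECOMMISSIONING before the restart, which this sketch, like the behavior described above, does not track.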

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
>                 Key: YARN-4676
>                 URL: https://issues.apache.org/jira/browse/YARN-4676
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daniel Zhi
>            Assignee: Daniel Zhi
>              Labels: features
>         Attachments: GracefulDecommissionYarnNode.pdf, YARN-4676.004.patch, YARN-4676.005.patch,
YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch, YARN-4676.009.patch, YARN-4676.010.patch,
YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch
>
>
> DecommissioningNodeWatcher inside ResourceTrackingService tracks the status of DECOMMISSIONING
nodes automatically and asynchronously after a client/admin makes the graceful decommission
request. It tracks each DECOMMISSIONING node's status to decide when, after all running containers
on the node have completed, the node will be transitioned into the DECOMMISSIONED state. NodesListManager
detects and handles include and exclude list changes to kick off decommission or recommission
as necessary.
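
The tracking rule in the description above can be sketched roughly as follows. This is a simplified illustration, not the patch's code: `DecommissionWatcherSketch`, `startDecommission`, and `update` are hypothetical names, and the only condition modeled is the one stated in the description (all running containers on the node have completed).

```java
import java.util.HashMap;
import java.util.Map;

public class DecommissionWatcherSketch {
    // Simplified stand-in for the two states named in the description (hypothetical).
    public enum NodeState { DECOMMISSIONING, DECOMMISSIONED }

    private final Map<String, NodeState> states = new HashMap<>();

    // Called when a graceful decommission request is made for a host.
    public void startDecommission(String host) {
        states.put(host, NodeState.DECOMMISSIONING);
    }

    // Called on each status update from a node (assumed here to carry the
    // number of containers still running). A DECOMMISSIONING node moves to
    // DECOMMISSIONED only once no containers remain. Returns the node's
    // state after the update, or null for a host that is not being tracked.
    public NodeState update(String host, int runningContainers) {
        NodeState current = states.get(host);
        if (current == NodeState.DECOMMISSIONING && runningContainers == 0) {
            states.put(host, NodeState.DECOMMISSIONED);
        }
        return states.get(host);
    }

    public static void main(String[] args) {
        DecommissionWatcherSketch watcher = new DecommissionWatcherSketch();
        watcher.startDecommission("host1");
        System.out.println(watcher.update("host1", 2)); // containers still running
        System.out.println(watcher.update("host1", 0)); // all containers completed
    }
}
```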



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
