hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-3212) RMNode State Transition Update with DECOMMISSIONING state
Date Mon, 14 Sep 2015 13:13:46 GMT

    [ https://issues.apache.org/jira/browse/YARN-3212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14743495#comment-14743495 ]

Junping Du commented on YARN-3212:

Thanks [~leftnoteasy] for the review and comments!
bq. 1. Why shutdown a "decommissioning" NM if it is doing heartbeat. Should we allow it continue
heartbeat, since RM needs to know about container finished / killed information.
We don't shut down a "decommissioning" NM. On the contrary, we differentiate decommissioning
nodes from the other nodes that get false from the nodesListManager.isValidNode() check, so a
decommissioning node can keep running instead of being decommissioned.
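
To illustrate the check described above, here is a minimal sketch. It is not the actual RM code: the class, the enum, and the two host sets are hypothetical stand-ins, and isValidNode() here only models the exclude list.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch of the heartbeat decision; not the real RM classes.
class HeartbeatCheckSketch {
    enum NodeAction { NORMAL, SHUTDOWN }

    // Stand-in for nodesListManager.isValidNode(): false for excluded hosts.
    static boolean isValidNode(String host, Set<String> excludedHosts) {
        return !excludedHosts.contains(host);
    }

    // A node that fails the validity check is told to shut down,
    // unless it is currently in the DECOMMISSIONING state.
    static NodeAction onHeartbeat(String host, Set<String> excludedHosts,
                                  Set<String> decommissioningHosts) {
        boolean invalid = !isValidNode(host, excludedHosts);
        if (invalid && !decommissioningHosts.contains(host)) {
            return NodeAction.SHUTDOWN;
        }
        return NodeAction.NORMAL;
    }

    public static void main(String[] args) {
        Set<String> excluded = new HashSet<>(Arrays.asList("nm1", "nm2"));
        Set<String> decommissioning = new HashSet<>(Arrays.asList("nm1"));
        System.out.println(onHeartbeat("nm1", excluded, decommissioning)); // NORMAL
        System.out.println(onHeartbeat("nm2", excluded, decommissioning)); // SHUTDOWN
        System.out.println(onHeartbeat("nm3", excluded, decommissioning)); // NORMAL
    }
}
```

In this sketch nm1 and nm2 both fail the validity check, but only nm2 is asked to shut down because nm1 is still draining.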

bq. 2. Do we have timeout of graceful decomission? Which will update a node to "DECOMMISSIONED"
after the timeout.
This was discussed in the umbrella JIRA (https://issues.apache.org/jira/browse/YARN-914?focusedCommentId=14314653&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14314653),
and we decided to track the timeout in the CLI instead of the RM. The CLI patch (YARN-3225) also shows that.
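
A rough sketch of what tracking the timeout on the client side could look like is below. This is not the YARN-3225 patch: the NodeAdmin interface and all of its method names are hypothetical, and the clock is passed in only to keep the sketch easy to exercise.

```java
import java.util.function.LongSupplier;

// Hypothetical sketch of client-side timeout tracking; not the YARN-3225 code.
class CliTimeoutSketch {
    // Illustrative admin interface; these method names are assumptions.
    interface NodeAdmin {
        void startGracefulDecommission(String host);
        boolean isDecommissioned(String host);
        void forceDecommission(String host);
    }

    // Returns true if the node finished decommissioning on its own,
    // false if the client had to force it after the timeout expired.
    static boolean decommissionWithTimeout(NodeAdmin admin, String host,
            long timeoutMillis, long pollMillis, LongSupplier clock) {
        admin.startGracefulDecommission(host);
        long deadline = clock.getAsLong() + timeoutMillis;
        while (clock.getAsLong() < deadline) {
            if (admin.isDecommissioned(host)) {
                return true;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for " + host, e);
            }
        }
        // The RM never sees the timeout itself, only this final request.
        admin.forceDecommission(host);
        return false;
    }
}
```

A caller would typically pass System::currentTimeMillis as the clock. The point of the design is that the RM only handles the graceful request and the final forced request, while the waiting happens in the client.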

bq. 3. If I understand correct, decommissioning is another running state, except: We cannot
allocate any new containers to it.
Exactly. Another difference is that the available resource should get updated each time a
running container finishes.

bq. If answer to question #2 is no, I suggest to rename RMNodeEventType.DECOMISSION_WITH_TIMEOUT
to GRACEFUL_DECOMISSION, since it doesn't have a "real" timeout.
As replied above, we support the timeout in the CLI. DECOMISSION_WITH_TIMEOUT sounds
clearer compared with the old DECOMMISSION event. Thoughts?

bq. Why this is need? .addTransition(NodeState.DECOMMISSIONING, NodeState.DECOMMISSIONING,
RMNodeEventType.DECOMMISSION_WITH_TIMEOUT, new DecommissioningNodeTransition(NodeState.DECOMMISSIONING))
Without this transition, our state machine would throw an InvalidStateTransitionException,
which does not sound right for a normal operation.
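
The effect of the missing self-transition can be shown with a small table-driven state machine. This sketch does not use the real YARN state machine classes: the enums are cut down, the RECOMMISSION event name is an assumption, and IllegalStateException stands in for the exception mentioned above.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified state machine; not the RMNodeImpl implementation.
class NodeStateMachineSketch {
    enum State { RUNNING, DECOMMISSIONING, DECOMMISSIONED }
    enum Event { DECOMMISSION_WITH_TIMEOUT, DECOMMISSION, RECOMMISSION }

    private final Map<State, Map<Event, State>> table = new EnumMap<>(State.class);
    private State current = State.RUNNING;

    NodeStateMachineSketch addTransition(State from, State to, Event on) {
        table.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class)).put(on, to);
        return this;
    }

    // Applies the event, or throws if no transition is registered for it.
    State handle(Event event) {
        Map<Event, State> row = table.get(current);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException("Invalid event " + event + " at " + current);
        }
        current = next;
        return current;
    }

    // Builds the machine with or without the DECOMMISSIONING self-transition.
    static NodeStateMachineSketch build(boolean withSelfTransition) {
        NodeStateMachineSketch sm = new NodeStateMachineSketch()
            .addTransition(State.RUNNING, State.DECOMMISSIONING, Event.DECOMMISSION_WITH_TIMEOUT)
            .addTransition(State.DECOMMISSIONING, State.DECOMMISSIONED, Event.DECOMMISSION)
            .addTransition(State.DECOMMISSIONING, State.RUNNING, Event.RECOMMISSION);
        if (withSelfTransition) {
            sm.addTransition(State.DECOMMISSIONING, State.DECOMMISSIONING,
                Event.DECOMMISSION_WITH_TIMEOUT);
        }
        return sm;
    }

    public static void main(String[] args) {
        NodeStateMachineSketch sm = build(true);
        sm.handle(Event.DECOMMISSION_WITH_TIMEOUT);
        // A repeated request is accepted and the node stays DECOMMISSIONING.
        System.out.println(sm.handle(Event.DECOMMISSION_WITH_TIMEOUT));
    }
}
```

With build(false), the second DECOMMISSION_WITH_TIMEOUT event throws, which is the behavior the extra transition is meant to avoid when a user repeats the request on the same node.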

bq. Should we simply ignore the DECOMMISSION_WITH_TIMEOUT event?
No. The RM should be aware of this event so that it can later make precise updates to the available resource, etc.

bq. Is there specific considerations that transfer UNHEALTHY to DECOMISSIONED when DECOMMISSION_WITH_TIMEOUT
received? Is it better to transfer it to DECOMISSIONING since it has some containers running
on it?
I don't have a strong preference in this case. However, my previous consideration was that the
UNHEALTHY event comes from the machine monitor, which indicates that the node is not quite suitable
for containers to keep running, while DECOMMISSION_WITH_TIMEOUT comes from a user who prefers to
decommission a batch of nodes without affecting apps/containers if they are currently running *normally*.
So I think getting the node decommissioned directly is the simpler way until we have more operational
experience with this new feature. I have a similar view on the discussion above about an UNHEALTHY event
on a decommissioning node
(https://issues.apache.org/jira/browse/YARN-3212?focusedCommentId=14693360&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14693360).
Maybe we can revisit this later?

bq. One suggestion of how to handle node update to scheduler: I think you can add a field
"isDecomissioning" to NodeUpdateSchedulerEvent, and scheduler can do all updates except allocate
Thanks for the good suggestion. YARN-3223 will handle balancing the NM's total resource
and used resource (so the available resource is always 0). Using a new scheduler event this
way could be one option to keep the NM resource balanced. There are other options too, so
I think we can move the discussion to that JIRA.
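
As one possible reading of the suggestion combined with the balancing idea above, here is a small sketch. It is not the YARN-3223 design: the Node class, the memory-only accounting, and the onNodeUpdate() signature are all hypothetical.

```java
// Hypothetical sketch of a node update with an isDecommissioning flag.
class NodeUpdateSketch {
    // Simplified node record that tracks memory only.
    static final class Node {
        int totalMb;
        int usedMb;

        Node(int totalMb, int usedMb) {
            this.totalMb = totalMb;
            this.usedMb = usedMb;
        }

        int availableMb() {
            return totalMb - usedMb;
        }
    }

    // Releases finished containers, then either balances a decommissioning
    // node (total == used, so available is 0) or tries one new allocation.
    // Returns true only if a new container was allocated.
    static boolean onNodeUpdate(Node node, int releasedMb,
                                boolean isDecommissioning, int requestMb) {
        if (releasedMb < 0 || releasedMb > node.usedMb) {
            throw new IllegalArgumentException("bad releasedMb: " + releasedMb);
        }
        node.usedMb -= releasedMb;
        if (isDecommissioning) {
            node.totalMb = node.usedMb; // nothing left to hand out
            return false;
        }
        if (node.availableMb() >= requestMb) {
            node.usedMb += requestMb;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Node draining = new Node(8192, 4096);
        System.out.println(onNodeUpdate(draining, 1024, true, 1024)); // false
        System.out.println(draining.availableMb());                   // 0
    }
}
```

The scheduler still processes finished containers for a decommissioning node, so the RM keeps learning about completed work, but the allocation step is skipped.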

> RMNode State Transition Update with DECOMMISSIONING state
> ---------------------------------------------------------
>                 Key: YARN-3212
>                 URL: https://issues.apache.org/jira/browse/YARN-3212
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Junping Du
>            Assignee: Junping Du
>         Attachments: RMNodeImpl - new.png, YARN-3212-v1.patch, YARN-3212-v2.patch, YARN-3212-v3.patch, YARN-3212-v4.1.patch, YARN-3212-v4.patch, YARN-3212-v5.1.patch, YARN-3212-v5.patch
> As proposed in YARN-914, a new state of “DECOMMISSIONING” will be added and can be entered from the “running” state, triggered by a new event - “decommissioning”.
> This new state can transition to the “decommissioned” state on Resource_Update if there are no running apps on this NM or the NM reconnects after restart. It can also do so when it receives a DECOMMISSIONED event (after timeout from CLI).
> In addition, it can go back to “running” if the user decides to cancel the previous decommission by calling recommission on the same node. The reaction to other events is similar to RUNNING

This message was sent by Atlassian JIRA