hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-4676) Automatic and Asynchronous Decommissioning Nodes Status Tracking
Date Tue, 02 Aug 2016 18:34:20 GMT

    [ https://issues.apache.org/jira/browse/YARN-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15404555#comment-15404555 ]

Junping Du commented on YARN-4676:
----------------------------------

Thanks for sharing these details, Daniel. 
bq. In the typical EMR cluster scenario, daemon like NM will be configured to auto-start if killed/shutdown, however RM will reject such NM if it appear in the exclude list.
In today's YARN (community version), if the RM rejects an NM's registration request, the NM gets terminated directly. I think we should follow the existing behavior, otherwise it could introduce compatibility issues.
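For reference, a minimal self-contained sketch (toy types and names, not the actual NodeStatusUpdater or ResourceTrackerService code) of the existing register-then-terminate behavior described above:
{code:java}
import java.util.Set;

// Toy model of the community behavior: an excluded NM gets a SHUTDOWN action
// at registration and terminates instead of looping on re-registration.
// All names here are illustrative, not the real YARN classes.
public class RegisterRejectSketch {
  enum NodeAction { NORMAL, SHUTDOWN }

  // "RM side": reject registration when the host is on the exclude list.
  static NodeAction register(String host, Set<String> excludeList) {
    return excludeList.contains(host) ? NodeAction.SHUTDOWN : NodeAction.NORMAL;
  }

  public static void main(String[] args) {
    NodeAction action = register("node-1", Set.of("node-1"));
    if (action == NodeAction.SHUTDOWN) {
      // Existing behavior: the NM daemon exits here; it does not keep retrying
      // until the host is recommissioned or terminated.
      System.out.println("NM received SHUTDOWN from RM, terminating daemon");
    }
  }
}
{code}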


bq. 1, DECOMMISSIONED NM, will try to register to RM but will be rejected. It continue such loop until either: 1) the host being terminated; 2) the host being recommissioned. It was likely the DECOMMISSIONED->LOST transition is defensive coding — without it invalid event throws.
I can understand we want the scale-in and scale-out capability here for cluster elasticity. However, I am not sure how much benefit we gain from this hack: it sounds like we are only saving the NM daemon start time, which is a few seconds in most cases and trivial compared with container launching and running. Am I missing some other benefit here?

bq. It was likely the DECOMMISSIONED->LOST transition is defensive coding — without it invalid event throws.
As I mentioned above, we should stop watching DECOMMISSIONED nodes; it is unnecessary for the RM to consume resources taking care of them. If an EXPIRE event gets thrown in your case, then we should check what is going wrong there (a race condition, etc.) and fix it at the source.
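As a rough illustration of that clean-up, a minimal sketch (hypothetical class and method names, not the actual DecommissioningNodeWatcher) that drops a node from the watcher's map the moment it reaches DECOMMISSIONED, so no later EXPIRE can be raised against it:
{code:java}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch only: stop tracking a node once it is DECOMMISSIONED so
// the RM spends no further resources (and no liveness EXPIRE) on it.
public class DecommissionTrackingSketch {
  enum State { DECOMMISSIONING, DECOMMISSIONED }

  private final Map<String, State> tracked = new ConcurrentHashMap<>();

  void onNodeUpdate(String nodeId, State state) {
    if (state == State.DECOMMISSIONED) {
      tracked.remove(nodeId);     // unwatch: nothing left for the RM to track
      return;
    }
    tracked.put(nodeId, state);   // keep tracking DECOMMISSIONING nodes only
  }

  public static void main(String[] args) {
    DecommissionTrackingSketch w = new DecommissionTrackingSketch();
    w.onNodeUpdate("node-1", State.DECOMMISSIONING);
    w.onNodeUpdate("node-1", State.DECOMMISSIONED);
    System.out.println("still tracked: " + w.tracked.keySet());   // prints []
  }
}
{code}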

bq. CLEANUP_CONTAINER and CLEANUP_APP were for sure added to prevent otherwise invalid event exception at the DECOMMISSIONED state
I can understand we want to get rid of any annoying invalid-transition noise in our logs. However, similar to what I mentioned above, we need to find out where we send these events and check whether these cases are valid or are bugs caused by race conditions, etc. Even if we are really sure some of the events are hard to get rid of, we should make the transition here empty, as any logic in the transition is unnecessary.
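For illustration, such an empty transition could look roughly like the fragment below, appended to RMNodeImpl's existing StateMachineFactory chain; the NodeState/RMNodeEventType names follow today's RMNodeImpl, but the exact shape is an assumption, not the proposed patch:
{code:java}
// Fragment only, to show the idea of an intentionally empty transition:
// CLEANUP_CONTAINER / CLEANUP_APP arriving at DECOMMISSIONED is swallowed
// without executing any logic, instead of throwing an invalid-event exception.
.addTransition(NodeState.DECOMMISSIONED, NodeState.DECOMMISSIONED,
    EnumSet.of(RMNodeEventType.CLEANUP_CONTAINER, RMNodeEventType.CLEANUP_APP),
    new SingleArcTransition<RMNodeImpl, RMNodeEvent>() {
      @Override
      public void transition(RMNodeImpl node, RMNodeEvent event) {
        // deliberately empty: the node is already DECOMMISSIONED
      }
    })
{code}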

> Automatic and Asynchronous Decommissioning Nodes Status Tracking
> ----------------------------------------------------------------
>
>                 Key: YARN-4676
>                 URL: https://issues.apache.org/jira/browse/YARN-4676
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>    Affects Versions: 2.8.0
>            Reporter: Daniel Zhi
>            Assignee: Daniel Zhi
>              Labels: features
>         Attachments: GracefulDecommissionYarnNode.pdf, GracefulDecommissionYarnNode.pdf,
YARN-4676.004.patch, YARN-4676.005.patch, YARN-4676.006.patch, YARN-4676.007.patch, YARN-4676.008.patch,
YARN-4676.009.patch, YARN-4676.010.patch, YARN-4676.011.patch, YARN-4676.012.patch, YARN-4676.013.patch,
YARN-4676.014.patch, YARN-4676.015.patch, YARN-4676.016.patch, YARN-4676.017.patch, YARN-4676.018.patch,
YARN-4676.019.patch
>
>
> YARN-4676 implements an automatic, asynchronous and flexible mechanism to gracefully decommission YARN nodes. After the user issues the refreshNodes request, the ResourceManager automatically evaluates the status of all affected nodes and kicks off decommission or recommission actions. The RM asynchronously tracks the container and application status of DECOMMISSIONING nodes so that each node is decommissioned as soon as it is ready. A decommissioning timeout at individual-node granularity is supported and can be dynamically updated. The mechanism naturally supports multiple independent graceful decommissioning “sessions”, where each one involves a different set of nodes with different timeout settings. Such support is ideal and necessary for graceful decommission requests issued by external cluster management software instead of a human.
> DecommissioningNodeWatcher inside ResourceTrackerService tracks DECOMMISSIONING node status automatically and asynchronously after the client/admin makes the graceful decommission request. It tracks the status of each DECOMMISSIONING node to decide when, after all running containers on the node have completed, the node will be transitioned into the DECOMMISSIONED state. NodesListManager detects and handles include- and exclude-list changes to kick off decommission or recommission as necessary.
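As a rough illustration of the readiness decision described above (a DECOMMISSIONING node moves to DECOMMISSIONED once its running containers drain or its per-node timeout elapses), a minimal sketch with hypothetical names, not the actual DecommissioningNodeWatcher:
{code:java}
// Minimal sketch: a DECOMMISSIONING node is ready for DECOMMISSIONED once its
// running containers have drained or its per-node timeout has elapsed.
public class ReadinessSketch {
  static boolean readyToDecommission(int runningContainers,
                                     long decommissionStartMillis,
                                     long timeoutMillis,
                                     long nowMillis) {
    boolean drained = runningContainers == 0;
    boolean timedOut = nowMillis - decommissionStartMillis >= timeoutMillis;
    return drained || timedOut;
  }

  public static void main(String[] args) {
    long start = 0L;
    // 3 containers still running, 5s into a 60s timeout: keep waiting.
    System.out.println(readyToDecommission(3, start, 60_000, start + 5_000));  // false
    // All containers finished: decommission immediately, before the timeout.
    System.out.println(readyToDecommission(0, start, 60_000, start + 5_000));  // true
  }
}
{code}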



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

