hadoop-yarn-issues mailing list archives

From "Jian He (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-2641) Decommission nodes on -refreshNodes instead of next NM-RM heartbeat
Date Mon, 13 Oct 2014 22:32:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-2641?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14170127#comment-14170127 ]

Jian He commented on YARN-2641:
-------------------------------

Looks good to me too, thanks Zhihai!

> Decommission nodes on -refreshNodes instead of next NM-RM heartbeat
> -------------------------------------------------------------------
>
>                 Key: YARN-2641
>                 URL: https://issues.apache.org/jira/browse/YARN-2641
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.5.0
>            Reporter: zhihai xu
>            Assignee: zhihai xu
>         Attachments: YARN-2641.000.patch, YARN-2641.001.patch, YARN-2641.002.patch, YARN-2641.003.patch
>
>
> Improve node decommission latency in the RM.
> Currently, a node is decommissioned only after the RM receives a nodeHeartbeat from the Node Manager. The node heartbeat interval is configurable; the default value is 1 second.
> It would be better to decommission nodes during the RM refresh (NodesListManager) instead of on nodeHeartbeat (ResourceTrackerService).
> The delay becomes a much more serious issue in the following case:
> After the RM is refreshed (refreshNodes), if the NM to be decommissioned is killed before it sends a heartbeat to the RM, the RMNode will never be decommissioned in the RM. The RMNode will only expire in the RM after "yarn.nm.liveness-monitor.expiry-interval-ms" (default value 10 minutes).
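
The difference between the two decommission triggers described above can be sketched as a small simulation. This is only an illustration under stated assumptions: DecommissionSketch, NodeState, and every method name below are hypothetical and are not the YARN classes or the attached patch. The sketch only models when the state change happens.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of an RM node table; not real YARN code.
class DecommissionSketch {
    enum NodeState { RUNNING, DECOMMISSIONED }

    private final Map<String, NodeState> nodes = new HashMap<>();
    private final Set<String> excluded = new HashSet<>();
    // true  = proposed behavior: decommission as part of the refresh
    // false = behavior described as current: wait for the next heartbeat
    private final boolean decommissionOnRefresh;

    DecommissionSketch(boolean decommissionOnRefresh) {
        this.decommissionOnRefresh = decommissionOnRefresh;
    }

    void register(String host) {
        nodes.put(host, NodeState.RUNNING);
    }

    // Models a refreshNodes call that reloads the exclude list.
    void refreshNodes(Set<String> newExcluded) {
        excluded.clear();
        excluded.addAll(newExcluded);
        if (decommissionOnRefresh) {
            for (String host : excluded) {
                // Only known nodes change state; unknown hosts are ignored.
                nodes.computeIfPresent(host, (h, s) -> NodeState.DECOMMISSIONED);
            }
        }
    }

    // Models an NM heartbeat arriving at the RM.
    void heartbeat(String host) {
        if (excluded.contains(host)) {
            nodes.computeIfPresent(host, (h, s) -> NodeState.DECOMMISSIONED);
        }
    }

    NodeState stateOf(String host) {
        return nodes.get(host);
    }
}
```

With decommissionOnRefresh set to false, an excluded node that is killed before its next heartbeat stays RUNNING in this model, which mirrors the case where the RMNode only goes away once the liveness monitor expires it. With the flag set to true, the node is DECOMMISSIONED as soon as refreshNodes returns.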



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
