hadoop-yarn-issues mailing list archives

From "Ming Ma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
Date Sat, 20 Dec 2014 00:55:13 GMT

    [ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14254382#comment-14254382 ]

Ming Ma commented on YARN-914:
------------------------------

[~djp], thanks for working on this.

It looks like we are going to use YARN-291 and thus the "drain the state" approach, instead
of the more complicated "migrate the state" approach. So YARN will reduce the capacity of
the node as part of the decommission process until all of its map outputs are fetched or until
all the applications the node touches have completed? In addition, it would be interesting
to understand how you handle long-running jobs.
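
To make the question concrete, here is a minimal sketch of the "drain the state" idea as I understand it. It is only an illustration under stated assumptions: the class DrainSketch, the Node type, and all of its methods are hypothetical names, not YARN or YARN-291 APIs, and treating "all map outputs fetched" and "all applications completed" as alternative exit conditions is my reading of the proposal rather than a confirmed design.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical illustration of draining a node; none of this is YARN code. */
public class DrainSketch {

    /** Simplified stand-in for a node manager that is being decommissioned. */
    static class Node {
        private int capacity;      // containers the scheduler may place on this node
        private int running;       // containers currently executing on this node
        private boolean draining;  // true once graceful decommission has started
        private final Set<String> unfetchedMapOutputs = new HashSet<>();
        private final Set<String> activeApps = new HashSet<>();

        Node(int capacity) {
            this.capacity = capacity;
        }

        /** Places a container if capacity allows; returns false otherwise. */
        boolean tryStartContainer(String appId) {
            if (running >= capacity) {
                return false;
            }
            running++;
            activeApps.add(appId);
            return true;
        }

        /** Starts draining: capacity drops to current usage, so nothing new is placed. */
        void startDrain() {
            draining = true;
            capacity = running;
        }

        /** Records a finished container; mapOutputId is null for non-map containers. */
        void containerFinished(String mapOutputId) {
            running--;
            if (draining) {
                capacity = running; // keep shrinking capacity while draining
            }
            if (mapOutputId != null) {
                unfetchedMapOutputs.add(mapOutputId);
            }
        }

        void mapOutputFetched(String mapOutputId) {
            unfetchedMapOutputs.remove(mapOutputId);
        }

        void appCompleted(String appId) {
            activeApps.remove(appId);
        }

        /** Assumed exit condition: idle, and either outputs fetched or apps done. */
        boolean isDrained() {
            return draining && running == 0
                    && (unfetchedMapOutputs.isEmpty() || activeApps.isEmpty());
        }
    }

    public static void main(String[] args) {
        Node node = new Node(4);
        node.tryStartContainer("app_1");
        node.startDrain();
        System.out.println(node.tryStartContainer("app_2")); // false: no capacity left
        node.containerFinished("map_0");
        System.out.println(node.isDrained());                // false: map_0 not fetched yet
        node.mapOutputFetched("map_0");
        System.out.println(node.isDrained());                // true
    }
}
```

Note that in this sketch a long-running application with a container on the node keeps isDrained() false for as long as that container runs, which is the case the question about long-running jobs is getting at.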

FYI, https://issues.apache.org/jira/browse/YARN-1996 will drain containers of unhealthy nodes.


> Support graceful decommission of nodemanager
> --------------------------------------------
>
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>
> When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable
> to minimize the impact to running applications.
> Currently if an NM is decommissioned, all running containers on the NM need to be rescheduled
> on other NMs. Furthermore, for finished map tasks, if their map outputs are not fetched by
> the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a node manager.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
