hadoop-yarn-issues mailing list archives

From "Karthik Kambatla (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-914) (Umbrella) Support graceful decommission of nodemanager
Date Sat, 19 Dec 2015 01:23:47 GMT

    [ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065119#comment-15065119 ]

Karthik Kambatla commented on YARN-914:

bq. On the other hand, there are additional details and component level designs that the JIRA
design document does not necessarily discuss or touch.
Are you able to share these details in an "augmented" design doc? Agreeing on the design would
greatly help with review/commits later.

As far as implementation goes, I recommend creating subtasks as you see fit. Note that
smaller chunks of code are easier to review. Also, since you have already implemented it,
can you comment on how many of the code changes touch frequently updated parts? If not many,
it might make sense to develop on a branch and merge it to trunk.

> (Umbrella) Support graceful decommission of nodemanager
> -------------------------------------------------------
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: graceful
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf, Gracefully Decommission
of NodeManager (v2).pdf, GracefullyDecommissionofNodeManagerv3.pdf
> When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable
to minimize the impact on running applications.
> Currently if an NM is decommissioned, all running containers on the NM need to be rescheduled
on other NMs. Furthermore, for finished map tasks, if their map outputs have not been fetched by
the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a node manager.

