hadoop-yarn-issues mailing list archives

From "Junping Du (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-914) Support graceful decommission of nodemanager
Date Mon, 16 Feb 2015 18:42:14 GMT

    [ https://issues.apache.org/jira/browse/YARN-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14323124#comment-14323124

Junping Du commented on YARN-914:

Thanks [~jlowe] for review and comments!
Sounds good. Will update it later.

bq. We should remove its available (not total) resources from the cluster then continue to
remove available resources as containers complete on that node. 
That's a very good point. Yes, we should update resources in this way.
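To illustrate the accounting being agreed on here, below is a minimal sketch (hypothetical class and method names, not the actual RMNodeImpl/ClusterMetrics code): when a node enters decommissioning, only its *available* (unused) resources are removed from the cluster total; then, as each container completes on that node, the freed resources are removed as well rather than returned to the pool.

```java
// Illustrative sketch of the resource accounting discussed above.
// All names here are hypothetical stand-ins, not YARN's real API.
public class DecommissioningAccounting {
    static class Resource {
        int memoryMb;
        int vcores;
        Resource(int m, int v) { memoryMb = m; vcores = v; }
        void subtract(Resource r) { memoryMb -= r.memoryMb; vcores -= r.vcores; }
    }

    // Aggregate cluster capacity as seen by the scheduler.
    Resource clusterTotal = new Resource(4096, 8);

    // Called when a node transitions RUNNING -> DECOMMISSIONING:
    // remove only the node's currently available resources, not its total,
    // since running containers are still using the rest.
    void onStartDecommission(Resource nodeTotal, Resource nodeUsed) {
        Resource available = new Resource(
            nodeTotal.memoryMb - nodeUsed.memoryMb,
            nodeTotal.vcores - nodeUsed.vcores);
        clusterTotal.subtract(available);
    }

    // Called for each container that completes on the decommissioning node:
    // its resources leave the cluster instead of becoming available again.
    void onContainerFinished(Resource containerResource) {
        clusterTotal.subtract(containerResource);
    }
}
```

Once the last container finishes, the node's entire capacity has been drained from the cluster total and the node can move to DECOMMISSIONED.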

bq. As for the UI changes, initial thought is that decommissioning nodes should still show
up in the active nodes list since they are still running containers. A separate decommissioning
tab to filter for those nodes would be nice, although I suppose users can also just use the
jquery table to sort/search for nodes in that state from the active nodes list if it's too
crowded to add yet another node state tab (or maybe get rid of some effectively dead tabs
like the reboot state tab).
Makes sense. Will add it to the proposal; we can discuss more details on the UI JIRA later.

bq. For the NM restart open question, this should no longer be an issue now that the NM is unaware
of graceful decommission.

bq. For the AM dealing with being notified of decommissioning, again I think this should just
be treated like a strict preemption for the short term. IMHO all the AM needs to know is that
the RM is planning on taking away those containers, and what the AM should do about it is
similar whether the reason for removal is preemption or decommissioning.
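The point in the quote above can be sketched as follows (hypothetical types and names, not YARN's real AM protocol): from the AM's perspective, the reason a container is being taken away does not change its reaction, so decommissioning notifications can flow through the same handler as strict preemption.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch: the AM treats a decommissioning notice exactly
// like strict preemption. Names are hypothetical stand-ins.
public class AmPreemptionHandler {
    enum Reason { PREEMPTION, DECOMMISSIONING }

    List<String> checkpointed = new ArrayList<>();

    // The RM tells the AM which containers it plans to take away.
    // Whether the reason is preemption or decommissioning, the AM does
    // the same thing: save any in-progress work, then release the containers.
    void onContainersMarkedForRemoval(List<String> containerIds, Reason reason) {
        for (String id : containerIds) {
            checkpointed.add(id); // e.g. checkpoint task state before release
        }
    }
}
```

Keeping a single code path for both reasons is what makes the short-term "treat it as strict preemption" approach cheap for AM implementers.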

bq. Back to the long running services delaying decommissioning concern, does YARN even know
the difference between a long-running container and a "normal" container? 
I'm afraid not currently. YARN-1039 should be a starting point for making that differentiation.

bq. If it doesn't, how is it supposed to know a container is not going to complete anytime
soon? Even a "normal" container could run for many hours. It seems to me the first thing we
would need before worrying about this scenario is the ability for YARN to know/predict the
expected runtime of containers.
I think predicting the expected runtime of containers could be hard in the YARN case. However,
can we generally assume that long-running service containers are expected to run for a very long
(or unbounded) time? If so, notifying the AM to preempt LRS containers makes more sense here
than waiting for a timeout, doesn't it?
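The trade-off suggested above could be expressed as a simple per-container policy, sketched below with hypothetical names (this presumes the container-type differentiation that YARN-1039 would introduce):

```java
// Hypothetical sketch of the drain policy discussed above: if YARN could
// tag a container as long-running, the RM would ask the AM to preempt it
// immediately instead of waiting for it to complete on its own.
public class DrainPolicy {
    enum Action { WAIT_FOR_COMPLETION, NOTIFY_AM_TO_PREEMPT }

    static Action actionFor(boolean isLongRunningService) {
        // A long-running service container is not expected to finish by
        // itself, so waiting for it would stall the decommission forever.
        return isLongRunningService ? Action.NOTIFY_AM_TO_PREEMPT
                                    : Action.WAIT_FOR_COMPLETION;
    }
}
```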

bq. There's still an open question about tracking the timeout RM side instead of NM side.
Sounds like the NM side is not going to be pursued at this point, and we're going with no
built-in timeout support in YARN for the short-term.
That was unclear at the beginning of the discussion but is much clearer now; I will remove this part.

> Support graceful decommission of nodemanager
> --------------------------------------------
>                 Key: YARN-914
>                 URL: https://issues.apache.org/jira/browse/YARN-914
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.0.4-alpha
>            Reporter: Luke Lu
>            Assignee: Junping Du
>         Attachments: Gracefully Decommission of NodeManager (v1).pdf, Gracefully Decommission of NodeManager (v2).pdf
> When NMs are decommissioned for non-fault reasons (capacity change etc.), it's desirable to minimize the impact to running applications.
> Currently if a NM is decommissioned, all running containers on the NM need to be rescheduled on other NMs. Furthermore, for finished map tasks, if their map outputs have not been fetched by the reducers of the job, these map tasks will need to be rerun as well.
> We propose to introduce a mechanism to optionally gracefully decommission a node manager.

This message was sent by Atlassian JIRA
