hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1292) De-link container life cycle from the process it runs
Date Thu, 10 Oct 2013 22:34:44 GMT

    [ https://issues.apache.org/jira/browse/YARN-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13792063#comment-13792063 ]

Bikas Saha commented on YARN-1292:

This can be achieved in a backwards-compatible manner as follows:
1) The StartContainer request will have a new flag that says whether the container is attached
to a process or not. The default value is true for back-compat.
2) If the above flag is false, then the container is completed on the NM only when
a) the RM terminates the container (this already happens today), or
b) the AM calls StopContainer on it (this is already supported).
The main change in the NM would be to not trigger end of container, i.e. to keep the container
in a running state, when there is no process associated with the container.
3) Create a new API called startProcess() that can be used to launch a new process in a container.
For the first cut, the NM can disallow starting a process while another process is already running.
This API would be secured using the existing AMNM token.

No changes should be needed in the RM, since the NM will continue to report this container
as running to the RM. This should be a fairly localised, NM-only change.
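
To make the proposed life cycle concrete, here is a minimal sketch in plain Java of how the
NM-side rules above could fit together. It is an illustration only and not YARN code: the class
DetachedContainerSketch, the State enum, the attachedToProcess field and the method names are
all hypothetical, and token checking with the AMNM token is left out.

```java
/** Hypothetical model of the proposal; none of these names are real YARN APIs. */
final class DetachedContainerSketch {

    enum State { RUNNING, COMPLETE }

    static final class Container {
        // Models the proposed StartContainer flag (true = today's behaviour).
        private final boolean attachedToProcess;
        private State state = State.RUNNING;
        // Command of the currently running process, or null if there is none.
        private String runningProcess;

        Container(boolean attachedToProcess) {
            this.attachedToProcess = attachedToProcess;
        }

        /**
         * Models the proposed startProcess() call. For the first cut it is
         * rejected while another process is running or once the container
         * has completed. For an attached container, the first call stands
         * in for the process launched by StartContainer.
         */
        boolean startProcess(String command) {
            if (state != State.RUNNING || runningProcess != null) {
                return false;
            }
            runningProcess = command;
            return true;
        }

        /** Called when the OS process exits. */
        void onProcessExit() {
            runningProcess = null;
            if (attachedToProcess) {
                // Today's behaviour: process exit ends the container.
                state = State.COMPLETE;
            }
            // Detached: the container stays RUNNING with no process.
        }

        /** Models RM termination or an AM StopContainer call. */
        void stop() {
            runningProcess = null;
            state = State.COMPLETE;
        }

        State state() {
            return state;
        }
    }

    public static void main(String[] args) {
        Container c = new Container(false);   // detached container
        c.startProcess("regionserver-v1");
        c.onProcessExit();                    // crash or upgrade
        System.out.println("after exit: " + c.state());
        c.startProcess("regionserver-v2");    // reuse the same container
        c.stop();                             // AM calls StopContainer
        System.out.println("after stop: " + c.state());
    }
}
```

In this sketch only onProcessExit() depends on the flag, which matches the expectation that the
change stays local to the NM: a detached container keeps reporting RUNNING until the RM or the
AM explicitly ends it.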

> De-link container life cycle from the process it runs
> -----------------------------------------------------
>                 Key: YARN-1292
>                 URL: https://issues.apache.org/jira/browse/YARN-1292
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.1.1-beta
>            Reporter: Bikas Saha
> Currently, a container is considered done when its OS process exits. This makes it cumbersome
for apps to reuse containers for different processes. Long-running daemons may
want to run in the same containers as their previous versions. For example, if an HBase region server
crashes or is upgraded, it would want to restart in the same container, where everything it needs
would already be warm and ready.

This message was sent by Atlassian JIRA
