hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1040) De-link container life cycle from an Allocation
Date Sun, 20 Mar 2016 20:15:34 GMT

    [ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15203496#comment-15203496 ]

Vinod Kumar Vavilapalli commented on YARN-1040:

Thanks for the document, [~asuresh]!

 - Our comments crossed on Feb 25, so I didn't see yours.
 - Looking at the doc, I can see why it gives the impression of a redesign, but it is less
of a redesign, and more of adding new functionality that needs new semantics.
 - Clearly the new naming makes it look like a lot of new changes for the apps, but that is
the reality (for apps that want to use this new feature)!
 - We do make most of our decisions on JIRA. We can continue the discussion here. If need
be, sure, we can send out a note on the dev lists.

So, with that out of the way, let's step back and look at the semantics first and foremost,
and leave the discussions about renames and the expected level of changes for later.

h4. APIs
There are big differences between the two proposals w.r.t. the APIs. Even though your earlier
proposal appears to assume that this can be made a localized change in the NM-side APIs,
there are newer semantics that mandate new (and/or modified) APIs on both the AM-NM and RM-AM
interactions. A couple that come to mind:
 - *Allocation/container release*: We need two separate mechanisms from AM to RM for (a) releasing
allocations wholesale (and thereby killing all running containers inside) and (b) killing one or
more containers running inside an allocation *directly* at the RM. The latter is an existing feature,
needed because the app either doesn't want to open N connections to N nodes in the cluster, or
simply because the NM is no longer accessible, permanently or in the interim.
 - *Allocation/container exit notifications*: The AMs will further be interested in two separate
back-notifications from the RM: (a) has the allocation itself been released completely by the
platform, say due to preemption, or (b) has one of the containers running inside the allocation
exited, so that the AM has to act on it? Remember that this is simply a disambiguation of our
existing container-exit notification mechanism.
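
To make the two release paths and the two notification kinds concrete, here is a minimal sketch in Java. It is an illustration only: `AllocationBook`, `EventType`, `releaseAllocation`, and `killContainer` are hypothetical names invented for this example and are not existing or proposed YARN APIs. Whether a wholesale release should also report a per-container exit is left open here; the sketch reports only the allocation-level event.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical RM-side bookkeeping; none of these names are real YARN APIs.
class AllocationBook {
    // The two notification kinds an AM would need to tell apart.
    enum EventType { CONTAINER_EXITED, ALLOCATION_RELEASED }

    static final class Event {
        final EventType type;
        final String id; // container ID or allocation ID, depending on type
        Event(EventType type, String id) { this.type = type; this.id = id; }
    }

    private final Map<String, Set<String>> containersByAllocation = new LinkedHashMap<>();
    private final List<Event> pendingEvents = new ArrayList<>();

    void addAllocation(String allocationId) {
        containersByAllocation.putIfAbsent(allocationId, new LinkedHashSet<>());
    }

    void startContainer(String allocationId, String containerId) {
        Set<String> containers = containersByAllocation.get(allocationId);
        if (containers == null) {
            throw new IllegalArgumentException("Unknown allocation: " + allocationId);
        }
        containers.add(containerId);
    }

    // (a) Release the allocation wholesale; every container inside goes with it.
    boolean releaseAllocation(String allocationId) {
        if (containersByAllocation.remove(allocationId) == null) {
            return false;
        }
        pendingEvents.add(new Event(EventType.ALLOCATION_RELEASED, allocationId));
        return true;
    }

    // (b) Kill a single container directly at the RM; the allocation stays usable.
    boolean killContainer(String allocationId, String containerId) {
        Set<String> containers = containersByAllocation.get(allocationId);
        if (containers == null || !containers.remove(containerId)) {
            return false;
        }
        pendingEvents.add(new Event(EventType.CONTAINER_EXITED, containerId));
        return true;
    }

    boolean isAllocated(String allocationId) {
        return containersByAllocation.containsKey(allocationId);
    }

    // Events the AM would pick up on its next heartbeat (drained on read).
    List<Event> drainEvents() {
        List<Event> out = new ArrayList<>(pendingEvents);
        pendingEvents.clear();
        return out;
    }
}
```

In this sketch, killing a container leaves `isAllocated` true for its allocation, while releasing the allocation removes it and produces a distinct event type, which is the disambiguation described above.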

h4. Internals
Inside the RM too, the state-machine of the allocation itself is different from
the containers' life-cycle. For example, the containers' life-cycle determines the completion
notifications that we send across to the AMs, and only the allocation life-cycle impacts scheduling.
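
As a rough illustration of that split, the following Java sketch keeps the two life-cycles apart: a container completing produces a notification but does not give capacity back to the scheduler, whereas releasing the allocation does. `NodeView` and all of its methods are hypothetical names made up for this example; they do not correspond to actual RM classes.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical scheduler-side view of one node; not real YARN internals.
class NodeView {
    private final int totalMb;
    // allocationId -> reserved MB; only this map affects scheduling.
    private final Map<String, Integer> liveAllocations = new HashMap<>();
    // containerId -> allocationId; only this map drives completion notifications.
    private final Map<String, String> runningContainers = new HashMap<>();
    private int completionNotifications = 0;

    NodeView(int totalMb) { this.totalMb = totalMb; }

    // Capacity the scheduler may still hand out on this node.
    int availableMb() {
        int used = 0;
        for (int mb : liveAllocations.values()) {
            used += mb;
        }
        return totalMb - used;
    }

    boolean allocate(String allocationId, int mb) {
        if (mb <= 0 || mb > availableMb() || liveAllocations.containsKey(allocationId)) {
            return false;
        }
        liveAllocations.put(allocationId, mb);
        return true;
    }

    boolean launchContainer(String allocationId, String containerId) {
        if (!liveAllocations.containsKey(allocationId)
                || runningContainers.containsKey(containerId)) {
            return false;
        }
        runningContainers.put(containerId, allocationId);
        return true;
    }

    // Container life-cycle: notify the AM, but leave the reservation untouched.
    boolean containerCompleted(String containerId) {
        if (runningContainers.remove(containerId) == null) {
            return false;
        }
        completionNotifications++;
        return true;
    }

    // Allocation life-cycle: this is what returns capacity to the scheduler.
    boolean releaseAllocation(String allocationId) {
        if (liveAllocations.remove(allocationId) == null) {
            return false;
        }
        runningContainers.values().removeIf(allocationId::equals);
        return true;
    }

    int completionNotifications() { return completionNotifications; }
}
```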

h4. Compatibility for existing apps
Both what is proposed in the doc and the way I originally described it are definitely
backwards compatible. Existing applications do not need a single line of change. Only newer
versions of applications that desire to use the new feature have to use newer APIs - something
that is not different from any other core YARN feature at all.

h4. Changes for apps that want to use the new feature
Even in your proposal, an app/framework that desires to use the new feature has to make non-trivial
changes in the AM to use this feature correctly:
 - generating containerIDs
 - managing the list of containers running inside an allocation
 - managing the outstanding unused portion of an allocation, and incrementally launching more
and more containers till the allocation is full
 - Containers running under non-reusable allocations do not need an explicit signal to the
RM for clean up - apps can simply stop the container on the NM and everything else gets automatically
taken care of. Apps that start using the new feature, on the other hand, will now *have* to also
explicitly release allocations outside of the life-cycle of the containers.
 - We can optionally add auxiliary flags to inform NMs to auto-reap the allocation when the
last container dies (only for apps that are okay with this), but either way the apps need
changes to do this as they intend it.
 - Apps will also have to react differently to container-exit notifications and allocation-released/preempted notifications
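
The AM-side bookkeeping listed above can be sketched roughly as follows, in Java. This is a sketch under stated assumptions, not a proposed client API: the class `ReusableAllocation`, its methods, the memory-only resource model, and the `<allocationId>_c<n>` container ID scheme are all hypothetical and chosen only for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical AM-side helper; not part of any YARN client library.
class ReusableAllocation {
    private final String allocationId;
    private final int capacityMb;
    // containerId -> MB used by that container inside this allocation.
    private final Map<String, Integer> running = new LinkedHashMap<>();
    private int nextContainerSeq = 1;
    private boolean released = false;

    ReusableAllocation(String allocationId, int capacityMb) {
        this.allocationId = allocationId;
        this.capacityMb = capacityMb;
    }

    // Outstanding unused portion of the allocation.
    int unusedMb() {
        if (released) {
            return 0;
        }
        int used = 0;
        for (int mb : running.values()) {
            used += mb;
        }
        return capacityMb - used;
    }

    // The AM generates the container ID itself (illustrative ID scheme).
    // Returns null when the request does not fit in the unused portion.
    String launch(int mb) {
        if (released) {
            throw new IllegalStateException("Allocation already released: " + allocationId);
        }
        if (mb <= 0 || mb > unusedMb()) {
            return null;
        }
        String containerId = allocationId + "_c" + nextContainerSeq++;
        running.put(containerId, mb);
        return containerId;
    }

    // Container-exit notification: space frees up, the allocation is still held.
    void onContainerExit(String containerId) {
        running.remove(containerId);
    }

    // Allocation-released/preempted notification, or an explicit release by the app.
    void markReleased() {
        running.clear();
        released = true;
    }

    boolean isReleased() { return released; }
    int runningCount() { return running.size(); }
}
```

Note that `onContainerExit` and `markReleased` are separate entry points: the first frees room for another launch in the same allocation, while the second ends the allocation regardless of what is running inside it.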

Given the points above, I don't think we can get away with just an NM side API change.

Depending on how much we have to change the APIs, I am willing to go either way on the degree
of renames in the API surface area. Inside the code base though, I think we are better off
calling things what they are.

> De-link container life cycle from an Allocation
> -----------------------------------------------
>                 Key: YARN-1040
>                 URL: https://issues.apache.org/jira/browse/YARN-1040
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
>         Attachments: YARN-1040-rough-design.pdf
> The AM should be able to exec >1 process in a container, rather than have the NM automatically
> release the container when the single process exits.
> This would let an AM restart a process on the same container repeatedly, which for HBase
> would offer locality on a restarted region server.
> We may also want the ability to exec multiple processes in parallel, so that something
> could be run in the container while a long-lived process was already running. This can be
> useful in monitoring and reconfiguring the long-lived process, as well as shutting it down.

This message was sent by Atlassian JIRA
