hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1040) De-link container life cycle from the process and add ability to execute multiple processes in the same long-lived container
Date Wed, 24 Feb 2016 21:52:18 GMT

    [ https://issues.apache.org/jira/browse/YARN-1040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163864#comment-15163864 ]

Arun Suresh commented on YARN-1040:

Thanks for the feedback [~bikassaha]

I understand we might not want to place artificial constraints on apps; I was just trying to
scope out the bare minimum effort required specifically for long-running container upgrades. That
said, I'm all for going the whole hog (allowing 0 or 1+ processes) if that turns out to be easier.

Some thoughts specifically with regard to container upgrade:
# If we allow multiple processes per container, we might need {{startProcess()}} to
return a *processId*, which the AM can then use to address the process
in subsequent calls like {{stopProcess()}}. This might complicate the AM's state, so maybe
we can leave it out of the first cut.
# With regard to resource re-localization, as per YARN-4597, we are exploring localization as a service
and possibly re-localization on the fly.
# I like the idea of clubbing multiple API calls in the same RPC. But should *upgrade* be
a first-class semantic, or should it be expressed as a {{localize v2, start v2, stop v1}}
API combo? One reason to distinguish the two is the case where both versions are up at the
same time until the new version stabilizes: in an upgrade, the Container should probably
be allowed to go up to 2x its allocated resource limit for a period of time, but in the case where
we are just starting 2 processes, this should probably not be allowed.
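
To make the distinction in the last point concrete, here is a rough in-memory sketch in Java. It is not YARN code: {{ContainerSketch}}, its methods, and the memory-only accounting are all hypothetical names and simplifications made up for illustration, and localization (YARN-4597) is left out entirely. It only shows how a first-class {{upgrade()}} could tolerate a temporary 2x overshoot while a plain second {{startProcess()}} is rejected.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of a long-lived container; none of these names exist in YARN.
final class ContainerSketch {
    private final int allocatedMb;
    // processId -> memory (MB) claimed by that process
    private final Map<Integer, Integer> running = new LinkedHashMap<>();
    private int nextId = 1;
    private boolean upgrading = false;

    ContainerSketch(int allocatedMb) {
        this.allocatedMb = allocatedMb;
    }

    private int usedMb() {
        return running.values().stream().mapToInt(Integer::intValue).sum();
    }

    // Assumption: the limit is doubled only while an upgrade is in flight.
    private int limitMb() {
        return upgrading ? 2 * allocatedMb : allocatedMb;
    }

    // Returns a processId the AM can use in later calls such as stopProcess().
    int startProcess(int memMb) {
        if (usedMb() + memMb > limitMb()) {
            throw new IllegalStateException("would exceed container limit of " + limitMb() + " MB");
        }
        int id = nextId++;
        running.put(id, memMb);
        return id;
    }

    void stopProcess(int processId) {
        if (running.remove(processId) == null) {
            throw new IllegalArgumentException("unknown processId " + processId);
        }
    }

    // First-class upgrade: start v2 alongside v1 under the relaxed limit, then stop v1.
    int upgrade(int oldProcessId, int newMemMb) {
        if (!running.containsKey(oldProcessId)) {
            throw new IllegalArgumentException("unknown processId " + oldProcessId);
        }
        upgrading = true;
        try {
            int newId = startProcess(newMemMb);
            stopProcess(oldProcessId);
            return newId;
        } finally {
            upgrading = false; // the 2x allowance ends with the upgrade
        }
    }

    int runningCount() {
        return running.size();
    }
}
```

In this sketch the overlap window is instantaneous; a real upgrade would presumably keep both versions up until the new one stabilizes, which is exactly why the allowance would need a time bound.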

> De-link container life cycle from the process and add ability to execute multiple processes
in the same long-lived container
> ----------------------------------------------------------------------------------------------------------------------------
>                 Key: YARN-1040
>                 URL: https://issues.apache.org/jira/browse/YARN-1040
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: nodemanager
>    Affects Versions: 3.0.0
>            Reporter: Steve Loughran
> The AM should be able to exec >1 process in a container, rather than have the NM automatically
release the container when the single process exits.
> This would let an AM restart a process on the same container repeatedly, which for HBase
would offer locality on a restarted region server.
> We may also want the ability to exec multiple processes in parallel, so that something
could be run in the container while a long-lived process was already running. This can be
useful in monitoring and reconfiguring the long-lived process, as well as shutting it down.
