hadoop-yarn-issues mailing list archives

From "Arun Suresh (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (YARN-5501) Container Pooling in YARN
Date Sat, 18 Feb 2017 17:00:45 GMT

[ https://issues.apache.org/jira/browse/YARN-5501?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15873247#comment-15873247 ]

Arun Suresh edited comment on YARN-5501 at 2/18/17 5:00 PM:
------------------------------------------------------------

Great discussion.

I would suggest we make the following simplifying assumptions for an initial cut.

h4. 1. Concept of detach and attach.
From the doc, it looks like "detach" implies removing the pre-initialized container from
the pool and "attach" refers to associating an app with a pooled container. It might be simpler
if we treat the operation as atomic. In that sense, we can make do with just an "attach"
or "lease" operation, where a pre-initialized container is associated with an app.
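The atomic attach/lease could look roughly like this sketch (purely illustrative; `ContainerPool`, `attach`, and the container ids are hypothetical names, not YARN APIs):

```python
import threading
from collections import deque

class ContainerPool:
    """Illustrative pool of pre-initialized containers (names hypothetical)."""
    def __init__(self, container_ids):
        self._lock = threading.Lock()
        self._idle = deque(container_ids)
        self._leases = {}          # container_id -> app_id

    def attach(self, app_id):
        """Atomically lease a pooled container to an application.

        There is no separate 'detach' step: removal from the pool and
        association with the app happen under one lock."""
        with self._lock:
            if not self._idle:
                return None        # no pooled container available
            cid = self._idle.popleft()
            self._leases[cid] = app_id
            return cid

pool = ContainerPool(["container_01", "container_02"])
cid = pool.attach("app_42")
```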

h4. 2. Use once and throw away.
For the sake of simplicity, maybe we should assume that once an application is assigned a
container from the pool and has "attached" to it, it is the application's container and
the Pooling framework relinquishes ownership of it. The container then completes normally and
all resource accounting is billed against the app. The pool of containers can be re-populated
externally by the pool manager component in the RM (beyond the scope of this for now).
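A minimal sketch of this use-once policy (not YARN code; all names are hypothetical): once attached, the pool keeps no reference to the container, and its resources are billed against the app.

```python
class UseOncePool:
    """Hypothetical model: attached containers never return to the pool."""
    def __init__(self, container_ids):
        self.idle = list(container_ids)
        self.billing = {}          # app_id -> containers billed to it

    def attach(self, app_id):
        if not self.idle:
            return None
        cid = self.idle.pop(0)     # ownership transfers to the app
        self.billing.setdefault(app_id, []).append(cid)
        # The container never comes back here; re-population is done
        # externally by the RM-side pool manager.
        return cid

pool = UseOncePool(["container_01"])
pool.attach("app_1")
```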

h4. 3. Resource accounting.
This is one of the reasons why I feel generalized resources would be useful here. Assume initially
we have a cluster with resources <10 vcores, 10 GB> spread equally across 2 NMs. Let's
say we allocate 4 pre-initialized containers (via the pooling component in the RM) of type
*foo*, each with <1 vcore, 1 GB>, and that we distribute them equally across the NMs.
Once the pre-initialized containers have started, the total cluster resources would be <6
vcores, 6 GB, 4 foo>.
Each NM would have <3 vcores, 3 GB, 2 foo> available resources. Now if an app asks
for <0 vcores, 0 GB, 1 foo>, it will be allocated 1 pooled container and the resources
associated with 1 foo, <1 vcore, 1 GB>, can be accounted against the app. The app can
also ask for <1 vcore, 1 GB, 1 foo>, in which case the app will still be assigned
one of the pooled containers, with the assumption that the container's size can expand by
<1 vcore, 1 GB> if required. Cgroups/JobObjects would be used to enforce this.
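The arithmetic above, spelled out (illustrative only; resource vectors are modeled as (vcores, GB, foo) tuples, with *foo* a generalized resource type):

```python
# Element-wise addition of resource vectors (vcores, GB, foo).
def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

cluster = (10, 10, 0)
foo = (1, 1, 0)                      # resources behind each 'foo' container

# Starting 4 pooled 'foo' containers trades <4 vcores, 4 GB> for <4 foo>:
after_pool = add(cluster, (-4, -4, 4))
per_nm = tuple(x // 2 for x in after_pool)  # pool spread equally over 2 NMs

# App asks <0 vcores, 0 GB, 1 foo>: it gets a pooled container and is
# billed the <1 vcore, 1 GB> standing behind that foo:
billed = foo

# App asks <1 vcore, 1 GB, 1 foo>: same pooled container, which may
# expand by <1 vcore, 1 GB> (enforced via cgroups / Job Objects):
expanded = add(foo, (1, 1, 0))
```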

h4. 4. AM Container communication.
As raised by [~jlowe], it is currently unclear what happens if the app framework requires
an umbilical connection back to the AM: how does the pre-initialized container know where
that AM is and how to connect to it? Currently, the *ContainerLaunchContext* should contain all
context required by the container to operate, including the location of the AM and how
to talk to it. This is usually application specific (the *TaskUmbilical* protocol used by
MR, for example). If the container is pre-initialized, this implies that the container is in some
stand-by state waiting for this context to be passed to it. We should call this out in the
design doc:
# The "attach" process will pass the application's ContainerLaunchContext to the pre-initialized
container.
# The feature requires some smarts in the ContainerExecutor, which knows how to pass the LaunchContext
specific to the "type" of pre-initialized container to the container, which itself should
somehow know that it is pre-initialized and in a stand-by state. We can leverage some
of the *Container Runtime* features for this.
# TODO later: Introduce an NM/Executor <-> container protocol to formalize the above,
which may be useful for long running containers.
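One way to picture the stand-by handshake in the steps above, as a minimal sketch (all names hypothetical; the real handoff would ride on the ContainerExecutor / Container Runtime plumbing):

```python
class PreInitContainer:
    """Hypothetical model of a pooled container's stand-by state."""
    def __init__(self):
        self.state = "STANDBY"     # pre-initialized, waiting for context
        self.am_address = None

    def deliver_launch_context(self, ctx):
        # The "attach" step passes the app's ContainerLaunchContext in.
        # Only now does the container learn where the AM is and how to
        # reach it (e.g. an MR TaskUmbilical-style connection).
        if self.state != "STANDBY":
            raise RuntimeError("launch context already delivered")
        self.am_address = ctx["am_address"]
        self.state = "RUNNING"

c = PreInitContainer()
c.deliver_launch_context({"am_address": "amhost:4242"})
```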

Thoughts?



> Container Pooling in YARN
> -------------------------
>
>                 Key: YARN-5501
>                 URL: https://issues.apache.org/jira/browse/YARN-5501
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Arun Suresh
>            Assignee: Hitesh Sharma
>         Attachments: Container Pooling in YARN.pdf, Container Pooling - one pager.pdf
>
>
> This JIRA proposes a method for reducing the container launch latency in YARN. It introduces
a notion of pooling *Unattached Pre-Initialized Containers*.
> Proposal in brief:
> * Have a *Pre-Initialized Container Factory* service within the NM to create these unattached
containers.
> * The NM would then advertise these containers as special resource types (this should
be possible via YARN-3926).
> * When a start container request is received by the node manager for launching a container
requesting this specific type of resource, it will take one of these unattached pre-initialized
containers from the pool, and use it to service the container request.
> * Once the request is complete, the pre-initialized container would be released and ready
to serve another request.
> This capability would help reduce container launch latencies and thereby allow for development
of more interactive applications on YARN.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

