hadoop-yarn-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1404) Add support for unmanaged containers
Date Tue, 12 Nov 2013 15:48:17 GMT

    [ https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13820200#comment-13820200 ]

Alejandro Abdelnur commented on YARN-1404:
------------------------------------------

[~hitesh], 

bq. How is scheduling management/enforcement (preemption, etc) meant to work with unmanaged
containers?

The AM that started the unmanaged container gets the early-preemption/preemption/lost notification
from the RM and notifies the out of band process on the corresponding node to release the
corresponding resources. (Impala/Llama is doing this today with the dummy sleep containers.)

An NM plugin notifies the collocated out of band process that the unmanaged container has ended.
This prompts the out of band process to release the corresponding resources. (We are working
on getting this in Impala/Llama).

In theory, the former is sufficient. In practice, having the latter as well drives a faster
reaction to preemption/loss of resources.
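
As a rough illustration of the first path (the AM relaying RM notifications to the out of band process), here is a minimal Java sketch. It does not use any YARN or Llama API: `ContainerEvent`, `OutOfBandClient` and `PreemptionRelay` are hypothetical names made up for this example, and the real notification types and transport would differ.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical event kinds the AM might see from the RM for a container. */
enum ContainerEvent { EARLY_PREEMPTION, PREEMPTED, LOST }

/** Hypothetical channel to the out of band process on a node (e.g. an RPC stub). */
interface OutOfBandClient {
    void releaseResources(String node, String containerId);
}

/** AM-side relay: remembers where each unmanaged container lives and forwards release requests. */
class PreemptionRelay {
    private final Map<String, String> nodeByContainer = new HashMap<>();
    private final OutOfBandClient client;

    PreemptionRelay(OutOfBandClient client) {
        this.client = client;
    }

    /** Record the node on which an unmanaged container was allocated. */
    void containerAllocated(String containerId, String node) {
        nodeByContainer.put(containerId, node);
    }

    /** Forward a release request; returns false if the container is unknown. */
    boolean onRmEvent(String containerId, ContainerEvent event) {
        // Early preemption is only a warning, so the container stays tracked;
        // PREEMPTED and LOST are final, so the container is forgotten.
        String node = (event == ContainerEvent.EARLY_PREEMPTION)
                ? nodeByContainer.get(containerId)
                : nodeByContainer.remove(containerId);
        if (node == null) {
            return false;
        }
        client.releaseResources(node, containerId);
        return true;
    }
}
```

In this sketch a container that is first warned and then preempted triggers two release requests, so the out of band side is assumed to treat them as idempotent.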

bq. it seems like 2 features are needed: ... NM resource resizing.

IMO, NM resource resizing is orthogonal to unmanaged resources.

bq. it seems like 2 features are needed: container leases ...

In the current proposal the container leases are out of band: they happen between the process
using the resources out of band (e.g. Impala) and the AM (e.g. Llama).

The reason I've taken the approach of leaving the container leases out of band is:

* To keep a single lifecycle for containers instead of two different lifecycles. This keeps
intact the current state transitions, reducing the chances of introducing errors there now
or when the lifecycle evolves.

* A lease would require an additional call to renew the lease. This would require introducing
lease tokens as the lease could be done by an out of band system.

* If the RM is the recipient of lease renewals, we are adding additional responsibilities
to the RM and making it handle additional calls, potentially from several new clients (out of band
processes).

* If the NM is the recipient of the lease, we still need a flag when launching the container
to indicate to the NM that the container is unmanaged and leases will be coming in.

IMO, we don't gain much by having Yarn manage leases for unmanaged containers,
as it is still in the hands of the out of band process using the container resources to
effectively release the resources when asked to.
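
To make the out of band lease idea concrete, here is a minimal Java sketch of a lease table kept by the AM and renewed by the out of band process. It is an assumption-laden illustration, not how Llama implements it: `OutOfBandLeases`, its methods and the lease duration are hypothetical, and time is passed in explicitly so the logic can be exercised without a real clock.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical AM-side lease table for unmanaged containers. */
class OutOfBandLeases {
    private final long leaseMillis;
    private final Map<String, Long> expiryByContainer = new HashMap<>();

    OutOfBandLeases(long leaseMillis) {
        this.leaseMillis = leaseMillis;
    }

    /** Called when the out of band process acquires or renews a lease. */
    void renew(String containerId, long nowMillis) {
        expiryByContainer.put(containerId, nowMillis + leaseMillis);
    }

    /** Removes and returns the containers whose lease has expired at nowMillis. */
    List<String> expire(long nowMillis) {
        List<String> expired = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = expiryByContainer.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                expired.add(entry.getKey());
                it.remove();
            }
        }
        return expired;
    }
}
```

Under this sketch the AM would hand the containers returned by `expire` back to the RM through the normal release path, so the RM and NM never see a lease and the container lifecycle stays unchanged.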

Thoughts?



> Add support for unmanaged containers
> ------------------------------------
>
>                 Key: YARN-1404
>                 URL: https://issues.apache.org/jira/browse/YARN-1404
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-1404.patch
>
>
> Currently, a container allocation requires starting a container process on the corresponding
> NodeManager's node.
> For applications that need to use the allocated resources out of band from Yarn, this
> means that a dummy container process must be started.
> Impala/Llama is an example of such an application; it currently starts a 'sleep 10y'
> (10 years) process as the container process, and the resource capabilities are used out of
> band by the Impala process collocated on the node. The Impala process ensures that the processing
> associated with those resources does not exceed the capabilities of the container. Also, if the
> container is lost/preempted/killed, Impala stops using the corresponding resources.
> In addition, in the case of Llama, the current requirement of having a container process
> gets complicated when hard resource enforcement (memory -ContainersMonitor- or cpu -via cgroups-)
> is enabled, because Impala/Llama requests CPU and memory resources independently of each
> other. Some requests are CPU only and others are memory only. Unmanaged containers solve this
> problem as there is no underlying process with zero CPU or zero memory.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
