hadoop-yarn-issues mailing list archives

From "Alejandro Abdelnur (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1404) Add support for unmanaged containers
Date Wed, 13 Nov 2013 18:41:22 GMT

    [ https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13821657#comment-13821657
] 

Alejandro Abdelnur commented on YARN-1404:
------------------------------------------

[~stevel@apache.org]

bq. 1. I'd be inclined to treat this as a special case of YARN-1040

I've just commented in YARN-1040 following Bikas' comment on this https://issues.apache.org/jira/browse/YARN-1040?focusedCommentId=13821597&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13821597

bq. It's dangerously easy to leak containers here; I know llama keeps an eye on things, but
I worry about other people's code -though admittedly, any long-lived command line app "yes"
could do the same.

We can have NM configs to disable no-process or multi-process containers, but you can still work around this by having a dummy process. This is how Llama does things today, but it is not ideal for several reasons.
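For reference, a minimal sketch of that dummy-process workaround against the stock NMClient API (class and variable names are made up for illustration, this is not Llama's actual code):

{code:java}
import java.util.Collections;

import org.apache.hadoop.yarn.api.records.Container;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.client.api.NMClient;
import org.apache.hadoop.yarn.util.Records;

public class DummyContainerLauncher {
  /**
   * Starts a placeholder 'sleep 10y' process so the NM considers the container
   * live while the allocated resources are actually used out of band.
   */
  public static void launch(NMClient nmClient, Container allocated) throws Exception {
    ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
    ctx.setCommands(Collections.singletonList("sleep 10y"));
    nmClient.startContainer(allocated, ctx);
  }
}
{code}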

IMO, from a Yarn perspective we need to allow AMs to do sophisticated things within the Yarn programming model (like what you are trying to do with long-lived containers or what I'm doing with Llama).

bq. For the multi-process (and that includes processes=0), we really do need some kind of
lease renewal option to stop containers being retained forever. It would become the job of
the AM to do the renewal

As I've mentioned above, I don't think we need a special lease for this: https://issues.apache.org/jira/browse/YARN-1404?focusedCommentId=13820200&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13820200
(look for 'The reason I've taken the approach of leaving the container leases out of band
is:')

[~vinodkv]

bq. -1 for this...

I think you are jumping too fast here.

bq. As I repeated on other JIRAs, please change the title with the problem statement instead
of solutions.

IMO that makes complete sense for bugs; for improvements/new features, a description of the feature communicates more, as it will become the commit message. The shortcomings the JIRA is trying to address should be captured in the description.

Take, for example, the following JIRA summaries; would you change them to describe a problem?

* AHS should support application-acls and queue-acls
* AM's tracking URL should be a URL instead of a string
* Add support for zipping/unzipping logs while in transit for the NM logs web-service
* YARN should have a ClusterId/ServiceId

bq. I indicated offline about llama with others. I don't think you need NodeManagers either
to do what you want, forget about containers. All you need is use the ResourceManager/scheduler
in isolation using MockRM/LightWeightRM (YARN-1385) - your need seems to be using the scheduling
logic in YARN and not use the physical resources.

The whole point of Llama is to allow Impala to share resources in a real Yarn cluster running other workloads such as Map-Reduce. In other words, Impala/Llama and other AMs must share cluster resources.


> Add support for unmanaged containers
> ------------------------------------
>
>                 Key: YARN-1404
>                 URL: https://issues.apache.org/jira/browse/YARN-1404
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-1404.patch
>
>
> Currently a container allocation requires starting a container process on the corresponding NodeManager's node.
> For applications that need to use the allocated resources out of band from Yarn, this means that a dummy container process must be started.
> Impala/Llama is an example of such an application; it currently starts a 'sleep 10y' (10 years) process as the container process, and the resource capabilities are used out of band by the Impala process collocated on the node. The Impala process ensures the processing associated with those resources does not exceed the capabilities of the container. Also, if the container is lost/preempted/killed, Impala stops using the corresponding resources.
> In addition, in the case of Llama, the current requirement of having a container process gets complicated when hard resource enforcement (memory -ContainersMonitor- or cpu -via cgroups-) is enabled, because Impala/Llama requests CPU and memory independently of each other. Some requests are CPU only and others are memory only. Unmanaged containers solve this problem because there is no underlying process that would have to run with zero CPU or zero memory.
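To make the CPU-only / memory-only asks above concrete, a hypothetical sketch using the stock AMRMClient API (the values and class name are invented; note that a real scheduler typically normalizes requests up to its configured minimums):

{code:java}
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.client.api.AMRMClient.ContainerRequest;

public class OneDimensionalAsks {
  // CPU-only ask: no memory requested, 4 vcores.
  static final ContainerRequest CPU_ONLY =
      new ContainerRequest(Resource.newInstance(0, 4), null, null, Priority.newInstance(1));
  // Memory-only ask: 4096 MB, no vcores.
  static final ContainerRequest MEM_ONLY =
      new ContainerRequest(Resource.newInstance(4096, 0), null, null, Priority.newInstance(1));
}
{code}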



--
This message was sent by Atlassian JIRA
(v6.1#6144)
