hadoop-yarn-issues mailing list archives

From "Vinod Kumar Vavilapalli (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1404) Add support for unmanaged containers
Date Thu, 14 Nov 2013 00:37:21 GMT

    [ https://issues.apache.org/jira/browse/YARN-1404?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13822034#comment-13822034
] 

Vinod Kumar Vavilapalli commented on YARN-1404:
-----------------------------------------------

bq. Vinod Kumar Vavilapalli, a lightweight RM is not sufficient because the goal of llama
is to be able to run frameworks that use unmanaged containers alongside frameworks that don't.
While Impala does its own resource enforcement, it wants to coexist on a YARN instance with
MR and other frameworks that fit more naturally with the YARN model.
Well, this has been my problem, and I'm sure others will agree. Proposing unmanaged containers
before explaining your key requirements keeps folks who only follow JIRA in the dark.

bq. Are you saying YARN should never support containers that don't launch a process? Is there
anything gained by this?
If that need arises, and if there are no other first-class solutions, then yes. Otherwise
no.

bq. I think you are jumping too fast here
That's because I see multiple JIRAs all trying to achieve a common goal, and instead of discussing
that design, we are shoehorned into debating individual tickets that don't add up to the
overall goal.

bq. IMO that makes complete sense for bugs; for improvements/new features, a description
of it communicates more, as it will be the commit message. The shortcomings the JIRA is trying
to address should be captured in the description.
Agreed that it is subjective. But for tickets that potentially have more than one possible
solution, I'd suggest renaming them. For example, this one can be renamed to "support running a
service that doesn't want to use YARN containers but still co-exists with YARN".

bq. Take for example the following JIRA summaries, would you change them to describe a problem?
bq.    AM's tracking URL should be a URL instead of a string
bq.    YARN should have a ClusterId/ServiceId
Yes, I'd change the above two. The other two are apt summaries. The goal should be to indicate
the problem one is attacking. And my point here is not that you or someone else is making that
mistake and others are not.

bq. The whole point of Llama is to allow Impala to share resources in a real Yarn cluster
doing other workloads like Map-Reduce. In other words, Impala/Llama and other AMs must share
cluster resources.
Well, you should have started with this requirement so that we can all discuss and come up
with a solution, instead of putting in approaches that you think are best. This was the same
discussion we had in YARN-689, where it took a while for the rest of us to understand the real
requirements. Similarly, YARN-789 was put into the FairScheduler without giving consideration to
the rest of the system.

bq. The AM that started the unmanaged container gets the early-preemption/preemption/lost
notification from the RM and notifies the out of band process in the corresponding node to
release the corresponding resources. (Impala/Llama is doing this today with the dummy sleep
containers)
That won't work for cases where the RM wants to forcefully terminate containers in emergency situations.

bq. An NM plugin notifies the collocated out of band process that the unmanaged container has
ended. This prompts the out of band process to release the corresponding resources. (We are
working on getting this in Impala/Llama).
This again is a new proposal that was never discussed.

Regarding this problem, I think you should create a ticket about supporting services that want to
use cluster- and node-level scheduling without using containers. Then if you follow up with
a requirement list, we can discuss solutions and an end-to-end design. I can already come up with
more solutions, which may or may not work depending on your requirements.
 - Use the dynamic NM resource stuff that just went in and use signalling between YARN NM
and some outside component to dynamically adjust NM resources
 - Run a long running service under YARN with containers that dynamically grow and shrink

> Add support for unmanaged containers
> ------------------------------------
>
>                 Key: YARN-1404
>                 URL: https://issues.apache.org/jira/browse/YARN-1404
>             Project: Hadoop YARN
>          Issue Type: New Feature
>          Components: nodemanager
>    Affects Versions: 2.2.0
>            Reporter: Alejandro Abdelnur
>            Assignee: Alejandro Abdelnur
>         Attachments: YARN-1404.patch
>
>
> Currently a container allocation requires starting a container process on the corresponding
NodeManager's node.
> For applications that need to use the allocated resources out of band from Yarn this
means that a dummy container process must be started.
> Impala/Llama is an example of such an application, which is currently starting a 'sleep 10y'
(10 years) process as the container process. The resource capabilities are used out of
band by the Impala process collocated in the node. The Impala process ensures the processing
associated with those resources does not exceed the capabilities of the container. Also, if the
container is lost/preempted/killed, Impala stops using the corresponding resources.
> In addition, in the case of Llama, the current requirement of having a container process
gets complicated when hard resource enforcement (memory -ContainersMonitor- or cpu -via cgroups-)
is enabled because Impala/Llama requests resources with CPU and memory independently of each
other. Some requests are CPU only and others are memory only. Unmanaged containers solve this
problem as there is no underlying process with zero CPU or zero memory.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
