hadoop-yarn-issues mailing list archives

From "Bikas Saha (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (YARN-1011) [Umbrella] Schedule containers based on utilization of currently allocated containers
Date Tue, 05 Jan 2016 22:29:40 GMT

    [ https://issues.apache.org/jira/browse/YARN-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083957#comment-15083957 ]

Bikas Saha commented on YARN-1011:

I agree with relying on natural container churn rather than preemption to avoid lost work,
though the issue of clearly defining scheduler policy still remains.

bq.  If we were oversubscribing 10X then I'd probably want it for sure, but if it's at most
2X capacity then worst case is a container only gets 50% of the resource it had requested.
Obviously for something like memory this has to be closely controlled because going over the
physical capabilities of the machine has very significant consequences. But for CPU, I'd definitely
be inclined to live with the occasional 50% worst case for all containers, in order to avoid
the 1/1024th worst case for OPPORTUNISTIC containers on a busy node.
I did not understand this. Does this mean it's OK for normal containers to run 50% slower
in the presence of opportunistic containers? If yes, then there are scenarios where this may
not be a valid choice, e.g. when a cluster is running a mix of SLA and non-SLA jobs. Non-SLA
jobs are OK if their containers get slowed down to increase cluster utilization by running
opportunistic containers, because we are getting higher overall throughput. But SLA jobs are
not OK with missing deadlines because their tasks ran 50% slower.
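
To make the arithmetic in the quoted comment concrete, here is a minimal sketch. It assumes
CPU is shared strictly proportionally when every container on the node demands its full
request at the same time; that model and both function names are illustrative assumptions,
not a description of how YARN or the NodeManager actually enforces CPU.

```python
def worst_case_share(oversubscription_factor: float) -> float:
    """Fraction of its requested CPU a container receives when the node is
    oversubscribed by the given factor and all containers are busy at once.

    Assumes strictly proportional sharing (illustrative model only).
    """
    if oversubscription_factor < 1.0:
        raise ValueError("oversubscription factor must be >= 1.0")
    return 1.0 / oversubscription_factor


def runtime_multiplier(oversubscription_factor: float) -> float:
    """How much longer a purely CPU-bound task takes under the same model."""
    return 1.0 / worst_case_share(oversubscription_factor)


if __name__ == "__main__":
    for factor in (1.0, 2.0, 10.0):
        print(factor, worst_case_share(factor), runtime_multiplier(factor))
```

Under this model a 2X factor leaves each container with half of its request in the worst
case, so a CPU-bound task with a deadline could take about twice as long. That is the
scenario that matters for SLA jobs.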

IMO, the litmus test for a feature like this would be to take an existing cluster (with low
utilization because tasks are asking for more resources than they need 100% of the time),
then turn this feature on and get better cluster utilization and throughput without affecting
the existing workload, whatever the internal implementation details may be. Agree?

bq. 50% of maximum-under-utilized resource of past 30 min for each NM can be used to allocate
opportunistic containers.
These are heuristics and may all be valid under different circumstances. We should step
back and ask what the source of this optimization is.
Observation: Cluster is under-utilized despite being fully allocated.
Possible reasons:
1) Tasks are incorrectly over-allocated. They will never use the resources they ask for, and
hence we can safely run additional opportunistic containers. So this feature is used to
compensate for poorly configured applications. Probably a valid scenario, but is it common?
2) Tasks are correctly allocated but don't use their capacity to the limit all the time. E.g.
Terasort will use high CPU only during the sorting but not during the entire length of the
job. But its containers will ask for enough CPU to run the sort in the desired time. This
is typical application behavior, where resource usage varies over time. So this feature is
used to soak up the fallow resources in the cluster while tasks are not using their quoted

The arguments and assumptions we make need to be considered in the light of which of 1 or
2 is the common case and where this feature will be useful.
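
One way to tell the two cases apart from monitoring data is sketched below. It assumes a
list of per-container utilization samples is available; the function name, the labels, and
both thresholds are hypothetical values chosen for illustration, not anything defined by
YARN.

```python
from statistics import mean


def classify_usage(samples, allocated, peak_threshold=0.8, mean_threshold=0.5):
    """Label a container's usage pattern from utilization samples.

    samples:   observed usage values, in the same unit as `allocated`
    allocated: the amount of the resource the container requested
    Thresholds are arbitrary illustrative fractions of the allocation.
    """
    if not samples or allocated <= 0:
        raise ValueError("need at least one sample and a positive allocation")
    peak_fraction = max(samples) / allocated
    mean_fraction = mean(samples) / allocated
    if peak_fraction < peak_threshold:
        # Case 1: usage never approaches the request.
        return "over-allocated"
    if mean_fraction < mean_threshold:
        # Case 2: the request is needed at peak, but only part of the time.
        return "time-varying"
    return "well-utilized"


if __name__ == "__main__":
    print(classify_usage([1.0, 1.0, 1.2], allocated=4))       # steady low usage
    print(classify_usage([0.5, 0.5, 3.8, 0.5], allocated=4))  # short burst near the request
```

The distinction matters for policy: in case 1 the unused headroom is stable, while in case 2
it can disappear whenever the task reaches its busy phase.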

While it's useful to have configuration knobs, a complex dynamic feature like this is
basically reacting to runtime observations, so it may be quite hard to tune it statically
through manual configuration. While some limits, such as a maximum over-allocation factor,
are easy and probably required to configure, we should look at making this feature work
by itself instead of relying exclusively on configuration (hell :P) for users to make this
feature usable.
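
As a rough illustration of "a configured limit plus self-adjustment", the sketch below
nudges an over-allocation factor up or down based on observed node utilization and clamps
it to a configured maximum. The function, its parameters, and the default values are all
hypothetical; this is not a proposal for the actual scheduler logic.

```python
def next_overallocation_factor(current, observed_utilization,
                               max_factor=2.0, target_utilization=0.8, step=0.1):
    """Return the over-allocation factor to use for the next interval.

    current:              factor currently in effect (1.0 means no over-allocation)
    observed_utilization: measured node utilization as a fraction of capacity
    max_factor:           the statically configured upper limit
    target_utilization:   utilization above which the factor is reduced
    step:                 how far the factor moves per interval
    """
    if observed_utilization > target_utilization:
        proposed = current - step   # node is busy: back off
    else:
        proposed = current + step   # node has slack: allow more opportunistic work
    # Never go below "no over-allocation" or above the configured limit.
    return min(max_factor, max(1.0, proposed))


if __name__ == "__main__":
    factor = 1.0
    for utilization in (0.4, 0.5, 0.9, 0.6):
        factor = next_overallocation_factor(factor, utilization)
        print(round(factor, 2))
```

Only `max_factor` is a limit the operator would have to set by hand; the rest follows the
runtime observations.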

> [Umbrella] Schedule containers based on utilization of currently allocated containers
> -------------------------------------------------------------------------------------
>                 Key: YARN-1011
>                 URL: https://issues.apache.org/jira/browse/YARN-1011
>             Project: Hadoop YARN
>          Issue Type: New Feature
>            Reporter: Arun C Murthy
>         Attachments: yarn-1011-design-v0.pdf, yarn-1011-design-v1.pdf
> Currently RM allocates containers and assumes resources allocated are utilized.
> RM can, and should, get to a point where it measures utilization of allocated containers
> and, if appropriate, allocate more (speculative?) containers.

This message was sent by Atlassian JIRA
