hadoop-yarn-issues mailing list archives

From "Ashwin Shankar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (YARN-2026) Fair scheduler : Fair share for inactive queues causes unfair allocation in some scenarios
Date Wed, 28 May 2014 23:56:02 GMT

     [ https://issues.apache.org/jira/browse/YARN-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ashwin Shankar updated YARN-2026:
---------------------------------

    Description: 
Problem 1: When using hierarchical queues in the fair scheduler, there are a few scenarios where
we have seen a leaf queue with the smallest fair share take the majority of the cluster and starve
a sibling parent queue that has a greater weight/fair share, while preemption doesn’t kick in
to reclaim resources.

The root cause seems to be that the fair share of a parent queue is distributed to all of its
children, irrespective of whether a child is an active or an inactive (no apps running) queue.
Preemption based on fair share kicks in only if the usage of a queue is less than 50% of its
fair share and if it has demand greater than that. When there are many queues under a parent
queue (with a high fair share), each child queue’s fair share becomes really low. As a result,
when only a few of these child queues have apps running, they reach their *tiny* fair share
quickly and preemption doesn’t happen even if other (non-sibling) leaf queues are hogging the cluster.
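
The trigger condition described above can be sketched as a small predicate. This is only one
reading of that sentence, not the scheduler's actual code: the function name and its parameters
are hypothetical, and all values are percentages of the cluster.

```python
def is_starved_for_fair_share(usage, fair_share, demand):
    """Return True if a queue would qualify for fair-share preemption.

    Hypothetical helper: assumes a queue qualifies when its usage is below
    both half of its fair share and its outstanding demand.
    """
    return usage < min(0.5 * fair_share, demand)


# A queue with a 10% fair share that already holds 6% is above the 5% threshold,
# so it does not qualify even though it wants far more.
print(is_starved_for_fair_share(usage=6.0, fair_share=10.0, demand=30.0))  # False

# The same queue holding only 3% is below the threshold and does qualify.
print(is_starved_for_fair_share(usage=3.0, fair_share=10.0, demand=30.0))  # True
```

The smaller a queue's fair share, the sooner it crosses the halfway mark and stops qualifying.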

This can be solved by dividing the fair share of a parent queue only among its active child queues.

Here is an example describing the problem and proposed solution:
root.lowPriorityQueue is a leaf queue with weight 2
root.HighPriorityQueue is a parent queue with weight 8
root.HighPriorityQueue has 10 child leaf queues: root.HighPriorityQueue.childQ(1..10)

The above config results in root.HighPriorityQueue having an 80% fair share,
and each of its ten child queues would have an 8% fair share. Preemption would happen only if
the child queue's usage is below 4% (0.5*8=4).
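
The arithmetic above can be reproduced with a short sketch. It assumes fair share is split in
proportion to weight and that the ten child queues have equal weights (the description implies
this by giving each 8%). `split_by_weight` is a hypothetical helper, not a scheduler API.

```python
def split_by_weight(total, weights):
    """Split `total` among queues in proportion to their weights."""
    weight_sum = sum(weights.values())
    return {name: total * w / weight_sum for name, w in weights.items()}


# Top level: weights 2 and 8 share 100% of the cluster.
top = split_by_weight(100.0, {"root.lowPriorityQueue": 2, "root.HighPriorityQueue": 8})

# Second level: the parent's share is spread over all ten children, active or not.
# Equal child weights are an assumption.
children = split_by_weight(
    top["root.HighPriorityQueue"],
    {f"root.HighPriorityQueue.childQ{i}": 1 for i in range(1, 11)},
)

child_share = children["root.HighPriorityQueue.childQ1"]
threshold = 0.5 * child_share  # usage must fall below this for preemption to trigger

print(top["root.HighPriorityQueue"], child_share, threshold)  # 80.0 8.0 4.0
```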

Let's say at the moment no apps are running in any of root.HighPriorityQueue.childQ(1..10)
and a few apps are running in root.lowPriorityQueue, which is taking up 95% of the cluster.
Up to this point, the behavior of FS is correct.

Now, let's say root.HighPriorityQueue.childQ1 gets a big job which requires 30% of the cluster.
It would get only the available 5% of the cluster, and preemption wouldn't kick in since it's
above 4% (half its fair share). This is bad considering childQ1 is under a high-priority parent
queue which has an *80% fair share*.

Until root.lowPriorityQueue starts relinquishing containers, we would see the following allocation
on the scheduler page:
*root.lowPriorityQueue = 95%*
*root.HighPriorityQueue.childQ1 = 5%*
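
The stuck state above can be traced step by step. This is a minimal sketch using the numbers from
the example; the variable names are illustrative and the trigger check is a simplified assumption
about how the scheduler decides, not its real implementation.

```python
# All values are percentages of the cluster, taken from the example.
cluster = 100.0
low_priority_usage = 95.0
child_q1_demand = 30.0
child_q1_fair_share = 8.0  # the 80% parent share spread over ten children

# childQ1 can only be given what is still free.
child_q1_usage = min(child_q1_demand, cluster - low_priority_usage)

# Assumed trigger: usage below half the fair share (and below demand).
preempt = child_q1_usage < min(0.5 * child_q1_fair_share, child_q1_demand)

print(child_q1_usage, preempt)  # 5.0 False
```

Because 5% is above the 4% threshold, nothing is reclaimed and childQ1 stays at 5%.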

This can be solved by distributing a parent’s fair share only to active queues.

So in the example above, since childQ1 is the only active queue
under root.HighPriorityQueue, it would get all of its parent’s fair share, i.e. 80%.
This would cause preemption to reclaim the 30% needed by childQ1 from root.lowPriorityQueue
after fairSharePreemptionTimeout seconds.
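
A minimal sketch of the proposed behavior, assuming equal child weights and the same simplified
trigger check (usage below half the fair share and below demand). `split_among_active` is a
hypothetical helper written for illustration; it is not taken from the attached patch.

```python
def split_among_active(parent_share, weights, active):
    """Divide a parent's fair share among active children only.

    Inactive children get 0. If no child is active, every share is 0.
    """
    active_weight = sum(w for q, w in weights.items() if q in active)
    return {
        q: (parent_share * w / active_weight if q in active else 0.0)
        for q, w in weights.items()
    }


weights = {f"childQ{i}": 1 for i in range(1, 11)}  # equal weights are an assumption
shares = split_among_active(80.0, weights, active={"childQ1"})

usage, demand = 5.0, 30.0
triggers = usage < min(0.5 * shares["childQ1"], demand)

print(shares["childQ1"], shares["childQ2"], triggers)  # 80.0 0.0 True
```

With an 80% fair share the threshold rises to 40%, so childQ1's 5% usage now qualifies for preemption.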

Problem 2: Also note that a similar situation can happen between root.HighPriorityQueue.childQ1
and root.HighPriorityQueue.childQ2 if childQ2 hogs the cluster. childQ2 can take up 95% of the
cluster and childQ1 would be stuck at 5% until childQ2 starts relinquishing containers. We would
like each of childQ1 and childQ2 to get half of root.HighPriorityQueue's fair share, i.e. 40%,
which would ensure childQ1 gets up to 40% of resources if needed through preemption.
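
The sibling case can be worked through the same way. The sketch below assumes equal weights for
the two active children and a hypothetical demand of 60% for childQ1. It also assumes that, once
preemption triggers, a queue can be brought up to the smaller of its fair share and its demand,
which is how "up to 40%" is read here.

```python
parent_share = 80.0
active_children = ["childQ1", "childQ2"]  # the other eight children have no apps

# Equal split among active children only.
fair_share = parent_share / len(active_children)

usage = {"childQ1": 5.0, "childQ2": 95.0}
demand_q1 = 60.0  # hypothetical demand, larger than the fair share

# Assumed trigger: usage below half the fair share (and below demand).
triggers = usage["childQ1"] < min(0.5 * fair_share, demand_q1)

# Assumed ceiling that preemption would restore childQ1 to.
cap = min(fair_share, demand_q1)

print(fair_share, triggers, cap)  # 40.0 True 40.0
```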



> Fair scheduler : Fair share for inactive queues causes unfair allocation in some scenarios
> ------------------------------------------------------------------------------------------
>
>                 Key: YARN-2026
>                 URL: https://issues.apache.org/jira/browse/YARN-2026
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: scheduler
>            Reporter: Ashwin Shankar
>            Assignee: Ashwin Shankar
>              Labels: scheduler
>         Attachments: YARN-2026-v1.txt



--
This message was sent by Atlassian JIRA
(v6.2#6252)
