hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
Date Tue, 04 Aug 2009 13:51:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12738975#action_12738975 ]

Hemanth Yamijala commented on MAPREDUCE-824:

Here is a proposal that came out after some discussions for consideration:

h3. User Scenario
Consider an organization "Org1". Suppose they have a queue which is allocated a capacity of
60% of the cluster resources. Org1 has the following types of jobs: 
- Production jobs 
- Research jobs - belonging to three projects - say Proj1, Proj2 and Proj3 
- Miscellaneous types of jobs. 

Org1 would like greater control over how the 60% capacity is shared among
these types of jobs. For example, they'd like the research jobs and miscellaneous jobs to get
some minimum amount of capacity for their work, and a significant share of the capacity to
be allotted to production jobs. Production jobs are submitted somewhat rarely to the cluster.
Whenever they are submitted, they must get the first chance to run, followed by research jobs
and then miscellaneous jobs. However, whenever there are no production jobs, the research
jobs must get this unused capacity.

h3. A Sample Configuration 

The following is an illustrative expression of the features/changes we plan to implement in
the capacity scheduler to solve the use cases mentioned in the User Scenario. The actual specification
may be different from the one described here. However, effort will be made to keep it as close
to this expression as possible. 
grid {
  Org1 min=60% {
    priority min=90% {
      production min=82%
      proj1 min=6% max=10%
      proj2 min=6%
      proj3 min=6%
    }
    miscellaneous min=10%
  }
  Org2 min=40%
}

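
To make the nesting concrete, here is a rough sketch of how the sample configuration translates into shares of the whole cluster. It assumes each min value is a percentage of the parent's capacity, as the Minimum Capacities section states. The dictionary layout and the function name absolute_minimums are hypothetical and are not part of the proposal or of the capacity scheduler.

```python
# Hypothetical in-memory form of the sample configuration:
# queue name -> (minimum capacity as a percent of the parent, sub-queues)
SAMPLE = {
    "Org1": (60, {
        "priority": (90, {
            "production": (82, {}),
            "proj1": (6, {}),
            "proj2": (6, {}),
            "proj3": (6, {}),
        }),
        "miscellaneous": (10, {}),
    }),
    "Org2": (40, {}),
}


def absolute_minimums(queues, parent_share=100.0, prefix=""):
    """Return {queue path: minimum share of the whole cluster, in percent}."""
    result = {}
    for name, (min_pct, children) in queues.items():
        path = f"{prefix}.{name}" if prefix else name
        # A sub-queue's minimum is a percentage of its parent's share.
        share = parent_share * min_pct / 100.0
        result[path] = share
        result.update(absolute_minimums(children, share, path))
    return result
```

Under these assumptions, production is guaranteed roughly 60% x 90% x 82% = 44.28% of the cluster, each research project about 3.24%, and miscellaneous 6%.
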
h3. Description

The features introduced in the sample configuration are described below. 

h4. Sub-queues 

- Sub-queues are queues configured within queues. They provide a mechanism for administrators
to link logically related queues (say on the basis of the organization they are set up for).

- Using sub-queues, administrators would primarily be able to use capacity allocated to their
organization more effectively than in the current model. 
- For instance, in the current implementation which has only a single level of queues, if
there is unused capacity in a queue, it is divided among remaining queues in proportion to
their capacities. This has the disadvantage that logically unrelated queues could stake a
claim to this unused capacity even if the organization itself is being under-served. However,
by defining a hierarchy, administrators can ensure that unused capacity is first allotted
to sub-queues within the organization at the same level. 
- Sub-queues can be nested. So there can be queues within a sub-queue. 
- The last queue in a hierarchy of queues (one with no sub-queues of its own) is the only
queue to which jobs can be submitted. In this JIRA, we'll call it a leaf-level queue. 

h4. Minimum Capacities 

- A minimum capacity of a sub-queue defines what percentage of the parent's capacity it is
entitled to. 
- The scheduler will pick a sub-queue to service if it is furthest away from meeting its
minimum capacity. The current definition of being furthest away is a function of currently
used capacity and minimum capacity (used-capacity/minimum-capacity); the sub-queue with the
lowest ratio is the furthest away. 
- By default, a queue's capacity usage is elastic and will go beyond the minimum capacity
if there is unused capacity. 
- Leaf-level queues that are fairly important, and recursively their parent sub-queues, must
be granted a high minimum percentage to ensure the scheduler chooses them first. 
- The minimum capacity of a queue can be zero. Setting it to zero means that this queue will
be serviced ONLY IF no other queue at the same level has anything more to run (irrespective
of current usage). 
- If the minimum capacity for a queue is not specified, there are two options to choose between: 
-- It defaults to zero. 
-- It defaults to 100 - sum (minimum capacities specified for queues at the same level). 
--- If more than one queue has no minimum specified, this value will be split equally among
those queues. 
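
The second defaulting option above can be sketched as follows. This is only an illustration of the arithmetic; the function name fill_default_minimums, the use of None for "not specified", and the error raised when the specified minimums exceed 100% are all assumptions, not something the proposal defines.

```python
def fill_default_minimums(minimums):
    """Fill in unspecified minimum capacities for queues at one level.

    minimums maps queue name -> percent, or None when no minimum was given.
    Unspecified queues share (100 - sum of specified minimums) equally.
    """
    specified = sum(v for v in minimums.values() if v is not None)
    unspecified = [name for name, v in minimums.items() if v is None]
    if specified > 100:
        # Assumption: over-committing a level is treated as a config error.
        raise ValueError("minimum capacities at one level exceed 100%")
    if not unspecified:
        return dict(minimums)
    share = (100 - specified) / len(unspecified)
    return {name: (share if v is None else v) for name, v in minimums.items()}
```

For example, with one queue at min=60% and two queues with no minimum, each of the two unspecified queues would receive a default of 20%.
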

h4. Maximum Capacity 

- A maximum capacity defines a limit beyond which a sub-queue cannot use the capacity of its
parent queue. 
- This provides a means to limit how much excess capacity a sub-queue can use. By default,
there is no limit. 
- A typical use case for a maximum capacity limit could be to keep certain long-running jobs
from occupying more than a certain percentage of the cluster, which,
in the absence of pre-emption, could affect the capacity guarantees of other queues.

- Naturally, the maximum capacity of a queue can only be greater than or equal to its minimum
capacity.

h4. Scheduling among sub-queues 

When a tasktracker comes to get a task, the scheduler uses the following algorithm to assign
a task to the tracker. 
- Sort all sub-queues at the first level according to a function of used-capacity and minimum-capacity.
- Pick the neediest queue, and see if it has work to do. 
-- If this is a leaf level queue with pending tasks, a task is picked up from the jobs in
the queue as long as it is within any maximum limits for the queue. 
-- If this is not a leaf level queue, apply the algorithm recursively among sub-queues under
this queue. 
- If no task is found, move on to the next queue in the list.
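
The steps above can be sketched roughly as below. This is a simplified illustration, not the capacity scheduler's code: the Queue class, its field names, and the functions neediness and pick_task are hypothetical. It assumes that a lower used-capacity/minimum-capacity ratio means a needier queue, that zero-minimum queues sort last, and that the maximum limit is checked only on leaf-level queues. Updating used capacity after an assignment is left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Queue:
    name: str
    min_capacity: float                  # percent of the parent's capacity
    used_capacity: float = 0.0           # percent of the parent's capacity in use
    max_capacity: Optional[float] = None # None means no limit
    pending_tasks: List[str] = field(default_factory=list)  # leaf queues only
    children: List["Queue"] = field(default_factory=list)


def neediness(q):
    """Sort key: lower means further from the minimum capacity."""
    if q.min_capacity == 0:
        # Zero-minimum queues are serviced only after all others at this level.
        return float("inf")
    return q.used_capacity / q.min_capacity


def pick_task(queues):
    """Return a task for the requesting tracker, or None if nothing can run."""
    for q in sorted(queues, key=neediness):
        if q.children:
            # Not a leaf: apply the same algorithm among its sub-queues.
            task = pick_task(q.children)
            if task is not None:
                return task
        elif q.pending_tasks:
            # Leaf with work: honour the maximum limit, if any.
            if q.max_capacity is None or q.used_capacity < q.max_capacity:
                return q.pending_tasks.pop(0)
        # Nothing found here; move on to the next queue in the list.
    return None
```

Calling pick_task with the first-level queues walks down the hierarchy to the neediest leaf that has pending work and is still under its maximum, and falls back to the next queue at each level when a branch yields nothing.
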

> Support a hierarchy of queues in the capacity scheduler
> -------------------------------------------------------
>                 Key: MAPREDUCE-824
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
> Currently in Capacity Scheduler, cluster capacity is divided among the queues based on
the queue capacity. These queues typically represent an organization and the capacity of the
queue represents the capacity the organization is entitled to. Most organizations are large
and need to divide their capacity among sub-organizations they have. Or they may want to divide
the capacity based on a category or type of jobs they run. This JIRA covers the requirements
and other details to provide the above feature.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
