hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
Date Thu, 20 Aug 2009 05:36:14 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745312#action_12745312
] 

Hemanth Yamijala commented on MAPREDUCE-824:
--------------------------------------------

Some comments:
- AbstractQueue.updateContext can move into QueueSchedulingContext, as all the state it is
operating on is in QSC. 
- prevMapClusterCapacity and prevReduceClusterCapacity can also be moved to the context.
They can be private and renamed to prev*Capacity, dropping the 'Cluster', because for container
queues they don't reflect the entire cluster capacity. The same naming change would apply to
members of QueueSchedulingContext (like setMapClusterCapacity, etc.).
- AbstractQueue.getOrderedJobQueues: it's not very clear that this looks through the entire
hierarchy. Also, it assumes that sorting has been done beforehand, so it's not a very orthogonal
API. Move this to the scheduler, and introduce a new API like AbstractQueue.getDescendentJobQueues().
- Override AbstractQueue.addChildren in JobQueue to throw an UnsupportedOperationException.
- Make AbstractQueue.getChildren package private and document it is for tests.
- I suggest we modify the algorithm in distributeUnConfiguredCapacity to follow this pattern
to make it clearer:
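{code}
// collect the children that have no configured capacity
List<Queue> unconfigured = new ArrayList<Queue>();
for (Queue q : children) {
  if (q.capacity == -1) {
    unconfigured.add(q);
  }
}

// distribute capacity for all unconfigured queues.

// recurse so that each child does the same for its own children
for (Queue q : children) {
  q.distributeUnconfiguredCapacity();
}
{code}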
- I would suggest we provide equals and hashCode in AbstractQueue, based on the queue
name. toString in AbstractQueue should print the queue name.
- I didn't understand the need for setting the capacity in conf in distributeUnconfiguredCapacity.
Requiring the Configuration instance to be passed to distributeUnconfiguredCapacity seems to
create an undesirable dependency. Can you check if we can break this dependency?
- distributeUnConfiguredCapacity will throw a divide-by-zero error if there are no queues without
a configured capacity.
- We don't need to pass the supportsPriority variable separately to the JobQueue's constructor.
Let's set that directly in the JobQueue.QueueSchedulingContext which we are already passing
to JobQueue.
- In JobQueue, methods like addWaitingJob etc. should be private. Also, I think some of the
methods can be folded. For example, makeJobRunning just calls addRunningJob, so we can refactor
to remove makeJobRunning and call addRunningJob directly.
- TaskData seems out of place in TaskSchedulingContext. The scheduling context contains state
with respect to the scheduler. TaskData is a simple abstraction that returns a view of job information
based on the task type. So, let's pull it out and call it TaskDataView, which can be extended
by MapTaskDataView and ReduceTaskDataView. There should be only one instance of each of these per
scheduler instance, and they can be obtained from the scheduler itself.
- Rename TaskSchedulingContext.add to TSC.update.
- Can we pull out the whole hierarchy building logic into a separate class, like a QueueHierarchyBuilder?
It could be given the CapacitySchedulerConf and QueueManager and have an API like buildHierarchy,
which would return the root of the queues. The capacity scheduler can thus be abstracted from
how the hierarchy is created; it just gets the hierarchy from somewhere. For example, in tests,
the hierarchy can be manually created and given to the scheduler.
- Please remove mapScheduler.initialize() and reduceScheduler.initialize().
- tsi.getMaxCapacity() < tsi.getCapacity(): this check in areTasksInQueueOverLimit does
not seem to be required, because the check is already being done in tsi.getCapacity().
- The totalCapacity modification in loadContext is a no-op, because the changes will not be
reflected in the calling method. Likewise, the check for totalCapacity > 100.0 is a no-op
in createHierarchy.
- The separator char for queues is chosen to be '.' in createHierarchy. It must be checked
that this character doesn't appear anywhere else in the queue name.
- Call to root.sort() should be from TaskSchedulingMgr.assignTasks()
- JobQueuesManager.createQueue should be addQueue. Also, it can get the queue name from the
job queue object directly, and doesn't need the extra parameter.
- JobQueuesManager.getQueueNames can be getJobQueueNames.
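
To illustrate the two distributeUnConfiguredCapacity comments above (the suggested pattern and the divide by zero), here is a minimal sketch. QueueNode, its fields, and the rule that the leftover percentage is split equally among unconfigured children are assumptions made for illustration; they are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a queue node; not the patch's AbstractQueue.
class QueueNode {
  static final float UNCONFIGURED = -1f;

  final String name;
  float capacity; // percentage of the parent's capacity, or UNCONFIGURED
  final List<QueueNode> children = new ArrayList<>();

  QueueNode(String name, float capacity) {
    this.name = name;
    this.capacity = capacity;
  }

  QueueNode addChild(QueueNode child) {
    children.add(child);
    return this;
  }

  void distributeUnconfiguredCapacity() {
    // Pass 1: collect unconfigured children and total the configured ones.
    List<QueueNode> unconfigured = new ArrayList<>();
    float configuredTotal = 0f;
    for (QueueNode q : children) {
      if (q.capacity == UNCONFIGURED) {
        unconfigured.add(q);
      } else {
        configuredTotal += q.capacity;
      }
    }

    // Guard: never divide by the count when it is zero.
    if (!unconfigured.isEmpty()) {
      // Assumed rule: split what is left equally.
      float share = (100f - configuredTotal) / unconfigured.size();
      for (QueueNode q : unconfigured) {
        q.capacity = share;
      }
    }

    // Pass 2: let every child do the same for its own children.
    for (QueueNode q : children) {
      q.distributeUnconfiguredCapacity();
    }
  }
}
```

Nothing here needs a Configuration instance: the computed capacity is stored on the node itself, which is one possible way to break the dependency questioned above.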

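The equals/hashCode/toString and addChildren comments above could look roughly like the following. NamedQueue, ContainerQueueSketch and LeafJobQueue are hypothetical simplifications standing in for AbstractQueue, a container queue and JobQueue; the real classes in the patch carry much more state.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical simplification of AbstractQueue.
abstract class NamedQueue {
  private final String queueName;
  private final List<NamedQueue> children = new ArrayList<>();

  NamedQueue(String queueName) {
    this.queueName = Objects.requireNonNull(queueName);
  }

  void addChildren(NamedQueue child) {
    children.add(child);
  }

  // Package private; meant for tests only.
  List<NamedQueue> getChildren() {
    return children;
  }

  // Identity is the queue name alone.
  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof NamedQueue)) return false;
    return queueName.equals(((NamedQueue) o).queueName);
  }

  @Override
  public int hashCode() {
    return queueName.hashCode();
  }

  @Override
  public String toString() {
    return queueName;
  }
}

// Hypothetical container queue: inherits the default addChildren.
class ContainerQueueSketch extends NamedQueue {
  ContainerQueueSketch(String queueName) {
    super(queueName);
  }
}

// Hypothetical leaf queue: holds jobs, so it rejects children.
class LeafJobQueue extends NamedQueue {
  LeafJobQueue(String queueName) {
    super(queueName);
  }

  @Override
  void addChildren(NamedQueue child) {
    throw new UnsupportedOperationException(
        "Job queue " + this + " cannot have children");
  }
}
```

One consequence of name-based equality is that two queue objects with the same name compare equal even if one is a container and the other a leaf, so this only works if queue names are unique across the hierarchy.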

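For the separator comment above, a check along these lines would do. QueuePathSketch and its join method are hypothetical names used only for this illustration; the patch may build hierarchical names differently.

```java
import java.util.List;

// Hypothetical helper; not part of the patch.
final class QueuePathSketch {
  static final char SEPARATOR = '.';

  private QueuePathSketch() {}

  // Joins queue names into a hierarchical name such as "org.team",
  // rejecting any single name that is empty or contains the separator.
  static String join(List<String> components) {
    StringBuilder path = new StringBuilder();
    for (String name : components) {
      if (name.isEmpty() || name.indexOf(SEPARATOR) >= 0) {
        throw new IllegalArgumentException(
            "Invalid queue name: '" + name + "'");
      }
      if (path.length() > 0) {
        path.append(SEPARATOR);
      }
      path.append(name);
    }
    return path.toString();
  }
}
```

Rejecting the separator up front keeps the mapping between a hierarchical name and its position in the tree unambiguous.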
> Support a hierarchy of queues in the capacity scheduler
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-824
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>         Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, HADOOP-824-3.patch
>
>
> Currently in Capacity Scheduler, cluster capacity is divided among the queues based on
the queue capacity. These queues typically represent an organization and the capacity of the
queue represents the capacity the organization is entitled to. Most organizations are large
and need to divide their capacity among sub-organizations they have. Or they may want to divide
the capacity based on a category or type of jobs they run. This JIRA covers the requirements
and other details to provide the above feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

