hadoop-mapreduce-issues mailing list archives

From "Hemanth Yamijala (JIRA)" <j...@apache.org>
Subject [jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
Date Tue, 25 Aug 2009 03:27:59 GMT

    [ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12747207#action_12747207

Hemanth Yamijala commented on MAPREDUCE-824:

This is getting better. I do have some more feedback:

- updateStatsOnRunningJob, addRunningJob, removeRunningJob, removeWaitingJob: make these private.
- The ASF license header should come first in the source file.
- Replace sortJobQueues with inline method.
- QueueHierarchyBuilder is creating a new instance of the CapacityTaskScheduler, which is
- static builder instance also seems unnecessary.
- In QueueHierarchyBuilder, when checking for the separator char, the IllegalArgumentException
must include the name of the queue that failed the check.
- Discuss: Back dependency between QueueHierarchyBuilder and Scheduler. Can this be avoided?
- AbstractQueue does not override equals, although hashCode is overridden. Also, the toString
API was previously printing other information. I'd only asked for the name of the queue to be
prepended to it, not for the other information to be removed.
- It is a little confusing that the number of slots being asserted after task assignment does
not include the currently scheduled task. I recommend moving the asserts before the assignment.
- Root should always be set up in one specific way. I would recommend a single static instance
of root, which is always obtained from the capacity scheduler, even in tests.
- In testMaxCapacity, rt.update in tests should send in the capacity of the clusters to be
in sync.
- getTaskDataView() need not be in TaskSchedulingContext. Since it is static, it can be called
directly from other classes like the scheduler, passing the type.
- AbstractQueue.addChildren should be addChild.
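
To make the equals/hashCode, toString, and separator-check points above concrete, here is a minimal sketch. It is not code from the patch: SketchQueue, its fields, and the '.' separator are hypothetical stand-ins, and I am assuming that a simple queue name must not contain the separator.

```java
import java.util.Objects;

// Hypothetical stand-in for AbstractQueue; none of these names come from the patch.
class SketchQueue {
    // Assumed separator between levels of a hierarchical queue name.
    static final char SEPARATOR = '.';

    private final String name;
    private final float capacityPercent;

    SketchQueue(String name, float capacityPercent) {
        Objects.requireNonNull(name, "name");
        if (name.indexOf(SEPARATOR) >= 0) {
            // Name the offending queue so the failure is easy to trace.
            throw new IllegalArgumentException(
                "Queue name '" + name + "' must not contain '" + SEPARATOR + "'");
        }
        this.name = name;
        this.capacityPercent = capacityPercent;
    }

    // equals and hashCode are overridden together and use the same field.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof SketchQueue)) {
            return false;
        }
        return name.equals(((SketchQueue) other).name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }

    // The queue name is prepended; the other information is kept.
    @Override
    public String toString() {
        return name + ": capacityPercent=" + capacityPercent;
    }
}
```

Basing equality on the name alone is a design choice for the sketch; the real class may need to compare more state.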

Some of the earlier comments have not been addressed:
- APIs in JobQueuesManager and JobQueue can still be folded.
- mapTSI and reduceTSI member variables of JobQueue are not needed.
- AbstractQueue.getChildren is still public.
- getCapacity() should not return max capacity at any time. It should always return the current
capacity or limit, whichever is smaller.
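
For the getCapacity() point, the behaviour I have in mind can be sketched as follows. CapacitySketch and its parameter names are hypothetical, and I am assuming both values are expressed in the same unit (for example, slots).

```java
// Hypothetical helper; not part of the patch or of the scheduler's API.
class CapacitySketch {
    // Returns the smaller of the current capacity and the limit,
    // and never substitutes the max capacity for either.
    static int effectiveCapacity(int currentCapacity, int limit) {
        return Math.min(currentCapacity, limit);
    }
}
```

With this shape, callers that need the max capacity would have to ask for it explicitly through a separate accessor.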

> Support a hierarchy of queues in the capacity scheduler
> -------------------------------------------------------
>                 Key: MAPREDUCE-824
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>         Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, HADOOP-824-3.patch, HADOOP-824-4.patch,
> Currently in Capacity Scheduler, cluster capacity is divided among the queues based on
> the queue capacity. These queues typically represent an organization and the capacity of the
> queue represents the capacity the organization is entitled to. Most organizations are large
> and need to divide their capacity among sub-organizations they have. Or they may want to divide
> the capacity based on a category or type of jobs they run. This JIRA covers the requirements
> and other details to provide the above feature.
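
The kind of division described in the issue can be sketched roughly as below. This is only an illustration under my own assumptions: QueueNode, percentOfParent, and distribute are hypothetical names, each child's capacity is taken as an integer percentage of its parent's slots, and queue paths are joined with '.'. The actual design in the patch may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tree node; not a class from the capacity scheduler.
class QueueNode {
    private final String name;
    private final int percentOfParent;
    private final List<QueueNode> children = new ArrayList<>();

    QueueNode(String name, int percentOfParent) {
        this.name = name;
        this.percentOfParent = percentOfParent;
    }

    // Adds one child and returns this node so calls can be chained.
    QueueNode addChild(QueueNode child) {
        children.add(child);
        return this;
    }

    // Maps each queue path to its share of the given slots, in depth-first order.
    Map<String, Integer> distribute(int parentSlots) {
        Map<String, Integer> result = new LinkedHashMap<>();
        collect("", parentSlots, result);
        return result;
    }

    private void collect(String prefix, int parentSlots, Map<String, Integer> out) {
        String path = prefix.isEmpty() ? name : prefix + "." + name;
        // Integer arithmetic: fractional slots are truncated.
        int slots = parentSlots * percentOfParent / 100;
        out.put(path, slots);
        for (QueueNode child : children) {
            child.collect(path, slots, out);
        }
    }
}
```

For example, a root at 100 percent over 200 slots, with an organization at 60 percent that splits evenly between two job categories, gives the organization 120 slots and each category 60.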

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
