Message-ID: <286892169.1250746574830.JavaMail.jira@brutus>
Date: Wed, 19 Aug 2009 22:36:14 -0700 (PDT)
From: "Hemanth Yamijala (JIRA)"
To: mapreduce-issues@hadoop.apache.org
Reply-To: mapreduce-issues@hadoop.apache.org
Subject: [jira] Commented: (MAPREDUCE-824) Support a hierarchy of queues in the capacity scheduler
In-Reply-To: <1546817096.1249393155507.JavaMail.jira@brutus>

    [ https://issues.apache.org/jira/browse/MAPREDUCE-824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745312#action_12745312 ]

Hemanth Yamijala commented on MAPREDUCE-824:
--------------------------------------------

Some comments:

- AbstractQueue.updateContext can move into QueueSchedulingContext, as all the state it operates on is in QSC.
- prevMapClusterCapacity and prevReduceClusterCapacity can also be moved to the context. They can be private, and renamed to prev*Capacity, dropping the 'Cluster' because for container queues they don't reflect the entire cluster capacity. The same naming change would apply to the names in QueueSchedulingContext (like setMapClusterCapacity, etc.).
- AbstractQueue.getOrderedJobQueues: it's not very clear that this looks through the entire hierarchy. Also, it assumes that sorting is done before it is called. So, it's not a very orthogonal API. Move this to the scheduler, and introduce a new API like AbstractQueue.getDescendentJobQueues().
- Override AbstractQueue.addChildren in JobQueue to throw an unsupported exception.
- Make AbstractQueue.getChildren package private and document that it is for tests.
- I suggest we modify the algorithm in distributeUnConfiguredCapacity to follow this pattern to make it clearer:
{code}
// First pass: collect the children that have no configured capacity.
for (Queue q : children) {
  if (q.capacity == -1) {
    unconfigured.add(q);
  }
}
// distribute capacity for all unconfigured queues.
// Second pass: let every child repeat the process for its own children.
for (Queue q : children) {
  q.distributeUnconfiguredCapacity();
}
{code}
- I would suggest we provide equals and hashCode in AbstractQueue based on the queue name. toString in AbstractQueue should print the queue name.
- I didn't understand the need for setting the capacity in conf in distributeUnconfiguredCapacity. Requiring the Configuration instance to be passed to distributeUnconfiguredCapacity seems to create an undesirable dependency. Can you check if we can break this dependency?
- distributeUnConfiguredCapacity will throw a divide-by-zero error if there are no queues without configured capacity.
- We don't need to pass the supportsPriority variable separately to the JobQueue's constructor. Let's set that directly in the JobQueue.QueueSchedulingContext which we are already passing to JobQueue.
- In JobQueue, methods like addWaitingJob etc. should be private. Also, I think some of the methods can be folded. For example, makeJobRunning just calls addRunningJob, so we can refactor to remove makeJobRunning and call addRunningJob directly.
- TaskData seems out of place in TaskSchedulingContext. The scheduling context contains state with respect to the scheduler. TaskData is a simple abstraction that returns a view of job information based on the task type. So, let's pull it out and call it TaskDataView, which can be extended by MapTaskDataView and ReduceTaskDataView. There should be only one instance of each per scheduler instance, and they can be obtained from the scheduler itself.
- Rename TaskSchedulingContext.add to TSC.update.
- Can we pull out the whole hierarchy building logic into a separate class, like a QueueHierarchyBuilder? It could be given the CapacitySchedulerConf and QueueManager and have an API like buildHierarchy, which would return the root of the queues. The capacity scheduler can thus be abstracted from how the hierarchy is created; it just gets the hierarchy from somewhere. For example, in tests, the hierarchy can be created manually and given to the scheduler.
- Please remove mapScheduler.initialize() and reduceScheduler.initialize().
- tsi.getMaxCapacity() < tsi.getCapacity(): this check in areTasksInQueueOverLimit does not seem to be required, because the check is already being done in tsi.getCapacity().
- The totalCapacity modification in loadContext is a no-op, because the changes will not be reflected in the caller method. Likewise, the check for totalCapacity > 100.0 is a no-op in createHierarchy.
- The separator char for queues is chosen to be '.' in createHierarchy. It must be checked that this character doesn't appear anywhere else in the queue name.
- Call to root.sort() should be from TaskSchedulingMgr.assignTasks()
- JobQueuesManager.createQueue should be addQueue. Also, it can get the queue name from the job queue object directly, and doesn't need the extra parameter.
- JobQueueManager.getQueueNames can be getJobQueueNames.

> Support a hierarchy of queues in the capacity scheduler
> -------------------------------------------------------
>
>                 Key: MAPREDUCE-824
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-824
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: contrib/capacity-sched
>            Reporter: Hemanth Yamijala
>         Attachments: HADOOP-824-1.patch, HADOOP-824-2.patch, HADOOP-824-3.patch
>
>
> Currently in Capacity Scheduler, cluster capacity is divided among the queues based on the queue capacity. These queues typically represent an organization and the capacity of the queue represents the capacity the organization is entitled to. Most organizations are large and need to divide their capacity among sub-organizations they have. Or they may want to divide the capacity based on a category or type of jobs they run. This JIRA covers the requirements and other details to provide the above feature.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
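
For illustration only, the two review comments on distributeUnConfiguredCapacity (the two-pass pattern and the divide-by-zero case) could be combined roughly as in the sketch below. This is not code from the patch: SketchQueue, its fields, and the UNCONFIGURED constant are hypothetical stand-ins for AbstractQueue, and the sketch assumes that capacity is a percentage of the parent's capacity and that unconfigured children split the remainder equally.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for AbstractQueue; not the class from the patch. */
class SketchQueue {
    /** Assumed marker for "no capacity configured", matching the -1 in the review. */
    static final float UNCONFIGURED = -1f;

    final String name;
    float capacity; // assumed: percentage of the parent's capacity, or UNCONFIGURED
    final List<SketchQueue> children = new ArrayList<>();

    SketchQueue(String name, float capacity) {
        this.name = name;
        this.capacity = capacity;
    }

    /** Adds a child and returns this queue so calls can be chained. */
    SketchQueue add(SketchQueue child) {
        children.add(child);
        return this;
    }

    void distributeUnconfiguredCapacity() {
        // First pass: collect unconfigured children and sum the configured ones.
        List<SketchQueue> unconfigured = new ArrayList<>();
        float configuredTotal = 0f;
        for (SketchQueue q : children) {
            if (q.capacity == UNCONFIGURED) {
                unconfigured.add(q);
            } else {
                configuredTotal += q.capacity;
            }
        }
        // Guard: divide only when at least one child is unconfigured.
        if (!unconfigured.isEmpty()) {
            float share = (100f - configuredTotal) / unconfigured.size();
            for (SketchQueue q : unconfigured) {
                q.capacity = share;
            }
        }
        // Second pass: every child repeats the process for its own children.
        for (SketchQueue q : children) {
            q.distributeUnconfiguredCapacity();
        }
    }

    public static void main(String[] args) {
        SketchQueue root = new SketchQueue("root", 100f);
        root.add(new SketchQueue("a", 50f))
            .add(new SketchQueue("b", UNCONFIGURED))
            .add(new SketchQueue("c", UNCONFIGURED));
        root.distributeUnconfiguredCapacity();
        for (SketchQueue q : root.children) {
            System.out.println(q.name + " -> " + q.capacity);
        }
    }
}
```

The sketch deliberately takes no Configuration argument, in line with the comment about breaking that dependency. It does not validate that the configured capacities sum to at most 100; where that check belongs is left open here.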