Message-ID: <847062510.467901266950788114.JavaMail.jira@brutus.apache.org>
Date: Tue, 23 Feb 2010 18:46:28 +0000 (UTC)
From: "Matei Zaharia (JIRA)"
To: common-issues@hadoop.apache.org
Reply-To: common-issues@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-6592) Scheduler: Pause button desirable
In-Reply-To: <617274241.454951266907048009.JavaMail.jira@brutus.apache.org>

    [ 
https://issues.apache.org/jira/browse/HADOOP-6592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12837379#action_12837379 ]

Matei Zaharia commented on HADOOP-6592:
---------------------------------------

Hi Adam,

The fair scheduler and capacity scheduler can be used to share a cluster more effectively between large and small jobs. Instead of running jobs in FIFO order as the default scheduler does, they give each job a share of the slots in the cluster. For example, if your cluster has 1000 slots and you submit one job, it gets all the slots; if you then submit a second job, each job's share becomes 500 slots (as tasks from job 1 finish, their slots are given to job 2).

These schedulers work quite well when tasks are relatively short (10 seconds to a minute); a new job can get slots within a few seconds. The only case in which you may need to do something beyond waiting for existing tasks to finish is when all your reduce slots are filled by long-running reduce tasks. For that case, the fair scheduler at least supports preemption in 0.21 and trunk (I believe the capacity scheduler had it in earlier versions but has since removed it; I could be wrong about that, though).

> Scheduler: Pause button desirable
> ---------------------------------
>
>                 Key: HADOOP-6592
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6592
>             Project: Hadoop Common
>          Issue Type: Wish
>            Reporter: Adam Kramer
>            Priority: Minor
>
> It would be lovely if, from the jobtracker page, I could click a button that's not "kill" or "fail" but ..."pause."
> The pause button would stop a certain task from starting any more mappers or reducers. They would all wait in the "pending" stage until the job is "un-paused." Currently-running tasks would continue to run, and then complete, thus freeing the resources for other jobs.
> This would help a lot for systems (esp. Hive) in which one or two jobs are hogging a lot of mappers or reducers.
> The ones they have would finish, and then other jobs could "catch up," and then they could be unpaused for a while. This would also allow for user-level throttling of their jobs in instances where they need a lot of resources but have the time to spare.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
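
The slot-sharing arithmetic in the comment (1000 slots, one job versus two) can be sketched in a few lines of Python. This is only an illustration of equal sharing, not the fair scheduler's actual code: the function `fair_shares` and its `paused` parameter are hypothetical names, and the `paused` handling simply models the requested "pause" behaviour by giving a paused job no share of slots for new tasks.

```python
def fair_shares(total_slots, job_ids, paused=()):
    """Split total_slots evenly among active (non-paused) jobs.

    Hypothetical helper for illustration only; real schedulers also
    account for pools, weights, and per-job demand.
    """
    active = [job for job in job_ids if job not in paused]
    # Paused jobs get a target of 0 slots for launching new tasks.
    shares = {job: 0 for job in job_ids}
    if not active:
        return shares
    base, extra = divmod(total_slots, len(active))
    for i, job in enumerate(active):
        # The first `extra` jobs take one more slot so no slot is left idle.
        shares[job] = base + (1 if i < extra else 0)
    return shares


print(fair_shares(1000, ["job1"]))                           # {'job1': 1000}
print(fair_shares(1000, ["job1", "job2"]))                   # {'job1': 500, 'job2': 500}
print(fair_shares(1000, ["job1", "job2"], paused={"job1"}))  # {'job1': 0, 'job2': 1000}
```

The shares are targets rather than instantaneous allocations: as the comment notes, job 2 only reaches its 500 slots as tasks from job 1 finish and release them.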