Mailing-List: contact core-dev-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: core-dev@hadoop.apache.org
Delivered-To: mailing list core-dev@hadoop.apache.org
Message-ID: <1341927188.1226656964706.JavaMail.jira@brutus>
Date: Fri, 14 Nov 2008 02:02:44 -0800 (PST)
From: "Amar Kamat (JIRA)"
To: core-dev@hadoop.apache.org
Subject: [jira] Commented: (HADOOP-4658) User limit is not expanding back properly.
In-Reply-To: <1558316807.1226651204195.JavaMail.jira@brutus>
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit

    [ https://issues.apache.org/jira/browse/HADOOP-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12647558#action_12647558 ]

Amar Kamat commented on HADOOP-4658:
------------------------------------

Here is the scenario.

_Setup:_ j1 and j2 are two long-running jobs submitted by user1 and user2 respectively. j3 and j4 are two shorter jobs submitted by user3 and user4 respectively. Only the "default" queue is used.

_Analysis:_

||jobs||num-running-tasks||limit||who will schedule||comment||
|j1, j2, j3, j4|25, 25, 25, 25|25|j1|everyone is over limit|
|j1, j2, j3*, j4|40, 25, 10, 25|25|j1|everyone is over limit and job3 is finishing off|
|j1, j2, j4|50, 25, 25|33|j2|only user1 is over limit as the limit is redefined|
|j1, j2, j4*|50, 33, 17|33|j1|everyone is over limit and job4 is finishing off|
|j1, j2|67, 33|50|j2|user1 is over limit as the limit is redefined|

Note that j1 might actually have only 50 maps to run, but with this setup it can run 67. So it can go ahead and run 17 speculative tasks instead of job2 running its 17 genuine tasks.

With the proposed change:

||jobs||num-running-tasks||limit||who will schedule||comment||
|j1, j2, j3, j4|25, 25, 25, 25|25|j1|everyone is over limit|
|j1, j2, j3*, j4|33, 25, 17, 25|33|j2|user1 is over limit as the limit is redefined and job3 is finishing off; slots will be equally divided|
|j1, j2, j3*, j4|33, 33, 9, 25|33|j1|everyone is over limit and job3 is finishing off|
|j1, j2, j4|42, 33, 25|33|j1|everyone is over limit|
|j1, j2, j4*|50, 33, 17|50|j2|user1 is over limit as the limit is redefined and job4 is finishing off|
|j1, j2|50, 50|50|j1|everyone is over limit|

> User limit is not expanding back properly.
> ------------------------------------------
>
>                 Key: HADOOP-4658
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4658
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>         Environment: GC=100% nodes=104, map_capacity=208, reduce_capacity=208, user-limit=25%;
>            Reporter: Karam Singh
>            Assignee: Amar Kamat
>
> User limit is not expanding back properly.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
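
The "limit" column in the two analysis tables of the comment can be reproduced with a small sketch. This is not the capacity scheduler's code. It assumes the per-user limit is the larger of an equal share of the slots and the configured minimum percentage (rounded down, which matches the 25/33/50 values in the tables), that the tables use a capacity of 100 slots for illustration, and that the proposed change amounts to not counting users whose only job is finishing off. The names `Job`, `user_limit`, and `limit_for` are hypothetical.

```python
from typing import List, NamedTuple


class Job(NamedTuple):
    user: str
    running_tasks: int
    finishing: bool  # True when the job has no more tasks left to schedule


def user_limit(capacity: int, num_users: int, min_percent: int) -> int:
    """Assumed rule: equal share of the slots, never below the minimum percentage."""
    if num_users <= 0:
        return capacity
    equal_share = capacity // num_users          # rounded down, as in the tables
    guaranteed = capacity * min_percent // 100   # e.g. user-limit=25%
    return max(equal_share, guaranteed)


def limit_for(jobs: List[Job], capacity: int, min_percent: int,
              count_finishing: bool) -> int:
    """Limit with (current behaviour) or without (proposed) finishing jobs counted."""
    users = {j.user for j in jobs if count_finishing or not j.finishing}
    return user_limit(capacity, len(users), min_percent)


# Second row of the first table: job3 (user3) is finishing off.
jobs = [
    Job("user1", 40, False),
    Job("user2", 25, False),
    Job("user3", 10, True),
    Job("user4", 25, False),
]

print(limit_for(jobs, 100, 25, count_finishing=True))   # 25: user3 still counted
print(limit_for(jobs, 100, 25, count_finishing=False))  # 33: user3 ignored
```

Under these assumptions the limit expands from 25 to 33 as soon as job3 starts finishing off, rather than only after it has completed, which is what lets user2 pick up slots before user1 runs well past its share.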