Date: Wed, 11 Jun 2014 15:56:03 +0000 (UTC)
From: "Jonathan Eagles (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1198) Capacity Scheduler headroom calculation does not work as expected

    [ https://issues.apache.org/jira/browse/YARN-1198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14027938#comment-14027938 ]

Jonathan Eagles commented on YARN-1198:
---------------------------------------

Since the headroom calculation is used for reducer preemption, I have seen these bugs cause queue deadlock: a multi-job queue fills up with reducers that can't finish, because the mappers can't run while reducers have higher task priority. Preemption doesn't kill reducers, since headroom falsely shows there is plenty of room in the queue for mappers to run.
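To make the failure mode concrete, here is a minimal toy sketch in Python. It is not CapacityScheduler or MapReduce AM code; the class ToyLeafQueue, the function am_would_preempt_reducers, the user name, and the resource units are all hypothetical. It assumes, as the issue description says, that an application's headroom is refreshed only when that application itself is assigned a container, and it compares that cached value with a headroom kept per user per leaf queue.

```python
from collections import defaultdict


class ToyLeafQueue:
    """Toy leaf queue with a single per-user limit (hypothetical model)."""

    def __init__(self, user_limit):
        self.user_limit = user_limit      # max units one user may hold
        self.used = defaultdict(int)      # user -> units currently in use
        self.app_user = {}                # app -> submitting user
        self.cached_headroom = {}         # app -> headroom last sent to its AM

    def user_headroom(self, user):
        # Headroom kept per user per queue: always reflects current usage.
        return max(0, self.user_limit - self.used[user])

    def submit(self, app, user):
        self.app_user[app] = user
        self.cached_headroom[app] = self.user_headroom(user)

    def allocate(self, app, units):
        user = self.app_user[app]
        self.used[user] += units
        # Assumed buggy behaviour: only the app receiving the container
        # gets a fresh headroom; other apps of the same user are not told.
        self.cached_headroom[app] = self.user_headroom(user)


def am_would_preempt_reducers(headroom, pending_map_units):
    # Simplified stand-in for the AM's decision: preempt reducers only
    # when the reported headroom cannot fit the pending mappers.
    return headroom < pending_map_units


queue = ToyLeafQueue(user_limit=10)
queue.submit("app1", "alice")
queue.submit("app2", "alice")
queue.allocate("app1", 4)   # app1 is told its headroom is now 6
queue.allocate("app2", 6)   # alice is at her limit, but app1 is not told

stale = queue.cached_headroom["app1"]   # what app1's AM still believes
shared = queue.user_headroom("alice")   # per-user, per-queue view

print(stale, am_would_preempt_reducers(stale, pending_map_units=1))
print(shared, am_would_preempt_reducers(shared, pending_map_units=1))
```

With the cached value, app1's AM believes a mapper still fits and leaves its reducers alone; with the shared per-user value it sees no room and would preempt. That is the deadlock described above, reduced to one user and two applications.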

> Capacity Scheduler headroom calculation does not work as expected
> -----------------------------------------------------------------
>
>                 Key: YARN-1198
>                 URL: https://issues.apache.org/jira/browse/YARN-1198
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Omkar Vinit Joshi
>            Assignee: Omkar Vinit Joshi
>
> Today headroom calculation (for the app) takes place only when
> * A new node is added to or removed from the cluster.
> * A new container is assigned to the application.
> However, there are potentially a lot of situations which are not considered for this calculation:
> * If a container finishes, then the headroom for that application will change and the AM should be notified accordingly.
> * If a single user has submitted multiple applications (app1 and app2) to the same queue, then
> ** If app1's container finishes, then not only app1's but also app2's AM should be notified about the change in headroom.
> ** Similarly, if a container is assigned to either application (app1 or app2), then both AMs should be notified about their headroom.
> ** To simplify the whole communication process, it is ideal to keep headroom per User per LeafQueue so that everyone gets the same picture (apps belonging to the same user and submitted to the same queue).
> * If a new user submits an application to the queue, then all applications submitted by all users in that queue should be notified of the headroom change.
> * Also, today headroom is an absolute number (I think it should be normalized, but that would not be backward compatible).
> * Also, when an admin user refreshes the queues, headroom has to be updated.
> These are all potential bugs in the headroom calculation.



--
This message was sent by Atlassian JIRA
(v6.2#6252)