Date: Thu, 21 Jan 2016 07:32:39 +0000 (UTC)
From: "Naganarasimha G R (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4618) RM Stops allocating containers if large number of pending containers

    [ https://issues.apache.org/jira/browse/YARN-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15110218#comment-15110218 ]

Naganarasimha G R commented on YARN-4618:
-----------------------------------------

[~hex108], what is the maximum number of containers run in your cluster? I remember it being a very large value. Have you faced a problem like this anywhere?

> RM Stops allocating containers if large number of pending containers
> --------------------------------------------------------------------
>
>                 Key: YARN-4618
>                 URL: https://issues.apache.org/jira/browse/YARN-4618
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>
> In one of our tests we found that when the RM has a very large number of pending container requests to serve, it stops assigning containers.
> Root total = 6 lakhs containers =
> Queue 1 = 3 lakh containers = 1328800000 MB
> Queue 2 = 3+ lakh containers = 1428800000 MB
> Each container request is for 4 GB.
> {{ParentQueue#assignContainers}} is as below:
> {noformat}
>     // Check if this queue need more resource, simply skip allocation if this
>     // queue doesn't need more resources.
>     if (!super.hasPendingResourceRequest(node.getPartition(),
>         clusterResource, schedulingMode)) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Skip this queue=" + getQueuePath()
>             + ", because it doesn't need more resource, schedulingMode="
>             + schedulingMode.name() + " node-partition=" + node.getPartition());
>       }
>       return CSAssignment.NULL_ASSIGNMENT;
>     }
> {noformat}
> When the pending resource exceeds MAX VALUE, it becomes *negative* ({{- 167XXXXXXX MB}}) and NULL_ASSIGNMENT is always returned.
> The tool used for testing was SLS.
> When checking for a pending resource request, we should first check whether any pending containers (from getMetrics()) are still to be served. If pending containers are available, return true; otherwise fall through to the other check for increase requests.
> Thoughts?
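
For readers following the arithmetic, here is a minimal standalone sketch of the wraparound described in the issue. It assumes pending memory is accumulated in a 32-bit int; the class and method names (PendingResourceOverflow, totalPendingMbInt, totalPendingMbLong, hasPendingRequest) are hypothetical and are not part of the YARN code base. It uses the two queue figures quoted above, so it does not reproduce the exact negative value seen in the test, only the sign flip.

```java
public class PendingResourceOverflow {

    // Hypothetical helper: sums pending memory (MB) in a 32-bit int.
    // Java int addition wraps silently once the sum passes Integer.MAX_VALUE.
    static int totalPendingMbInt(int... queuePendingMb) {
        int total = 0;
        for (int mb : queuePendingMb) {
            total += mb;
        }
        return total;
    }

    // Same sum accumulated in a 64-bit long, which does not wrap for these inputs.
    static long totalPendingMbLong(int... queuePendingMb) {
        long total = 0L;
        for (int mb : queuePendingMb) {
            total += mb;
        }
        return total;
    }

    // Hypothetical version of the check proposed in the issue: treat the queue
    // as having pending work if any containers are pending, and only otherwise
    // fall back to the memory comparison.
    static boolean hasPendingRequest(int pendingContainers, long pendingMb) {
        return pendingContainers > 0 || pendingMb > 0;
    }

    public static void main(String[] args) {
        int queue1Mb = 1_328_800_000; // Queue 1 figure from the issue
        int queue2Mb = 1_428_800_000; // Queue 2 figure from the issue

        int wrapped = totalPendingMbInt(queue1Mb, queue2Mb);
        long exact = totalPendingMbLong(queue1Mb, queue2Mb);

        System.out.println("int sum  = " + wrapped);
        System.out.println("long sum = " + exact);
        System.out.println("memory-only check sees pending work: " + (wrapped > 0));
        System.out.println("container-count check sees pending work: "
                + hasPendingRequest(600_000, wrapped));
    }
}
```

With the int accumulator the total goes negative, so a check of the form "pending > 0" reports no pending work even though hundreds of thousands of containers are waiting. Checking the pending container count first, as suggested in the issue, sidesteps the sign flip; widening the accumulator to long is another option.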