Date: Tue, 14 Jun 2016 01:17:58 +0000 (UTC)
From: "Bibin A Chundatt (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Comment Edited] (YARN-4618) RM Stops allocating containers if large number of pending containers

    [ https://issues.apache.org/jira/browse/YARN-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15328754#comment-15328754 ]

Bibin A Chundatt edited comment on YARN-4618 at 6/14/16 1:17 AM:
-----------------------------------------------------------------

[~leftnoteasy] Can we close this jira, since YARN-4844 implemented the same?

was (Author: bibinchundatt):
[~leftnoteasy] Can we close this jira since YARN-4844 took implemented the same

> RM Stops allocating containers if large number of pending containers
> --------------------------------------------------------------------
>
>                 Key: YARN-4618
>                 URL: https://issues.apache.org/jira/browse/YARN-4618
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>
> In one of the tests we found that when the RM has a very large number of pending container requests to serve, it stops assigning containers.
> The simulated cluster has 100 TB.
> Root total = 60k containers =
> Queue 1 = 30k containers = 1328800000 MB
> Queue 2 = 30k containers = 1428800000 MB
> Each container request is for 40 GB.
> {{ParentQueue#assignContainers}} is as below
> {noformat}
>     // Check if this queue need more resource, simply skip allocation if this
>     // queue doesn't need more resources.
>     if (!super.hasPendingResourceRequest(node.getPartition(),
>         clusterResource, schedulingMode)) {
>       if (LOG.isDebugEnabled()) {
>         LOG.debug("Skip this queue=" + getQueuePath()
>             + ", because it doesn't need more resource, schedulingMode="
>             + schedulingMode.name() + " node-partition=" + node.getPartition());
>       }
>       return CSAssignment.NULL_ASSIGNMENT;
>     }
> {noformat}
> When the pending resource exceeds MAX VALUE it becomes *negative* ({{- 167XXXXXXX MB}}), and NULL_ASSIGNMENT is always returned.
> Tool used to test: SLS.
> When checking for a pending resource request, we should first check whether any pending containers (from getMetrics()) are waiting to be served. If pending containers are available, return true; otherwise fall through to the other checks for increase requests.
> Thoughts ??
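
The failure mode and the proposed check order described in the report can be sketched roughly as follows. This is a minimal illustration, not the CapacityScheduler code: it assumes the pending memory is accumulated in a 32-bit signed int of megabytes (the report only says the value goes negative past "MAX VALUE"), and every class and method name here is hypothetical. The two queue figures are taken from the report.

```java
// Hypothetical sketch: names and the int-based counter are assumptions,
// not taken from the Hadoop source.
final class PendingResourceOverflowSketch {

    // Accumulates pending memory in MB using int arithmetic.
    // Plain int addition wraps around silently on overflow.
    static int addPendingMb(int currentMb, int deltaMb) {
        return currentMb + deltaMb;
    }

    // A memory-only check: a wrapped (negative) total looks like "nothing pending".
    static boolean hasPendingByMemory(int pendingMb) {
        return pendingMb > 0;
    }

    // Ordering suggested in the report: look at the pending container count
    // first, and only fall back to the memory-based check if it is zero.
    static boolean hasPendingRequest(int pendingContainers, int pendingMb) {
        if (pendingContainers > 0) {
            return true;
        }
        return hasPendingByMemory(pendingMb);
    }

    public static void main(String[] args) {
        int queue1Mb = 1328800000; // Queue 1 figure from the report
        int queue2Mb = 1428800000; // Queue 2 figure from the report

        int wrappedTotalMb = addPendingMb(queue1Mb, queue2Mb);
        long exactTotalMb = (long) queue1Mb + queue2Mb;

        System.out.println("int total (wrapped): " + wrappedTotalMb);
        System.out.println("long total (exact):  " + exactTotalMb);
        System.out.println("memory-only check:   " + hasPendingByMemory(wrappedTotalMb));
        System.out.println("containers first:    " + hasPendingRequest(60000, wrappedTotalMb));
    }
}
```

Under these assumptions the memory-only check reports no pending work even though 60k containers are waiting, while the container-count check still returns true. Widening the counter to long, or using `Math.addExact` (which throws `ArithmeticException` on overflow), would be alternative ways to keep the wraparound from going unnoticed.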
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-issues-help@hadoop.apache.org