Date: Sun, 30 Aug 2015 04:50:46 +0000 (UTC)
From: "Shiwei Guo (JIRA)"
To: yarn-issues@hadoop.apache.org
Reply-To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-3933) Race condition when calling AbstractYarnScheduler.completedContainer.

     [ https://issues.apache.org/jira/browse/YARN-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Shiwei Guo updated YARN-3933:
-----------------------------
    Attachment: YARN-3933.001.patch

> Race condition when calling AbstractYarnScheduler.completedContainer.
> ---------------------------------------------------------------------
>
>                 Key: YARN-3933
>                 URL: https://issues.apache.org/jira/browse/YARN-3933
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.6.0, 2.7.0, 2.5.2, 2.7.1
>            Reporter: Lavkesh Lahngir
>            Assignee: Shiwei Guo
>              Labels: patch
>         Attachments: YARN-3933.001.patch
>
>
> In our cluster we are seeing negative values for available memory and cores.
> Initial inspection:
> Scenario no. 1:
> In the capacity scheduler, the method allocateContainersToNode() checks whether an application holds excess container reservations. If they are no longer needed, it calls queue.completedContainer(), which drives the resource counts negative, even though those resources were never assigned in the first place.
> I am still looking through the code. Can somebody suggest how to simulate excess container assignments?

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
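
Scenario 1 can be pictured with a small, self-contained sketch. This is not the CapacityScheduler code or the attached patch: QueueAccounting, releaseUnguarded and releaseGuarded are hypothetical names made up for illustration, and the sketch tracks a single used-memory counter, whereas which counter goes negative in a real cluster depends on how the metrics are derived. It only shows that a release path which runs for a container that was never charged (or runs twice for the same container) pushes the accounting below zero, and that checking whether the container is still tracked before subtracting avoids this.

```java
import java.util.HashSet;
import java.util.Set;

public class ReleaseAccountingSketch {

    // Hypothetical stand-in for a queue's resource bookkeeping; not a YARN class.
    public static class QueueAccounting {
        private long usedMemoryMb = 0;
        private final Set<String> liveContainers = new HashSet<>();

        // Charge the queue once per container id.
        public synchronized void allocate(String containerId, long memoryMb) {
            if (liveContainers.add(containerId)) {
                usedMemoryMb += memoryMb;
            }
        }

        // Unguarded release: subtracts even if the container was never
        // allocated or has already been released.
        public synchronized void releaseUnguarded(String containerId, long memoryMb) {
            liveContainers.remove(containerId);
            usedMemoryMb -= memoryMb;
        }

        // Guarded release: subtracts only if the container is still tracked,
        // so a second (or spurious) call is a no-op.
        public synchronized boolean releaseGuarded(String containerId, long memoryMb) {
            if (!liveContainers.remove(containerId)) {
                return false;
            }
            usedMemoryMb -= memoryMb;
            return true;
        }

        public synchronized long getUsedMemoryMb() {
            return usedMemoryMb;
        }
    }

    public static void main(String[] args) {
        QueueAccounting unguarded = new QueueAccounting();
        unguarded.allocate("container_1", 1024);
        unguarded.releaseUnguarded("container_1", 1024);
        unguarded.releaseUnguarded("container_1", 1024); // duplicate completion
        System.out.println("unguarded used MB: " + unguarded.getUsedMemoryMb()); // -1024

        QueueAccounting guarded = new QueueAccounting();
        guarded.allocate("container_1", 1024);
        guarded.releaseGuarded("container_1", 1024);
        guarded.releaseGuarded("container_1", 1024); // ignored
        System.out.println("guarded used MB: " + guarded.getUsedMemoryMb()); // 0
    }
}
```

The guard here is keyed on the container id. Whether the real scheduler should key on the container, the reservation, or some other state is a design question for the actual fix and is not settled by this sketch.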