Date: Tue, 1 Sep 2015 15:13:46 +0000 (UTC)
From: "MENG DING (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-1651) CapacityScheduler side changes to support increase/decrease container resource.

    [ https://issues.apache.org/jira/browse/YARN-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14725533#comment-14725533 ]

MENG DING commented on YARN-1651:
---------------------------------

Hi, [~leftnoteasy], thanks so much for posting the patch. I do have one question regarding it.

Recall that during the design discussion, we agreed that as long as an increase has not yet completed for a container, we should not process any other increase/decrease requests for the same container.
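To make the agreed rule concrete, here is a minimal sketch of one way to enforce it. This is not code from the patch or from YARN: the class ResizeGate, its methods, and the choice to defer (rather than reject) requests that arrive during an in-flight increase are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch (not YARN code): at most one in-flight increase per container. */
class ResizeGate {
  enum Kind { INCREASE, DECREASE }
  enum Outcome { PROCESSED, DEFERRED }

  /** A resize request; memoryMb is an illustrative target size. */
  static final class Request {
    final Kind kind;
    final int memoryMb;

    Request(Kind kind, int memoryMb) {
      this.kind = kind;
      this.memoryMb = memoryMb;
    }
  }

  // Containers whose increase token has been issued but not yet confirmed or expired.
  private final Set<String> increaseInFlight = new HashSet<>();
  // Requests held back, per container, while an increase is in flight.
  private final Map<String, Deque<Request>> deferred = new HashMap<>();

  /** Processes the request unless an increase is still in flight for the container. */
  Outcome submit(String containerId, Request req) {
    if (increaseInFlight.contains(containerId)) {
      deferred.computeIfAbsent(containerId, k -> new ArrayDeque<>()).addLast(req);
      return Outcome.DEFERRED;
    }
    if (req.kind == Kind.INCREASE) {
      // From here until completion, every other resize for this container is deferred.
      increaseInFlight.add(containerId);
    }
    return Outcome.PROCESSED;
  }

  /**
   * Marks the in-flight increase as finished (assumed to be called when the NM
   * reports the new size or the token expires) and returns the deferred
   * requests, in arrival order, for the caller to pass to submit() again.
   */
  List<Request> increaseCompleted(String containerId) {
    increaseInFlight.remove(containerId);
    Deque<Request> held = deferred.remove(containerId);
    return held == null ? new ArrayList<>() : new ArrayList<>(held);
  }
}
```

With a gate like this, a second increase or a decrease for the same container is only held back while the first increase is outstanding; requests for other containers are unaffected.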
It seems that this patch will still process decrease/increase requests even while an increase action is ongoing? Consider the following sequence of events:

Example 1:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token to AM
3. AM sends decrease request to RM for the same container
4. AM uses the increase token to increase resource on NM
5. NM reports container status back to RM

IIUC, at step 3, this patch will decrease the container size and remove the container from the allocation expirer. At step 5, this patch will see that the RM container size is smaller than the reported NM container size, and will tell NM to decrease the container resource. The concern I have with this approach is that at step 4 the user will think the increase has succeeded on the NM, but in fact it will not last, because the NM is then told to decrease the container.

Also, what will happen in the following sequence of events?

Example 2:
1. AM sends container increase request to RM
2. RM allocates the resource and gives out increase token (token1) to AM
3. AM sends a new container increase request for the same container to RM with more resource
4. RM allocates the resource and gives out increase token (token2) to AM
5. AM uses token1 (the one with smaller size) to increase resource on NM, but not token2

IIUC, when RM receives the increase report from NM, it will find that the RM container size is larger than the reported NM container size and do nothing about it. Later on, when token2 expires, the entire container will be killed according to the current implementation. I think this behavior could be quite confusing to the user.

IMHO, at least for the case in Example 2, we should delay processing of the second increase request until the first increase action is completed.

> CapacityScheduler side changes to support increase/decrease container resource.
> -------------------------------------------------------------------------------
>
>                 Key: YARN-1651
>                 URL: https://issues.apache.org/jira/browse/YARN-1651
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager, scheduler
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-1651-1.YARN-1197.patch, YARN-1651-WIP.YARN-1197.patch