Return-Path:
X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org
Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id A8D3410D08
    for ; Thu, 3 Oct 2013 15:48:50 +0000 (UTC)
Received: (qmail 37423 invoked by uid 500); 3 Oct 2013 15:48:48 -0000
Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org
Received: (qmail 37329 invoked by uid 500); 3 Oct 2013 15:48:48 -0000
Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
List-Help:
List-Unsubscribe:
List-Post:
List-Id:
Reply-To: yarn-issues@hadoop.apache.org
Delivered-To: mailing list yarn-issues@hadoop.apache.org
Received: (qmail 37088 invoked by uid 99); 3 Oct 2013 15:48:45 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 03 Oct 2013 15:48:45 +0000
Date: Thu, 3 Oct 2013 15:48:45 +0000 (UTC)
From: "Alejandro Abdelnur (JIRA)"
To: yarn-issues@hadoop.apache.org
Message-ID:
In-Reply-To:
References:
Subject: [jira] [Comment Edited] (YARN-1197) Support changing resources of an allocated container
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

    [ https://issues.apache.org/jira/browse/YARN-1197?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785293#comment-13785293 ]

Alejandro Abdelnur edited comment on YARN-1197 at 10/3/13 3:47 PM:
-------------------------------------------------------------------

[~gp.leftnoteasy], thanks for your previous answer, it makes sense.

We were thinking about this a while ago in the context of Llama for Impala-Yarn integration. Along the lines of what Sandy suggested, just a couple of extra comments.

For decreasing, the AM can request the correction from the NM, effective immediately.
The NM reports the container correction and the new free space to the RM in the next heartbeat. Regarding enforcing the minimum: configuration properties are scheduler-specific, so the minimum would have to come to the NM from the RM as part of the registration response.

For increasing, the AM must go to the RM first to avoid the race conditions already mentioned. To keep the changes in the RM to a minimum, I was thinking of the following approach:

* The AM makes a regular new allocation request for the desired delta capability increase with relaxedLocality=false (no changes to the AM-RM protocol/logic).
* The AM waits for the delta container allocation from the RM.
* When the AM receives the delta container allocation, it uses a new AM-NM API to update the original container with the delta container.
* The NM makes the necessary corrections locally to the original container, adding the capabilities of the delta container.
* The NM notifies the RM to merge the original container with the delta container.
* The RM updates the original container and drops the delta container.

The complete list of changes for this approach would be:

* AM-NM API
** decreaseContainer(ContainerId original, Resources)
** increaseContainer(ContainerId original, ContainerId delta)
* NM-RM API
** decreaseContainer(ContainerId original, Resources)
** registration() -> +minimumcontainersize
** mergeContainers(ContainerId originalKeep, ContainerId deltaDiscard)
* NM logic
** needs to correct capabilities enforcement for +/- delta
* RM logic
** needs to update container resources when receiving an NM's decreaseContainer() call
** needs to update original container resources and delete delta container resources when receiving an NM's mergeContainers() call
* RM scheduler API
** it should expose methods for decreaseContainer() and mergeContainers() functionality

was (Author: tucu00):
[~gp.leftnoteasy], thanks for your previous answer, it makes sense.

We were thinking about this a while ago in the context of Llama for Impala-Yarn integration. Along the lines of what Sandy suggested, just a couple of extra comments.

For decreasing, the AM can request the correction from the NM, effective immediately. The NM reports the container correction and the new free space to the RM in the next heartbeat. Regarding enforcing the minimum: configuration properties are scheduler-specific, so the minimum would have to come to the NM from the RM as part of the registration response.

For increasing, the AM must go to the RM first to avoid the race conditions already mentioned. To keep the changes in the RM to a minimum, I was thinking of the following approach:

* The AM makes a regular new allocation request for the desired delta capability increase with relaxedLocality=false (no changes to the AM-RM protocol/logic).
* The AM waits for the delta container allocation from the RM.
* When the AM receives the delta container allocation, it uses a new AM-NM API to update the original container with the delta container.
* The NM makes the necessary corrections locally to the original container, adding the capabilities of the delta container.
* The NM notifies the RM to merge the original container with the delta container.
* The RM updates the original container and drops the delta container.
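
As a rough illustration of the increase flow in the steps above, here is a toy in-memory model in Python. It is only a sketch under stated assumptions: the class and method names (`ResourceManager`, `NodeManager`, `allocate`, `increase_container`, `merge_containers`) are hypothetical stand-ins for the calls proposed in this comment, not existing YARN APIs, and memory is the only resource modeled.

```python
from dataclasses import dataclass, field


@dataclass
class Container:
    container_id: str
    node: str
    memory_mb: int


@dataclass
class ResourceManager:
    """Toy RM (hypothetical): tracks containers and free memory per node."""
    free_mb: dict
    containers: dict = field(default_factory=dict)
    next_id: int = 1

    def allocate(self, node, memory_mb):
        # Regular allocation pinned to one node (models relaxedLocality=false).
        if self.free_mb[node] < memory_mb:
            raise RuntimeError("not enough free memory on " + node)
        self.free_mb[node] -= memory_mb
        cid = f"container_{self.next_id}"
        self.next_id += 1
        self.containers[cid] = Container(cid, node, memory_mb)
        return self.containers[cid]

    def merge_containers(self, original_keep, delta_discard):
        # Fold the delta's resources into the original and drop the delta.
        delta = self.containers.pop(delta_discard)
        self.containers[original_keep].memory_mb += delta.memory_mb


@dataclass
class NodeManager:
    """Toy NM (hypothetical): enforces per-container limits locally."""
    name: str
    rm: ResourceManager
    limits_mb: dict = field(default_factory=dict)

    def start(self, container):
        self.limits_mb[container.container_id] = container.memory_mb

    def increase_container(self, original, delta):
        # Raise the local limit, then ask the RM to merge its records.
        self.limits_mb[original.container_id] += delta.memory_mb
        self.rm.merge_containers(original.container_id, delta.container_id)


# The AM's view of the flow: allocate, start, request a delta, then merge.
rm = ResourceManager(free_mb={"node1": 8192})
nm = NodeManager("node1", rm)
original = rm.allocate("node1", 2048)
nm.start(original)
delta = rm.allocate("node1", 1024)       # the AM waits for this allocation
nm.increase_container(original, delta)   # the new AM-NM call
```

Because the delta is obtained through an ordinary allocation, the RM has already accounted for the extra memory before the NM raises the limit, which is the property that avoids the race mentioned above.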

The complete list of changes for this approach would be:

* AM-NM API
** decreaseContainer(ContainerId original, Resources)
** increaseContainer(ContainerId original, ContainerId delta)
* NM-RM API
** decreaseContainer(ContainerId original, Resources)
** registration() -> +minimumcontainersize
** mergeContainers(ContainerId originalKeep, ContainerId deltaDiscard)
* NM logic
* needs to correct capabilities enforcement for +/- delta
* RM logic
** needs to update container resources when receiving an NM's decreaseContainer() call
** needs to update original container resources and delete delta container resources when receiving an NM's mergeContainers() call
* RM scheduler API
** it should expose methods for decreaseContainer() and mergeContainers() functionality

> Support changing resources of an allocated container
> ----------------------------------------------------
>
>                 Key: YARN-1197
>                 URL: https://issues.apache.org/jira/browse/YARN-1197
>             Project: Hadoop YARN
>          Issue Type: Task
>          Components: api, nodemanager, resourcemanager
>    Affects Versions: 2.1.0-beta
>            Reporter: Wangda Tan
>         Attachments: yarn-1197.pdf
>
>
> Currently, YARN does not support merging several containers on one node into one big container, which would let us incrementally request resources, merge them into a bigger container, and launch our processes. The user scenario is described in the comments.

--
This message was sent by Atlassian JIRA
(v6.1#6144)