Return-Path: X-Original-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Delivered-To: apmail-hadoop-yarn-issues-archive@minotaur.apache.org Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by minotaur.apache.org (Postfix) with SMTP id 774021981F for ; Wed, 13 Apr 2016 16:00:28 +0000 (UTC) Received: (qmail 79390 invoked by uid 500); 13 Apr 2016 16:00:28 -0000 Delivered-To: apmail-hadoop-yarn-issues-archive@hadoop.apache.org Received: (qmail 79338 invoked by uid 500); 13 Apr 2016 16:00:28 -0000 Mailing-List: contact yarn-issues-help@hadoop.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: yarn-issues@hadoop.apache.org Delivered-To: mailing list yarn-issues@hadoop.apache.org Received: (qmail 79299 invoked by uid 99); 13 Apr 2016 16:00:28 -0000 Received: from arcas.apache.org (HELO arcas) (140.211.11.28) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 13 Apr 2016 16:00:28 +0000 Received: from arcas.apache.org (localhost [127.0.0.1]) by arcas (Postfix) with ESMTP id 0E3712C1F6F for ; Wed, 13 Apr 2016 16:00:28 +0000 (UTC) Date: Wed, 13 Apr 2016 16:00:28 +0000 (UTC) From: "Karthik Kambatla (JIRA)" To: yarn-issues@hadoop.apache.org Message-ID: In-Reply-To: References: Subject: [jira] [Commented] (YARN-2883) Queuing of container requests in the NM MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394 [ https://issues.apache.org/jira/browse/YARN-2883?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239499#comment-15239499 ] Karthik Kambatla commented on YARN-2883: ---------------------------------------- Discussed this offline with [~asuresh], [~kkaranasos] and [~chris.douglas]. Regarding queuing vs immediately starting guaranteed containers, it is reasonable to queue it as part of YARN-2883. In YARN-1011, we could add the option of starting them directly or using the actual utilization for determining resource availability. The logic sounds good to me. Most of my comments are cosmetic (readability/maintainability) in nature. Since this change will only be in trunk (and not branch-2), I am comfortable with checking this in to unblock other efforts. I am open to addressing some of my comments in a subsequent JIRA, and happy to contribute those changes too. Comments: # QueuingContainerManagerImpl ## I see that additions to opportContainersToKill are in the monitor. My latter comments recommend moving the helper methods in monitor to manager itself. If we don't do that, it would help to add a comment mentioning where this set is populated. ## Nit: Rename opportContainersToKill to opportunisticContainersToKill? ## Nits: In the constructor, don't need to specify the type of ConcurrentLinkedQueue ## Nit: The following code could be merged into a single line: {code} if (allocatedContInfo .getExecutionType() == ExecutionType.GUARANTEED) { {code} ## Nit: Rename killOpportContainers to killOpportunisticContainers ## Would it make sense to have a field for queuingContainerMonitor since it is used at several places? I am open to calling this containersMonitor depending on whether we choose to relax the visibility of the same name field in the parent class. # QueuingContainersMonitorImpl ## Several methods and fields seem to use utilization/usage in their names; this is misleading as it gives the impression we are looking at the utilization instead of allocation/limit. ## Would it make sense to track the aggregateContainerAllocation using ProcessTreeInfo instead of ResourceUtilization? The latter likely works better for YARN-1011, but I am fine with either. ## Would it make sense to track the aggregateContainerAllocation in ContainersMonitorImpl itself? This can be updated when we add/remove items from trackingContainers? That way, most of the helper methods in QueuingContainersMonitorImpl can just move to QueuingContainerManager, and the helper methods themselves will not need all the parameters they are passed making the code more readable. ## Nit: Rename opportContainersToKill to opportunisticContainersToKill and even more preferably to identifyOpportunisticContainersToKill? # ContainerImpl: The second change appears spurious. # ContainersMonitorImpl ## Visibility of some of the fields has been relaxed. Not all of them are required. Some of them are for tests; can we add @VisibleForTesting along with a comment about what the visibility could be if it weren't for the tests? ## Observation: The addition of onStart etc. methods is not necessary, but makes the code easier to understand. # Should the config being added be a queue length with a default of 0 instead of a boolean that we have now? I am fine with filing a follow-up to fix this up. My intent here is to limit the new configs we add and avoid redundancy. # TestQueuingContainerManager ## createContainerManager defines getRemoteUgi the exact same way as TestContainerManager. Any chance we can avoid the duplication? ## Would it make sense to define the right hasResources in the setup method itself? ## When creating a new ArrayList, don't need to specify the type. > Queuing of container requests in the NM > --------------------------------------- > > Key: YARN-2883 > URL: https://issues.apache.org/jira/browse/YARN-2883 > Project: Hadoop YARN > Issue Type: Sub-task > Components: nodemanager, resourcemanager > Reporter: Konstantinos Karanasos > Assignee: Konstantinos Karanasos > Attachments: YARN-2883-trunk.004.patch, YARN-2883-trunk.005.patch, YARN-2883-trunk.006.patch, YARN-2883-trunk.007.patch, YARN-2883-trunk.008.patch, YARN-2883-trunk.009.patch, YARN-2883-trunk.010.patch, YARN-2883-trunk.011.patch, YARN-2883-yarn-2877.001.patch, YARN-2883-yarn-2877.002.patch, YARN-2883-yarn-2877.003.patch, YARN-2883-yarn-2877.004.patch > > > We propose to add a queue in each NM, where queueable container requests can be held. > Based on the available resources in the node and the containers in the queue, the NM will decide when to allow the execution of a queued container. > In order to ensure the instantaneous start of a guaranteed-start container, the NM may decide to pre-empt/kill running queueable containers. -- This message was sent by Atlassian JIRA (v6.3.4#6332)