Date: Fri, 9 Oct 2015 01:47:27 +0000 (UTC)
From: "Bibin A Chundatt (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Commented] (YARN-4140) RM container allocation delayed in case of app submitted to Nodelabel partition

[ https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14949747#comment-14949747 ]

Bibin A Chundatt commented on YARN-4140:
----------------------------------------

Hi [~leftnoteasy]
Thanks for looking into it. The release audit warning is not related to the current patch:
{noformat}
/home/jenkins/jenkins-slave/workspace/PreCommit-YARN-Build/hadoop-hdfs-project/hadoop-hdfs-native-client/src/main/native/fuse-dfs/util/tree.h
Lines that start with ????? in the release audit report indicate files that do not have an Apache license header.
{noformat}
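As a side note for anyone reproducing the scheduler trace quoted below: the DEBUG output comes from raising the ResourceManager log level for the scheduler package. A minimal sketch of the log4j.properties change, assuming the stock Hadoop logging setup (this is not part of the patch):
{code}
# Hypothetical addition to $HADOOP_CONF_DIR/log4j.properties on the RM host;
# restart the ResourceManager afterwards so the level change takes effect.
log4j.logger.org.apache.hadoop.yarn.server.resourcemanager.scheduler=DEBUG
{code}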
> RM container allocation delayed in case of app submitted to Nodelabel partition
> --------------------------------------------------------------------------------
>
>                 Key: YARN-4140
>                 URL: https://issues.apache.org/jira/browse/YARN-4140
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, client, resourcemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch, 0003-YARN-4140.patch, 0004-YARN-4140.patch, 0005-YARN-4140.patch, 0006-YARN-4140.patch, 0007-YARN-4140.patch, 0008-YARN-4140.patch, 0009-YARN-4140.patch, 0010-YARN-4140.patch, 0011-YARN-4140.patch, 0012-YARN-4140.patch, 0013-YARN-4140.patch, 0014-YARN-4140.patch
>
>
> While running an application on a Nodelabel partition, I found that the application execution time was delayed by 5 to 10 minutes for 500 containers. The cluster had 3 machines in total, 2 of them in the same partition, and the app was submitted to that partition.
> After enabling debug logging I was able to find the following:
> # From the AM, the container ask is for OFF-SWITCH.
> # The RM allocates all containers as NODE_LOCAL, as shown in the logs below.
> # Since I had about 500 containers, it took about 6 minutes to allocate the first map after the AM allocation.
> # Tested with about 1K maps using a PI job; it took 17 minutes to allocate the next container after the AM allocation.
> Once the 500 NODE_LOCAL container allocations are done, the next container allocation is done as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: *, Relax Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> {code}
>
> {code}
> 2015-09-09 14:35:45,467 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> {code}
>
> {code}
> dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1> cat hadoop-dsperf-resourcemanager-host-127.log | grep "NODE_LOCAL" | grep "root.b.b1" | wc -l
> 500
> {code}
>
> (Consumes about 6 minutes)
>
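For context, the showRequests dump in the description corresponds to the usual per-node / per-rack / ANY ResourceRequest triple that an MR AM registers, with the partition label only on the ANY ("*") request. The following is a minimal, hypothetical sketch of how such an ask is built with the public YARN API; the container size is made up and only one of the two hosts is shown, so it is an illustration rather than code from the attached patches:
{code}
// Minimal sketch only; the priority, host names and container count mirror the
// showRequests dump above, everything else is hypothetical.
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class MapAskSketch {
  public static List<ResourceRequest> buildMapAsk() {
    Priority priority = Priority.newInstance(20);        // "Priority: 20" in the dump
    Resource capability = Resource.newInstance(1024, 1); // hypothetical container size
    int containers = 500;                                 // "# Containers: 500"

    // Node- and rack-level requests: no label expression, relax locality = true.
    ResourceRequest nodeAsk = ResourceRequest.newInstance(
        priority, "host-10-19-92-143", capability, containers, true);
    ResourceRequest rackAsk = ResourceRequest.newInstance(
        priority, "/default-rack", capability, containers, true);

    // ANY ("*") request: carries the partition label ("3" in the dump).
    ResourceRequest anyAsk = ResourceRequest.newInstance(
        priority, ResourceRequest.ANY, capability, containers, true, "3");

    return Arrays.asList(nodeAsk, rackAsk, anyAsk);
  }
}
{code}
With 500 outstanding containers on each request and relax locality enabled, this shape of ask is consistent with the behaviour described above, where roughly 500 NODE_LOCAL assignments are logged before the first OFF_SWITCH allocation.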