Date: Mon, 14 Sep 2015 05:57:46 +0000 (UTC)
From: "Bibin A Chundatt (JIRA)"
To: yarn-issues@hadoop.apache.org
Subject: [jira] [Updated] (YARN-4140) RM container allocation delayed in case of app submitted to Nodelabel partition

     [ https://issues.apache.org/jira/browse/YARN-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bibin A Chundatt updated YARN-4140:
-----------------------------------
    Attachment: 0002-YARN-4140.patch

> RM container allocation delayed in case of app submitted to Nodelabel partition
> -------------------------------------------------------------------------------
>
>                 Key: YARN-4140
>                 URL: https://issues.apache.org/jira/browse/YARN-4140
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, client, resourcemanager
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>         Attachments: 0001-YARN-4140.patch, 0002-YARN-4140.patch
>
>
> While running an application on a Nodelabel partition, I found that application execution is delayed by 5-10 minutes for 500 containers. The cluster had 3 machines in total, 2 of them in the same partition, and the application was submitted to that partition.
> After enabling debug logging I was able to find the following:
> # From the AM, the container ask is for OFF_SWITCH.
> # The RM allocates all containers as NODE_LOCAL, as shown in the logs below.
> # Since there were about 500 containers, it took about 6 minutes to allocate the first map container after AM allocation.
> # Tested with about 1K maps using a PI job; it took 17 minutes to allocate the next container after AM allocation.
> Only once all 500 NODE_LOCAL allocations are done is the next container allocated as OFF_SWITCH.
> {code}
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: /default-rack, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: *, Relax Locality: true, Node Label Expression: 3}
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: host-10-19-92-143, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.SchedulerApplicationAttempt: showRequests: application=application_1441791998224_0001 request={Priority: 20, Capability: , # Containers: 500, Location: host-10-19-92-117, Relax Locality: true, Node Label Expression: }
> 2015-09-09 15:21:58,954 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> {code}
>
> {code}
> 2015-09-09 14:35:45,467 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:45,831 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:46,469 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> 2015-09-09 14:35:46,832 DEBUG org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.ParentQueue: Assigned to queue: root.b.b1 stats: b1: capacity=1.0, absoluteCapacity=0.5, usedResources=, usedCapacity=0.0, absoluteUsedCapacity=0.0, numApps=1, numContainers=1 --> , NODE_LOCAL
> {code}
> {code}
> dsperf@host-127:/opt/bibin/dsperf/HAINSTALL/install/hadoop/resourcemanager/logs1> cat hadoop-dsperf-resourcemanager-host-127.log | grep "NODE_LOCAL" | grep "root.b.b1" | wc -l
> 500
> {code}
>
> (Consumes about 6 minutes)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
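As a rough sanity check on the timings reported in this issue, the per-container allocation latency implied by both runs works out to about one container per second, which is consistent with the RM handing out one NODE_LOCAL assignment per node heartbeat before falling back to OFF_SWITCH. This is a sketch using only the figures from the report (500 containers / ~6 min, ~1000 maps / 17 min); the heartbeat-interval comparison is my assumption, not something stated in the issue:

```python
# Per-container allocation latency implied by the numbers in this report.
# The container counts and durations come from the issue description; the
# ~1 s node heartbeat comparison is an assumption (the default of
# yarn.resourcemanager.nodemanagers.heartbeat-interval-ms is 1000 ms).

def per_container_seconds(containers: int, minutes: float) -> float:
    """Average seconds spent per container allocation."""
    return minutes * 60 / containers

run_500 = per_container_seconds(500, 6)     # 500 containers in ~6 minutes
run_1k = per_container_seconds(1000, 17)    # ~1000 maps in 17 minutes

# Both runs imply roughly one allocation per scheduling heartbeat,
# i.e. the delay scales linearly with the number of containers asked.
print(f"500-container run: {run_500:.2f} s/container")
print(f"1K-map run:        {run_1k:.2f} s/container")
```

If the mis-labeled NODE_LOCAL requests are the cause, the fix should make these figures collapse to near-immediate OFF_SWITCH allocation rather than one container per heartbeat.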