From: "Naganarasimha G R (Naga)"
To: Marcin Tustin, user@hadoop.apache.org
Subject: RE: Yarn with CapacityScheduler will only schedule two applications - open to consultants today
Date: Thu, 3 Mar 2016 04:21:20 +0000

Hi Marcin,

> The behaviour we're seeing is that even though we have 12 machines in our yarn cluster, yarn will only schedule two applications.

If this is the problem you are facing with the configuration below, then it is most likely YARN-3216 itself.
As far as I know, the fix is not yet in any released Hadoop version (not sure about HDP), and I am not aware of version 2.7.1.2.3 (it may be an HDP version number).

IIUC, you would have submitted to the two different partitions, "data" and "yarn", hence two applications ran; for a single partition, only one would have run (if the AM resource is less than the minimum size, the CapacityScheduler allows at least one AM to run).

Given that you have already identified the issue, what more are you expecting?

Regards,
+ Naga

________________________________
From: Marcin Tustin [mtustin@handybook.com]
Sent: Wednesday, March 02, 2016 20:09
To: user@hadoop.apache.org
Subject: Yarn with CapacityScheduler will only schedule two applications - open to consultants today

Hi All,

We're hitting this issue. If you're a consultant with capacity today (2 March 2016 EST in New York), please feel free to contact me on or off list.

In terms of stack, we're using Yarn 2.7.1.2.3 from the latest Hortonworks distribution. It's possible we're hitting this bug: https://issues.apache.org/jira/browse/YARN-3216

The behaviour we're seeing is that even though we have 12 machines in our yarn cluster, yarn will only schedule two applications. The breakdown of machines is:

10 machines with the data label
2 with the yarn label

We have three queues: interactive, noninteractive, and default. We're expecting to split capacity in the data label 20%/80%, and default gets 100% of the yarn label.
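The AM-limit behaviour described in the reply above (a queue's AM budget is its share of the cluster scaled by maximum-am-resource-percent, with a floor of one AM so the queue can always make progress) can be sketched with a simplified model. This is not the CapacityScheduler's actual code, and the numbers are hypothetical:

```python
def max_concurrent_ams(cluster_mem_mb, queue_capacity_pct,
                       max_am_resource_pct, am_mem_mb):
    """Simplified model of how the CapacityScheduler caps the number
    of concurrently running ApplicationMasters in a queue.

    The AM budget is the queue's share of cluster memory, scaled by
    maximum-am-resource-percent. Even if that budget is smaller than
    a single AM, the scheduler still admits at least one AM.
    """
    queue_mem = cluster_mem_mb * queue_capacity_pct / 100.0
    am_budget = queue_mem * max_am_resource_pct / 100.0
    return max(1, int(am_budget // am_mem_mb))

# Hypothetical numbers: 12 nodes x 8 GB, a queue at 10% capacity with a
# 10% AM budget, and 2 GB AMs. The budget (~983 MB) is below one AM,
# but the floor still admits one AM per partition.
print(max_concurrent_ams(12 * 8192, 10, 10, 2048))  # -> 1
```

This is consistent with the symptom in the thread: with one AM admitted per partition and two partitions ("data" and "yarn"), exactly two applications run at once.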
We have the following capacity scheduler config (key=value format taken from Ambari):

yarn.scheduler.capacity.maximum-am-resource-percent=100
yarn.scheduler.capacity.maximum-applications=10000
yarn.scheduler.capacity.node-locality-delay=40
yarn.scheduler.capacity.queue-mappings-override.enable=true
yarn.scheduler.capacity.root.accessible-node-labels=*
yarn.scheduler.capacity.root.accessible-node-labels.data.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.data.maximum-capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.yarn.capacity=100
yarn.scheduler.capacity.root.accessible-node-labels.yarn.maximum-capacity=100
yarn.scheduler.capacity.root.acl_administer_queue=*
yarn.scheduler.capacity.root.capacity=100
yarn.scheduler.capacity.root.default-node-label-expression=data
yarn.scheduler.capacity.root.default.accessible-node-labels=yarn
yarn.scheduler.capacity.root.default.accessible-node-labels.yarn.capacity=100
yarn.scheduler.capacity.root.default.accessible-node-labels.yarn.maximum-capacity=100
yarn.scheduler.capacity.root.default.acl_submit_applications=*
yarn.scheduler.capacity.root.default.capacity=50
yarn.scheduler.capacity.root.default.default-node-label-expression=yarn
yarn.scheduler.capacity.root.default.maximum-am-resource-percent=80
yarn.scheduler.capacity.root.default.maximum-capacity=100
yarn.scheduler.capacity.root.default.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.default.ordering-policy=fair
yarn.scheduler.capacity.root.default.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.default.state=RUNNING
yarn.scheduler.capacity.root.default.user-limit-factor=100
yarn.scheduler.capacity.root.interactive.accessible-node-labels=data
yarn.scheduler.capacity.root.interactive.accessible-node-labels.data.capacity=20
yarn.scheduler.capacity.root.interactive.accessible-node-labels.data.maximum-capacity=100
yarn.scheduler.capacity.root.interactive.acl_administer_queue=*
yarn.scheduler.capacity.root.interactive.acl_submit_applications=*
yarn.scheduler.capacity.root.interactive.capacity=10
yarn.scheduler.capacity.root.interactive.maximum-am-resource-percent=50
yarn.scheduler.capacity.root.interactive.maximum-applications=2000
yarn.scheduler.capacity.root.interactive.maximum-capacity=100
yarn.scheduler.capacity.root.interactive.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.interactive.ordering-policy=fifo
yarn.scheduler.capacity.root.interactive.state=RUNNING
yarn.scheduler.capacity.root.interactive.user-limit-factor=100
yarn.scheduler.capacity.root.maximum-am-resource-percent=80
yarn.scheduler.capacity.root.maximum-capacity=100
yarn.scheduler.capacity.root.noninteractive.accessible-node-labels=data
yarn.scheduler.capacity.root.noninteractive.accessible-node-labels.data.capacity=80
yarn.scheduler.capacity.root.noninteractive.accessible-node-labels.data.maximum-am-resource-percent=80
yarn.scheduler.capacity.root.noninteractive.accessible-node-labels.data.maximum-capacity=80
yarn.scheduler.capacity.root.noninteractive.acl_submit_applications=*
yarn.scheduler.capacity.root.noninteractive.capacity=40
yarn.scheduler.capacity.root.noninteractive.default-node-label-expression=data
yarn.scheduler.capacity.root.noninteractive.maximum-am-resource-percent=80
yarn.scheduler.capacity.root.noninteractive.maximum-applications=8000
yarn.scheduler.capacity.root.noninteractive.maximum-capacity=100
yarn.scheduler.capacity.root.noninteractive.minimum-user-limit-percent=100
yarn.scheduler.capacity.root.noninteractive.ordering-policy=fair
yarn.scheduler.capacity.root.noninteractive.ordering-policy.fair.enable-size-based-weight=false
yarn.scheduler.capacity.root.noninteractive.state=RUNNING
yarn.scheduler.capacity.root.noninteractive.user-limit-factor=100
yarn.scheduler.capacity.root.queues=default,interactive,noninteractive
yarn.scheduler.capacity.root.user-limit-factor=40

Also, yarn.resourcemanager.scheduler.class is org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler.

Any suggestions gratefully received.

Marcin
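One sanity check worth automating for a config like the above: at each level of the queue hierarchy, sibling capacities must sum to 100 for the default partition and for every node-label partition separately. The snippet below is a hypothetical helper (not part of YARN or Ambari) applied to the root-level values from this thread's config; queues that cannot access a label are treated as contributing 0 to that partition:

```python
# Sibling queue capacities under root, taken from the config above.
# The default partition uses <queue>.capacity; each label partition
# uses <queue>.accessible-node-labels.<label>.capacity.
capacities = {
    "":     {"default": 50,  "interactive": 10, "noninteractive": 40},
    "data": {"default": 0,   "interactive": 20, "noninteractive": 80},
    "yarn": {"default": 100, "interactive": 0,  "noninteractive": 0},
}

def check_partitions(caps):
    """Return the partitions whose sibling capacities do not sum to 100."""
    return [label for label, queues in caps.items()
            if sum(queues.values()) != 100]

print(check_partitions(capacities))  # -> [] (these values sum correctly)
```

Since the capacities here do sum correctly per partition, a check like this would point away from a capacity-sum misconfiguration and toward the AM-limit bug (YARN-3216) discussed above.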