From: Smita Deshpande
To: "user@hadoop.apache.org"
Subject: RE: CPU usage of a container.
Date: Wed, 5 Nov 2014 12:40:02 +0000

Hi Naganarasimha,

Thank you so much for your help.

1. We are using Hadoop 2.5.0 and it's against the trunk code.

2. It seems that we were not setting the properties you mentioned, which is why the container was taking a larger CPU share than it was assigned. I have tried to set up YARN to use cgroups. While doing so, I am facing the following issues.

3. Do the container-executor and container-executor.cfg files need root permission? With another user, it threw a permission denied exception, and with the root user I am getting an invalid container-executor.cfg file exception in the NodeManager log:

Caused by: java.io.IOException: Linux container executor not configured properly (error=24)

 

Following is my container-executor.cfg file.

 

yarn.nodemanager.linux-container-executor.group=<user name with which I start nodemanager daemon>
banned.users=#comma separated list of users who can not run applications
min.user.id=1000#Prevent other super-users
allowed.system.users=##comma separated list of system users who CAN run applications
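
One thing worth checking in a file like the one above is whether text after a `#` on the same line as a key ends up as part of the value. The sketch below is a hypothetical helper, not part of Hadoop. It assumes that only lines starting with `#` are treated as comments, which is an assumption about the container-executor parser rather than something confirmed in this thread. It simply lists the keys whose values contain a `#`.

```python
def find_inline_comments(cfg_text):
    """Return (key, value) pairs whose value contains a '#' character."""
    suspicious = []
    for raw_line in cfg_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and whole-line comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        if "#" in value:
            suspicious.append((key.strip(), value.strip()))
    return suspicious


# The three commented lines from the file quoted above.
sample = """\
banned.users=#comma separated list of users who can not run applications
min.user.id=1000#Prevent other super-users
allowed.system.users=##comma separated list of system users who CAN run applications
"""

for key, value in find_inline_comments(sample):
    print(f"{key}: value contains '#': {value!r}")
```

If the assumption holds, moving each comment onto its own line would leave the keys with the intended (possibly empty) values.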

 

Am I missing some configuration-related settings? Thanks again for writing.

 

Regards,

Smita

 

From: Naganarasimha G R (Naga) [mailto:garlanaganarasimha@huawei.com]
Sent: Wednesday, November 05, 2014 11:42 AM
To: user@hadoop.apache.org
Subject: RE: CPU usage of a container.

 

Hi Smita,

Can you please provide the following information:

1. Which version of Hadoop are you using?

2. Is the Linux Container Executor configured with DRC and "CgroupsLCEResourcesHandler"?

3. If it's against the trunk code, have you configured "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage", which is false by default?

 

In general, cgroups do not strictly restrict CPU usage: only when all the CPU cores are in use do cgroups try to restrict a container's usage; otherwise, a container is allowed to use the CPU when it is free.
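
The behaviour described above can be illustrated with a toy model. This is only a sketch of work-conserving proportional sharing, not the kernel scheduler or YARN code. The function name, the share values and the capacity figures are made up for illustration.

```python
def share_cpu(capacity, shares, demands):
    """Split `capacity` (in % of one CPU) among containers.

    Each container is offered CPU in proportion to its share but never
    takes more than it demands; CPU left unused by one container is
    redistributed to the others (work-conserving behaviour).
    """
    alloc = {name: 0.0 for name in shares}
    active = {name for name in shares if demands[name] > 0}
    remaining = float(capacity)
    while active and remaining > 1e-9:
        total_shares = sum(shares[n] for n in active)
        satisfied = set()
        handed_out = 0.0
        for n in active:
            offer = remaining * shares[n] / total_shares
            take = min(offer, demands[n] - alloc[n])
            alloc[n] += take
            handed_out += take
            if alloc[n] >= demands[n] - 1e-9:
                satisfied.add(n)
        remaining -= handed_out
        if not satisfied:
            break  # every remaining container is limited by its share
        active -= satisfied
    return alloc


# Alone on a 4-CPU node (400%), a container gets everything it asks for.
print(share_cpu(400, {"c1": 1024}, {"c1": 350}))
# With a second, equally weighted busy container, each is held to half.
print(share_cpu(400, {"c1": 1024, "c2": 1024}, {"c1": 350, "c2": 350}))
```

In this model a lone container is never throttled, which matches the observation that usage can exceed the requested cores while the node is otherwise idle.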

Please refer to Chris Riccomini's comments in

https://issues.apache.org/jira/browse/YARN-600, which give a rough idea of how CPU isolation can be validated, and also to his blog post

http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups

which might help you understand cgroups and CPU isolation.



After YARN-2531, "yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage" is supported, so if you are using the Hadoop trunk code, you can restrict a single container's CPU usage.
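
On Linux, a hard CPU cap of this kind is typically expressed through CFS bandwidth control, where a cgroup may use at most quota/period CPUs. The sketch below shows only that arithmetic. How YARN derives the per-container limit in strict mode is not described in this thread, so the `cpu_limit` input and both helper names are assumptions for illustration.

```python
def cfs_quota_us(cpu_limit, period_us=100_000):
    """Return a cpu.cfs_quota_us value that caps a cgroup at `cpu_limit` CPUs."""
    if cpu_limit <= 0:
        raise ValueError("cpu_limit must be positive")
    return int(round(cpu_limit * period_us))


def max_percent(quota_us, period_us=100_000):
    """Upper bound on %CPU (100% = one logical CPU) under the given quota."""
    return 100.0 * quota_us / period_us


# Hypothetical cap of one logical CPU with a 100 ms period.
quota = cfs_quota_us(1)
print(quota, max_percent(quota))
```

With such a cap in place, a tool like htop should not show the container above the resulting percentage, even when other cores are idle.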



 

Regards,

Naga

 

Huawei Technologies Co., Ltd.
Phone:
Fax:
Mobile:  +91 9980040283
Email: naganarasimhagr@huawei.com
Huawei Technologies Co., Ltd.
Bantian, Longgang District,Shenzhen 518129, P.R.China
http://www.huawei.com

 


From: Smita Deshpande [smita.deshpande@cumulus-systems.com]
Sent: Wednesday, November 05, 2014 13:21
To: user@hadoop.apache.org
Subject: CPU usage of a container.

Hi All,

I am facing a somewhat weird issue in YARN. I am running a single container on a cluster whose CPU configuration is as follows:

NODEMANAGER1 : 4 cpu cores

NODEMANAGER2 : 4 cpu cores

NODEMANAGER3 : 16 cpu cores

All processors are hyperthreaded, so if I am using 1 CPU core, the maximum usage could be 200%.

When I run different numbers of threads in that container (basically a CPU-intensive calculation), it shows CPU usage higher than the number of cores allotted to it. Please refer to the table below for the different test cases. The values highlighted in red seem to have exceeded the allotted usage. I am using DominantResourceCalculator in CS.

PFA the screenshot for the same.

Any help would be appreciated.


Resource Ask   %cpu Usage (from htop command)   # of Threads launched in container
<1024,1>       176.8                            4
<1024,1>       108                              1
<1024,1>       177                              2
<1024,1>       291                              3
<1024,1>       342                              4
<1024,1>       337                              4 [container launched on NODEMANAGER3]
<1024,2>       177                              3
<1024,2>       182.6                            9
<1024,2>       336                              4 [container launched on NODEMANAGER3]
<1024,2>       189                              2 [container launched on NODEMANAGER2]
<1024,2>       291                              3
<1024,2>       337                              4
<1024,3>       283                              3
<1024,3>       329.7                            9
<1024,3>       343                              4 [container launched on NODEMANAGER3]
<1024,3>       122                              1
<1024,3>       216                              2
<1024,3>       290                              3
<1024,4>       289                              3
<1024,4>       123                              1
<1024,4>       217                              2
<1024,4>       292                              3
<1024,4>       338                              4
<1024,4>       177.3                            32
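
The measurements can also be checked mechanically against the expectation stated earlier in this message, namely at most 200% per allotted core on hyperthreaded processors. The sketch below uses the <1024,1> rows from the table. The function name and the `percent_per_core` parameter are illustrative, and the result depends entirely on that 200% assumption.

```python
def over_limit(rows, vcores, percent_per_core=200.0):
    """Return the (usage, threads) rows whose %CPU exceeds vcores * percent_per_core."""
    limit = vcores * percent_per_core
    return [(usage, threads) for usage, threads in rows if usage > limit]


# (%cpu usage from htop, number of threads) for the <1024,1> resource ask.
rows_1_vcore = [(176.8, 4), (108, 1), (177, 2), (291, 3), (342, 4), (337, 4)]

for usage, threads in over_limit(rows_1_vcore, vcores=1):
    print(f"{threads} threads: {usage}% is above the assumed limit")
```

Passing `percent_per_core=100.0` instead treats one vcore as one logical CPU, which flags more rows.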


Regards,

Smita
