From: George Ioannidis <giorgioath@gmail.com>
Date: Thu, 16 Apr 2015 20:03:14 +0200
Subject: Re: Pin Map/Reduce tasks to specific cores
To: user@hadoop.apache.org

Dear Rohith and Naga,

Thank you very much for your quick responses; your information has proven very useful.

Cheers,
George

On 7 April 2015 at 07:08, Naganarasimha G R (Naga) <garlanaganarasimha@huawei.com> wrote:

> Hi George,
>
> The current implementation in YARN uses cgroups and supports CPU isolation, but not by pinning to specific cores (cgroup cpusets); it works on CPU cycles instead (quota and period).
> The admin can specify what percentage of the CPU may be used by YARN containers, and YARN takes care of configuring the cgroup quota and period files, ensuring that YARN containers use no more than the configured CPU percentage.
>
> Is there any particular need to pin the MR tasks to specific cores, or do you just want to ensure YARN is not using more than the specified percentage of CPU on a given node?
>
> Regards,
> Naga
>
> ------------------------------
> *From:* Rohith Sharma K S [rohithsharmaks@huawei.com]
> *Sent:* Tuesday, April 07, 2015 09:23
> *To:* user@hadoop.apache.org
> *Subject:* RE: Pin Map/Reduce tasks to specific cores
>
> Hi George,
>
> In MRv2, YARN supports a cgroups implementation. Using cgroups, it is possible to run containers on specific cores.
>
> For detailed reference, some useful links:
>
> http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_system-admin-guide/content/ch_cgroups.html
> http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/
> http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/
>
> P.S.: I could not find any related document in the Hadoop YARN docs; I will raise a ticket for this in the community.
>
> Hope the above information helps your use case!
>
> Thanks & Regards,
> Rohith Sharma K S
>
> *From:* George Ioannidis [mailto:giorgioath@gmail.com]
> *Sent:* 07 April 2015 01:55
> *To:* user@hadoop.apache.org
> *Subject:* Pin Map/Reduce tasks to specific cores
>
> Hello. My question, which can also be found on Stack Overflow, concerns pinning map/reduce tasks to specific cores, on either Hadoop v1.2.1 or Hadoop v2.
>
> Specifically, I would like to know whether the end user has any control over which core executes a given map/reduce task.
>
> To pin an application on Linux there is the "taskset" command, but does Hadoop provide anything similar? If not, is the Linux scheduler in charge of allocating tasks to specific cores?
>
> ------------------
>
> Below are two cases to better illustrate my question:
>
> *Case #1:* 2 GiB input size, an HDFS block size of 64 MiB, and 2 compute nodes available, with 32 cores each.
>
> As a result, 32 map tasks will be launched; suppose mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be allocated to each node.
>
> Can I guarantee that each map task will run on a specific core, or is that up to the Linux scheduler?
>
> ------------------
>
> *Case #2:* The same as case #1, but now the input size is 8 GiB, so there are not enough slots for all 128 map tasks, and multiple tasks will share the same cores.
>
> Can I control how much "time" each task will spend on a specific core, and whether it will be reassigned to the same core in the future?
>
> Any information on the above would be highly appreciated.
>
> Kind Regards,
> George
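[Editor's note: the CPU-percentage enforcement Naga describes is configured on the NodeManager in yarn-site.xml. A minimal sketch, assuming Hadoop 2.x with the LinuxContainerExecutor; the 80% value is purely illustrative:]

```xml
<!-- Sketch: yarn-site.xml cgroups CPU settings (Hadoop 2.x, illustrative values). -->
<property>
  <!-- Use the cgroups-based resource handler of the LinuxContainerExecutor. -->
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <!-- Cap all YARN containers combined at 80% of the node's physical CPU. -->
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>
<property>
  <!-- Enforce the cap via cgroup quota/period at all times,
       not only when there is contention from other processes. -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.strict-resource-usage</name>
  <value>true</value>
</property>
```

[This matches the quota/period mechanism in Naga's reply: YARN limits how much CPU time containers get, but does not bind them to particular cores.]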
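[Editor's note: Hadoop itself exposes no taskset-style knob, but for completeness, the OS-level pinning George asks about can be done from Linux. A minimal sketch using Python's standard-library wrappers around the Linux sched_setaffinity(2) syscall, which is what `taskset` uses under the hood; the core number 0 is illustrative:]

```python
import os

# Inspect which cores this process may currently run on
# (the affinity mask that `taskset -p <pid>` would print).
allowed = os.sched_getaffinity(0)  # 0 means "the calling process"
print("allowed cores before pinning:", sorted(allowed))

# Pin the calling process to core 0 only, like `taskset -cp 0 <pid>`.
# Child processes forked afterwards inherit this affinity mask.
os.sched_setaffinity(0, {0})
print("allowed cores after pinning:", sorted(os.sched_getaffinity(0)))
```

[Note this is Linux-only and per-process: it does not integrate with Hadoop's task scheduling, which is why the cgroups quota/period mechanism described in the replies is what YARN actually offers.]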