hadoop-common-user mailing list archives

From Rohith Sharma K S <rohithsharm...@huawei.com>
Subject RE: Pin Map/Reduce tasks to specific cores
Date Tue, 07 Apr 2015 03:53:51 GMT
Hi George

In MRv2, YARN supports CGroups. With CGroups, it is possible to run containers
on specific cores.

For more detail, here are some useful links:
http://dev.hortonworks.com.s3.amazonaws.com/HDPDocuments/HDP2/HDP-2-trunk/bk_system-admin-guide/content/ch_cgroups.html
http://blog.cloudera.com/blog/2013/12/managing-multiple-resources-in-hadoop-2-with-yarn/
http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/

P.S.: I could not find any related documentation in the Hadoop YARN docs. I will raise a ticket
for this in the community.

I hope the above information helps with your use case.

Thanks & Regards
Rohith Sharma K S

From: George Ioannidis [mailto:giorgioath@gmail.com]
Sent: 07 April 2015 01:55
To: user@hadoop.apache.org
Subject: Pin Map/Reduce tasks to specific cores

Hello. My question, which can also be found on Stack Overflow<http://stackoverflow.com/questions/29283213/core-affinity-of-map-tasks-in-hadoop>,
concerns pinning map/reduce tasks to specific cores, on either Hadoop v1.2.1 or Hadoop v2.
Specifically, I would like to know whether the end user has any control over which core executes
a specific map/reduce task.

To pin an application on Linux, there's the "taskset" command, but does Hadoop provide anything
similar? If not, is the Linux scheduler in charge of allocating tasks to specific cores?
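
For illustration, here is a minimal sketch of what "taskset"-style pinning looks like from inside a process, using Python's os.sched_setaffinity (Linux only). It is not a Hadoop API and it only affects the calling process; the helper name pin_current_process is made up for this example.

```python
import os

def pin_current_process(core_ids):
    """Restrict the calling process to the given set of CPU core ids (Linux only)."""
    # pid 0 means "the calling process", the same target as `taskset -p` on our own pid.
    os.sched_setaffinity(0, core_ids)
    # Read the mask back so the caller can confirm what the kernel accepted.
    return os.sched_getaffinity(0)

if __name__ == "__main__":
    allowed = os.sched_getaffinity(0)
    # Pick a core we are already allowed to use; asking for a forbidden one raises OSError.
    first = min(allowed)
    print("allowed before:", sorted(allowed))
    print("allowed after: ", sorted(pin_current_process({first})))
```

Within the allowed set, the Linux scheduler still decides when the process actually runs; affinity only limits where it may run.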

------------------
Below I am providing two cases to better illustrate my question:
Case #1: 2 GiB input size, an HDFS block size of 64 MiB, and 2 compute nodes available, with 32
cores each.
It follows that 32 map tasks will be launched; let's suppose that mapred.tasktracker.map.tasks.maximum
= 16, so 16 map tasks will be allocated to each node.
Can I guarantee that each map task will run on a specific core, or is it up to the Linux scheduler?

------------------

Case #2: The same as case #1, but now the input size is 8 GiB, so there are not enough slots
for all 128 map tasks, and multiple tasks will share the same cores.
Can I control how much "time" each task spends on a specific core, and whether it will be
reassigned to the same core in the future?
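
The task counts in the two cases can be checked with a short sketch. It assumes one map task per HDFS block and that tasks beyond the available slots wait for a free one; the function name map_task_plan and the term "waves" are introduced here for illustration only.

```python
MIB = 1024 ** 2
GIB = 1024 ** 3

def map_task_plan(input_bytes, block_bytes, nodes, map_slots_per_node):
    """Return (map tasks, total map slots, waves) for a job.

    Assumes one map task per HDFS block; map_slots_per_node stands in for
    mapred.tasktracker.map.tasks.maximum.
    """
    tasks = -(-input_bytes // block_bytes)   # ceiling division
    slots = nodes * map_slots_per_node
    waves = -(-tasks // slots)               # rounds of tasks needed to drain the queue
    return tasks, slots, waves

if __name__ == "__main__":
    print("case 1:", map_task_plan(2 * GIB, 64 * MIB, 2, 16))  # (32, 32, 1)
    print("case 2:", map_task_plan(8 * GIB, 64 * MIB, 2, 16))  # (128, 32, 4)
```

Under these assumptions, case #1 fits in a single wave of 32 tasks, while case #2 needs four waves on the same 32 slots.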
Any information on the above would be highly appreciated.
Kind Regards,
George