hadoop-mapreduce-user mailing list archives

From: George Ioannidis <giorgio...@gmail.com>
Subject: Pin Map/Reduce tasks to specific cores
Date: Mon, 06 Apr 2015 20:25:24 GMT
Hello. My question, which can also be found on *Stack Overflow
<http://stackoverflow.com/questions/29283213/core-affinity-of-map-tasks-in-hadoop>*,
concerns pinning map/reduce tasks to specific cores, on either Hadoop
v1.2.1 or Hadoop v2.
Specifically, I would like to know whether the end user has any control
over which core executes a given map/reduce task.

To pin an application to cores on Linux there is the "taskset" command, but
does Hadoop provide anything similar? If not, is the Linux scheduler solely
in charge of assigning tasks to cores?
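To make it concrete, this is roughly the kind of control I have in mind (a
standalone sketch of the taskset approach outside Hadoop; the class name and
"worker.jar" are made-up placeholders):

    import java.io.IOException;

    // Standalone illustration only -- not a Hadoop API. Launch a child JVM under
    // taskset so the Linux scheduler keeps it on cores 0-3 of the local machine.
    public class PinExample {
        public static void main(String[] args) throws IOException, InterruptedException {
            Process child = new ProcessBuilder(
                    "taskset", "-c", "0-3",          // restrict the child to cores 0-3
                    "java", "-jar", "worker.jar")    // placeholder payload
                .inheritIO()
                .start();
            System.exit(child.waitFor());
        }
    }

For an already-running process the equivalent would be "taskset -pc 0-3 <pid>".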

------------------

Below are two cases that illustrate my question more concretely:

*Case #1:* 2 GiB of input, an HDFS block size of 64 MiB, and 2 compute nodes
available, with 32 cores each.
Consequently, 32 map tasks will be spawned; let's suppose that
mapred.tasktracker.map.tasks.maximum = 16, so 16 map tasks will be allocated
to each node.

Can I guarantee that each map task will run on a specific core, or is that up
to the Linux scheduler?
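(For completeness, a quick sanity check of the numbers I am assuming above;
plain Java, nothing Hadoop-specific, class name made up:)

    // Case #1 arithmetic: 2 GiB / 64 MiB blocks = 32 map tasks, spread over
    // 2 nodes with a limit of 16 concurrent map tasks per TaskTracker.
    public class Case1Numbers {
        public static void main(String[] args) {
            long inputBytes = 2L << 30;              // 2 GiB of input
            long blockBytes = 64L << 20;             // 64 MiB HDFS block size
            int nodes = 2;
            int maxMapSlotsPerNode = 16;             // mapred.tasktracker.map.tasks.maximum

            long mapTasks = inputBytes / blockBytes;                       // 32
            long perNode = Math.min(mapTasks / nodes, maxMapSlotsPerNode); // 16
            System.out.println(mapTasks + " map tasks, " + perNode + " concurrent per node");
        }
    }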

------------------

*Case #2:* The same as Case #1, but now the input size is 8 GiB, so there
are not enough slots for all 128 map tasks and multiple tasks will have to
share the same cores.
Can I control how much "time" each task will spend on a specific core, and
whether it will be reassigned to the same core later on?

Any information on the above would be highly appreciated.

Kind Regards,
George
