giraph-user mailing list archives

From Steven Harenberg <sdhar...@ncsu.edu>
Subject Optimal configuration for Giraph on YARN
Date Wed, 18 Mar 2015 18:48:06 GMT
Hi all,

Previously, with MapReduceV1, the suggestion was to have a 1:1
correspondence between workers and compute nodes (machines) and to set the
number of threads to the number of cores per machine. To achieve
this configuration, we would set "mapred.tasktracker.map.tasks.maximum=1".
Since workers correspond to mappers, this would ensure there was one worker
per machine.

Now I am reading that with YARN this property no longer exists, as there are
no tasktrackers. Instead, we have the global property
"yarn.nodemanager.resource.cpu-vcores", which specifies the cores _per
node_, and the property "mapreduce.map.cpu.vcores", which specifies the
cores _per map task_.

If we want to have one mapper per node that fully utilizes the machine,
I assume we should just set mapreduce.map.cpu.vcores =
yarn.nodemanager.resource.cpu-vcores = the number of cores per node. Is this
correct?
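
To make the arithmetic behind this assumption concrete, here is a minimal sketch. It assumes the scheduler actually counts vcores when placing containers and it ignores memory limits, which may also constrain placement. The function name and the 16-core node are hypothetical, chosen only for illustration.

```python
def map_containers_per_node(node_vcores: int, map_vcores: int) -> int:
    """Upper bound on concurrent map containers per node, by vcores alone.

    node_vcores stands in for yarn.nodemanager.resource.cpu-vcores,
    map_vcores stands in for mapreduce.map.cpu.vcores.
    Assumption: vcores are the only limiting resource.
    """
    if node_vcores <= 0 or map_vcores <= 0:
        raise ValueError("vcore counts must be positive")
    # Each map container reserves map_vcores out of the node's total.
    return node_vcores // map_vcores


cores_per_node = 16  # hypothetical machine size

# Both properties set to the core count: at most one mapper fits per node.
print(map_containers_per_node(cores_per_node, cores_per_node))

# One vcore per map task: up to one mapper per core.
print(map_containers_per_node(cores_per_node, 1))
```

Under these assumptions, setting both properties to the same value leaves room for exactly one map container per node, which is the 1:1 worker-to-machine layout described above.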

Do I still need to set giraph.numComputeThreads to be the number of cores
per node?

Thanks,
Steve
