giraph-user mailing list archives

From Arjun Sharma <as469...@gmail.com>
Subject Re: Optimal configuration for Giraph on YARN
Date Fri, 24 Apr 2015 00:15:06 GMT
Just bumping this thread, as I have the same question as Steven.

Steven, did you find out whether setting both mapreduce.map.cpu.vcores and
yarn.nodemanager.resource.cpu-vcores is required? What happens if they are
not set, while giraph.numComputeThreads is set? Are there any
other parameters that must be set to make sure we are *really*
using all the cores, not just multi-threading on a single core?


On Wed, Mar 18, 2015 at 11:48 AM, Steven Harenberg <sdharenb@ncsu.edu>
wrote:

> Hi all,
>
> Previously with MapReduceV1, the suggestion was to have a 1:1
> correspondence between workers and compute nodes (machines) and set the
> number of threads to be the number of cores per machine. To achieve
> this configuration, we would set "mapred.tasktracker.map.tasks.maximum=1".
> Since workers correspond to mappers this would ensure there was one worker
> per machine.
>
> Now I am reading that with YARN this property no longer exists as there
> aren't tasktrackers. Instead, we have the global properties
> "yarn.nodemanager.resource.cpu-vcores", which specifies the cores _per
> node_, and the property "mapreduce.map.cpu.vcores", which specifies the
> cores _per map task_.
>
> If we want to have one mapper per node that is fully utilizing the
> machine, I assume we should just set mapreduce.map.cpu.vcores =
> yarn.nodemanager.resource.cpu-vcores = the # of cores per node. Is this
> correct?
>
> Do I still need to set giraph.numComputeThreads to be the number of cores
> per node?
>
> Thanks,
> Steve
>
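
For reference, the setup Steven proposes can be written out as a small sketch.
This only restates the question in code and does not settle whether these
three properties are sufficient. The function names are hypothetical, and the
container arithmetic assumes the YARN scheduler actually counts vcores when
placing containers, which depends on how the scheduler is configured.

```python
def one_worker_per_node_settings(cores_per_node: int) -> dict[str, str]:
    """Build the property values Steven describes for one worker per node.

    Hypothetical helper: it only mirrors the proposal in this thread and
    does not claim the result is a complete or verified configuration.
    """
    if cores_per_node < 1:
        raise ValueError("cores_per_node must be at least 1")
    n = str(cores_per_node)
    return {
        # vcores each NodeManager offers (cores per node)
        "yarn.nodemanager.resource.cpu-vcores": n,
        # vcores requested by each map task (one Giraph worker per mapper)
        "mapreduce.map.cpu.vcores": n,
        # compute threads inside each Giraph worker
        "giraph.numComputeThreads": n,
    }


def max_map_tasks_per_node(node_vcores: int, map_vcores: int) -> int:
    """Upper bound on concurrent map tasks per node by vcores alone.

    Assumption: the scheduler takes vcores into account; memory limits
    and other constraints are ignored here.
    """
    if node_vcores < 1 or map_vcores < 1:
        raise ValueError("vcore counts must be at least 1")
    return node_vcores // map_vcores


if __name__ == "__main__":
    for key, value in one_worker_per_node_settings(16).items():
        print(f"{key}={value}")
```

With mapreduce.map.cpu.vcores equal to yarn.nodemanager.resource.cpu-vcores,
the second function gives one map task per node, which is the 1:1
worker-to-machine layout that the MapReduceV1 advice aimed for.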
