giraph-user mailing list archives

From Arjun Sharma <as469...@gmail.com>
Subject Re: Number of workers vs number of threads
Date Mon, 13 Jul 2015 09:22:35 GMT
I am not measuring RAM or CPU usage. I am just measuring the overall time
the job takes to finish on a large input. For assigning RAM to the workers,
I am using the job parameters -Dmapreduce.map.memory.mb=9300
-Dmapreduce.map.java.opts="-Xms9G -Xmx9G" (I am running on YARN).
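
For reference, the per-machine arithmetic behind the two setups quoted below can be sketched as follows. This is only an illustration using the figures from this thread (12 cores per machine, 56 GB given to workers, 6 x 8 versus 1 x 48 threads); the helper name `per_machine_profile` is hypothetical and not part of Giraph or YARN.

```python
# Figures come from the thread; the function name is a hypothetical helper.
CORES_PER_MACHINE = 12
WORKER_RAM_GB_PER_MACHINE = 56


def per_machine_profile(workers_per_machine, threads_per_worker):
    """Summarize how one machine's threads and RAM are split across workers."""
    threads = workers_per_machine * threads_per_worker
    return {
        "threads_per_machine": threads,
        "threads_per_core": threads / CORES_PER_MACHINE,
        "ram_gb_per_worker": WORKER_RAM_GB_PER_MACHINE / workers_per_machine,
    }


setup1 = per_machine_profile(workers_per_machine=6, threads_per_worker=8)
setup2 = per_machine_profile(workers_per_machine=1, threads_per_worker=48)

for name, profile in (("Setup 1", setup1), ("Setup 2", setup2)):
    print(name, profile)
```

Both setups end up with the same number of threads per machine and the same total worker RAM per machine, so the difference lies in how that RAM and those threads are grouped into JVMs. The roughly 9.3 GB per worker in Setup 1 appears consistent with the -Xmx9G setting above. The sketch does not by itself explain the runtime gap.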

On Mon, Jul 13, 2015 at 2:05 AM, Sonja Koenig <sonja.koenig@uni-ulm.de>
wrote:

> Hi there!
>
> On a related matter:
> May I ask you how you perform your measurements? Especially for capturing
> RAM and CPU usage.
> I also want to do some performance tests and I would be thankful to hear
> how you succeeded on that issue ;)
>
> Regards,
> Sonja
>
>
> On 13.07.2015 at 10:56, Arjun Sharma wrote:
>
>> Hi,
>>
>> Many of the discussions on this forum suggest using one worker per
>> physical machine and increasing the number of threads per worker, rather
>> than using multiple workers per physical machine with fewer threads each.
>> This does not seem to be the case in my experiments.
>>
>> The cluster I am using has 12 physical machines (used exclusively for
>> workers), each with 64 GB of RAM and 12 cores. I experimented with two setups:
>>
>> Setup 1 runs 72 workers (i.e., 6 workers per machine), 72*72 partitions,
>> which is the default, and 8 threads per worker.
>>
>> Setup 2 tries to simulate Setup 1, but using threads instead of workers.
>> Therefore, it has 12 workers (1 worker per machine), 72*72 partitions
>> (using numUserPartitions), and, since the number of parallel tasks per
>> machine in Setup 1 is 6 workers * 8 threads, the numbers of compute,
>> input, and output threads are each set to 48.
>>
>> In both cases 56 GB of RAM is assigned equally to all workers on the
>> machine (either given to the 1 worker on that machine or divided among 6 of
>> them).
>>
>> In my case, Setup 1 performs significantly better (faster) than Setup 2,
>> which seems counterintuitive and disagrees with other suggestions to
>> use fewer workers and more threads. Is there anything
>> I am missing here? Is there any tuning or configuration parameter
>> that can make Setup 2 outperform Setup 1?
>>
>> Thanks!
>>
>
>
