giraph-user mailing list archives

From Sonja Koenig
Subject Re: Number of workers vs number of threads
Date Mon, 13 Jul 2015 09:05:05 GMT
Hi there!

On a related matter:
May I ask how you take your measurements, especially for capturing 
RAM and CPU usage?
I would also like to run some performance tests, and I would be grateful 
to hear how you approached this ;)
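
As a starting point only, here is a minimal sketch of in-JVM sampling using the standard java.lang.management beans. It is plain JDK API, not anything Giraph-specific. The class name ResourceSampler and its method names are made up for illustration, and the sketch assumes the sampler runs inside the JVM being measured. Heap usage is only part of a process's RAM, and the load average covers the whole machine, so a real test would likely need OS-level tools as well.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;
import java.lang.management.OperatingSystemMXBean;

// Hypothetical helper, not part of Giraph: samples this JVM's heap usage
// and the machine's load average through standard JDK management beans.
public class ResourceSampler {

    /** Bytes of heap currently in use by this JVM. */
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /**
     * System load average over the last minute. The JDK returns a negative
     * value on platforms where this figure is not available.
     */
    static double systemLoadAverage() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        return os.getSystemLoadAverage();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take three samples, one second apart (assumed interval).
        for (int i = 0; i < 3; i++) {
            System.out.printf("heapUsedMB=%d loadAvg=%.2f%n",
                    usedHeapBytes() / (1024 * 1024), systemLoadAverage());
            Thread.sleep(1000);
        }
    }
}
```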


On 13.07.2015 at 10:56, Arjun Sharma wrote:
> Hi,
> Many of the discussions on this forum suggest using one worker per 
> physical machine and increasing the number of threads per worker, 
> rather than using multiple workers per physical machine with fewer 
> threads each. My experiments do not seem to support this.
> The cluster I am using has 12 physical machines (used exclusively for 
> workers), each with 64 GB of RAM and 12 cores. I experimented with two setups:
> Setup 1 runs 72 workers (i.e., 6 workers per machine), 72*72 
> partitions, which is the default, and 8 threads per worker.
> Setup 2 tries to simulate Setup 1, but using threads instead of 
> workers. It therefore has 12 workers (1 worker per machine) and 72*72 
> partitions (using numUserPartitions). Since the number of parallel 
> tasks per machine in Setup 1 is 6 workers * 8 threads, the numbers 
> of compute, input, and output threads are set to 48.
> In both cases 56 GB of RAM is assigned equally to all workers on the 
> machine (either given to the 1 worker on that machine or divided among 
> 6 of them).
> In my case, Setup 1 performs significantly better (faster) than Setup 
> 2, which sounds counterintuitive and disagrees with the other 
> suggestions to use fewer workers and more threads. 
> Is there anything I am missing here? Is there any 
> tuning or configuration parameter setting that can make Setup 2 
> outperform Setup 1?
> Thanks!
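
The arithmetic behind the two quoted setups can be sketched as follows. The numbers (12 cores and 56 GB of worker RAM per machine, 6 workers with 8 threads versus 1 worker with 48 threads) come from the message above. The class and method names are hypothetical, and the sketch assumes that "8 threads per worker" in Setup 1 means compute threads. It only shows that both setups have the same thread count per machine, while the RAM per worker differs.

```java
// Hypothetical helper for illustration: compares the per-machine
// parallelism and per-worker RAM of the two setups described above.
public class SetupArithmetic {
    static final int CORES_PER_MACHINE = 12;         // from the message
    static final int WORKER_RAM_GB_PER_MACHINE = 56; // from the message

    /** Total worker threads running on one machine. */
    static int threadsPerMachine(int workersPerMachine, int threadsPerWorker) {
        return workersPerMachine * threadsPerWorker;
    }

    /** RAM per worker when the machine's 56 GB is split equally. */
    static double ramPerWorkerGb(int workersPerMachine) {
        return (double) WORKER_RAM_GB_PER_MACHINE / workersPerMachine;
    }

    /** Threads per physical core on one machine. */
    static double threadsPerCore(int threadsPerMachine) {
        return (double) threadsPerMachine / CORES_PER_MACHINE;
    }

    static void report(String name, int workersPerMachine, int threadsPerWorker) {
        int threads = threadsPerMachine(workersPerMachine, threadsPerWorker);
        System.out.printf("%s: threads/machine=%d, threads/core=%.1f, RAM/worker=%.1f GB%n",
                name, threads, threadsPerCore(threads), ramPerWorkerGb(workersPerMachine));
    }

    public static void main(String[] args) {
        report("Setup 1", 6, 8);   // 6 workers per machine, 8 threads each
        report("Setup 2", 1, 48);  // 1 worker per machine, 48 threads
    }
}
```

Both setups schedule 48 threads on 12 cores, so the thread count alone does not separate them. What differs is that Setup 2 runs all of those threads inside a single JVM with one 56 GB heap.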
