I have obtained new results for a larger graph (undirected RMAT, 2^24 vertices, average degree 32). Here are
the results (computation time only):

Workers         2^20         2^24
1            274.155          n/a
2            103.843    11517.708
4             47.619      815.444
8             29.236      344.801
16            32.681      205.311
32            47.754      175.448

* For graph size 2^24, the single-worker run failed at superstep 3.
  I do not understand the reason yet.
* For graph size 2^24, the 2-worker run was unnaturally slow (4 hrs, 13 min, 35 sec). I monitored CPU utilization, and it was also very low, i.e. ~5-8% (during computation time).
  I have not found the reason yet. It looks like swapping, but I need to check it. I don't know whether Hadoop processes can swap if I have set mapred.child.java.opts to -Xmx32g
  (I have 80G per node in the cluster).
* Scalability of 2^32 is poor starting from 8 nodes.
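
To make the scaling in the table easier to compare, here is a minimal Python sketch that computes speedup and parallel efficiency from the reported times. The choice of baseline is my assumption: 1 worker for 2^20, and 4 workers for 2^24 (the 1-worker run failed and the 2-worker run looks anomalous). The function and variable names are illustrative only.

```python
# Computation times copied from the table above, keyed by worker count.
times_2_20 = {1: 274.155, 2: 103.843, 4: 47.619, 8: 29.236, 16: 32.681, 32: 47.754}
times_2_24 = {2: 11517.708, 4: 815.444, 8: 344.801, 16: 205.311, 32: 175.448}


def scaling(times, base_workers):
    """Return {workers: (speedup, efficiency)} relative to the base_workers run.

    Runs with fewer workers than the baseline are skipped.
    """
    base_time = times[base_workers]
    result = {}
    for workers in sorted(times):
        if workers < base_workers:
            continue
        speedup = base_time / times[workers]
        # Efficiency normalizes the speedup by the growth in worker count.
        efficiency = speedup / (workers / base_workers)
        result[workers] = (speedup, efficiency)
    return result


if __name__ == "__main__":
    for label, times, base in (("2^20", times_2_20, 1), ("2^24", times_2_24, 4)):
        print(f"graph {label}, baseline = {base} worker(s)")
        for workers, (s, e) in scaling(times, base).items():
            print(f"  {workers:>2} workers: speedup {s:6.2f}, efficiency {e:5.2f}")
```

An efficiency near 1.0 means close to linear scaling relative to the chosen baseline; values well below 1.0 show where adding workers stops paying off.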

Scalability plot and superstep time distribution plot are attached.

Next, I am going to try larger graphs (2^26) and multithreading. I am also going to increase the number of workers.


On Fri, Feb 7, 2014 at 8:48 PM, Alexander Frolov <alexndr.frolov@gmail.com> wrote:
I forgot to note that the time scale includes only computation time (i.e. the sum of superstep times).

Yes, this is not a big graph, I will come up with larger graphs soon.


On Fri, Feb 7, 2014 at 7:41 PM, Alexander Frolov <alexndr.frolov@gmail.com> wrote:
Undirected RMAT graph, generated by a tool extracted from Graph500. Size is 2^20 vertices, average degree is 32.
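
For anyone who wants to reproduce a similar input, here is a minimal Python sketch of R-MAT edge sampling. It assumes the Graph500 default quadrant probabilities (A=0.57, B=0.19, C=0.19, D=0.05) and an edge factor of 16, which gives an average degree of 32 when every edge is treated as undirected. The tool actually used for these runs may differ (for example, Graph500 also permutes vertex labels, which this sketch omits), and the function names are illustrative.

```python
import random


def rmat_edge(scale, rng, a=0.57, b=0.19, c=0.19):
    """Sample one (src, dst) pair from a 2^scale x 2^scale adjacency matrix.

    At each of the `scale` levels, one quadrant is chosen with
    probabilities a, b, c and d = 1 - a - b - c, adding one bit
    to the source and destination vertex ids.
    """
    src = dst = 0
    for _ in range(scale):
        r = rng.random()
        src <<= 1
        dst <<= 1
        if r < a:
            pass            # top-left quadrant: both bits stay 0
        elif r < a + b:
            dst |= 1        # top-right quadrant
        elif r < a + b + c:
            src |= 1        # bottom-left quadrant
        else:
            src |= 1        # bottom-right quadrant
            dst |= 1
    return src, dst


def rmat_graph(scale, edge_factor=16, seed=0):
    """Return a list of edge_factor * 2^scale sampled edges.

    Duplicates and self-loops are kept, as in a raw generated edge list.
    """
    rng = random.Random(seed)
    num_vertices = 1 << scale
    return [rmat_edge(scale, rng) for _ in range(edge_factor * num_vertices)]


if __name__ == "__main__":
    scale = 10  # small scale for a quick check; the runs above used 20 and 24
    edges = rmat_graph(scale)
    print(len(edges), "edges, average undirected degree",
          2 * len(edges) / (1 << scale))
```

Pure Python is far too slow for scale 20 or above; this is only meant to show the structure of the generator.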

On Fri, Feb 7, 2014 at 5:35 PM, Claudio Martella <claudio.martella@gmail.com> wrote:
looks like a very small graph. what's the size of the graph and the topology?

On Fri, Feb 7, 2014 at 3:11 PM, Alexander Frolov <alexndr.frolov@gmail.com> wrote:
Hi, team!

As you may have read in previous threads, I've started evaluating Giraph on an IB cluster. So here I want to share my results (in case they are useful to anybody) and ask for your ideas on further improving performance.

Test system:
* 8 Nodes, with dual Intel Xeon CPU E5-2630 (6 cores/CPU), 80GB
* Infiniband FDR Dual-Port 4x
* SUSE 11.2
* jdk1.7.0_51

At the moment I am running experiments with the SimpleShortestPathsComputation test on a generated RMAT graph. I attach a plot which shows the scalability of Giraph up to 32 workers.

As can be seen from the plot, scalability is almost linear up to 8 workers, and then (from 8 to 32) the speed does not increase. It seems strange to me that using additional cores on the nodes won't bring any improvement in execution time. Has anybody encountered such behaviour?

Next I am going to use threads instead of workers to utilize the cores. I am also going to switch to the Hadoop-RDMA project.

If anybody has any suggestions on how I can achieve maximum performance with Giraph on the cluster, I will be obliged to you ;-)

Hope for your feedback.


   Claudio Martella