Hello Giraph User Community,


(I am re-posting this question because I think I first sent it before my list registration was confirmed. Please pardon the duplicate if the original also came through.)


This is my first post to this mailing list. I’m interested in learning more about Giraph, so I checked out the latest source code from https://svn.apache.org/repos/asf/giraph/trunk and built it with Maven.


I am now running the ShortestPathsBenchmark example that ships with Giraph and have a few “high-level” questions.

For the sake of this discussion, I am running the example with the following arguments:


hadoop jar giraph.jar org.apache.giraph.benchmark.ShortestPathsBenchmark -c 1 -e 3 -v -V 50000 -w 4


The example takes about 90 seconds to complete on my 4-node Hadoop cluster, and I don’t see any errors or issues.


1.      In computing a Dijkstra shortest path, we are looking for the shortest path from one node to another.  What does ShortestPathsBenchmark use as the “starting” node?  The “ending” node?

2.      What edge weights are being used?  The arguments don’t allow me to specify them.

3.      Does ShortestPathsBenchmark produce any output data inside HDFS upon completion of this example, or is the example purely meant to visually illustrate processing time on my cluster?

4.      Can I feed ShortestPathsBenchmark my own graph?

5.      In the example above, I have specified 3 edges per vertex.  If I were to specify only 2 edges per vertex, wouldn’t I effectively be dealing with a graph that closely resembles a “linked list”?  When I set -e 2, the processing time is still roughly comparable to -e 3.  Shouldn’t the simpler graph be noticeably faster to process?
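
To make question 5 more concrete, here is a small standalone Python sketch (not Giraph code) of the two graph shapes I have in mind. It compares a “linked list” style chain with a graph where every vertex gets 2 edges to pseudo-random destinations, and reports how many hops the farthest reachable vertex is from the source. I have not checked how the benchmark actually chooses edge destinations, so the random choice is purely my assumption, and the names chain_graph, random_graph, and max_hops are made up for illustration.

```python
import random
from collections import deque


def chain_graph(n):
    """'Linked list' shape: vertex i points to i-1 and i+1 where they exist."""
    return {i: [j for j in (i - 1, i + 1) if 0 <= j < n] for i in range(n)}


def random_graph(n, edges_per_vertex, seed=0):
    """Each vertex gets a fixed number of out-edges to pseudo-random vertices.

    ASSUMPTION: this is only a guess at what a generated benchmark graph
    could look like; it is not taken from the Giraph source.
    """
    rng = random.Random(seed)
    return {
        i: [rng.randrange(n) for _ in range(edges_per_vertex)]
        for i in range(n)
    }


def max_hops(graph, source):
    """Breadth-first search; returns the largest hop count from source
    among the vertices that are reachable from it."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return max(dist.values())


if __name__ == "__main__":
    n = 1000
    print("chain, farthest vertex in hops:", max_hops(chain_graph(n), 0))
    print("random with 2 edges, farthest vertex in hops:",
          max_hops(random_graph(n, 2), 0))
```

If the generated graph looks more like the second shape than the first, that might explain why -e 2 and -e 3 take similar time, but I would appreciate confirmation from someone who knows the benchmark.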


I have seen the ShortestPathExample @



and I was planning on working through that example as well, but I thought I’d ask about the benchmarking example first.





Bence Magyar

BAE Systems 6 New England Executive Park, Burlington MA 01803 USA

Office: +1 (781) 262-4222

Mobile: +1 (781) 879-7557