Thanks for the info! I just looked into this and it is a nice feature. However, ideally I want to make external calls to my script so that I obtain metrics in a standard way while comparing multiple systems. In the other systems, these phases were much clearer, so the implementation was quite simple.

After looking at the Giraph source further, I do think is the file I want. I now understand that superstep -1 is the input phase [1], which can be benchmarked in Similarly, the entire vertex computation phase can be benchmarked here too.


In general, have you had a look at the giraph.metrics.enable option? It prints out metrics after each superstep to each worker's system.out log.

Hi all,

I am attempting to measure some performance metrics (such as runtime, memory usage, network communication, etc.) using an external bash script that grabs some machine stats.

I am having difficulty figuring out where to call this script from within Giraph. In particular, I would like to call it at several key points in Giraph's execution, such as input/setup, the beginning of computation, and output. The problem is that I can't tell where these "phases" actually happen in Giraph's source, so I don't know where to place the external calls.

I also have the added constraint that the external script should be called once per machine/worker, not once per thread. That means it should not be inside the vertex computation code, for example.
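
To make the once-per-worker constraint concrete, here is a minimal sketch of a guard that lets only the first caller in a JVM launch the script for a given phase, no matter how many threads reach that point. It assumes one worker JVM per machine (with several workers per machine, the script would run once per worker). `PhaseHook` and `runOncePerJvm` are hypothetical names and are not part of Giraph; where to call the method from inside Giraph is exactly the open question here.

```java
import java.io.IOException;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical helper (not a Giraph class): runs a command at most once per phase per JVM. */
public final class PhaseHook {
    // Phases already claimed in this JVM; shared by all threads.
    private static final Set<String> CLAIMED = ConcurrentHashMap.newKeySet();

    private PhaseHook() { }

    /**
     * Runs the command if no earlier call in this JVM has claimed the phase.
     * Returns true if this call ran the command, false if it was skipped.
     */
    public static boolean runOncePerJvm(String phase, List<String> command)
            throws IOException, InterruptedException {
        // add() on a concurrent set is atomic, so exactly one thread gets 'true' per phase.
        if (!CLAIMED.add(phase)) {
            return false;
        }
        Process process = new ProcessBuilder(command)
                .inheritIO()   // let the script write to the worker's stdout/stderr
                .start();
        process.waitFor();     // block until the script has finished collecting stats
        return true;
    }
}
```

A call would look something like `PhaseHook.runOncePerJvm("input", Arrays.asList("/path/to/stats.sh", "input"))`, where the script path and its argument are placeholders.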

Summary: my goal is to call an external script once per machine at the beginning of setup, computation (at/before superstep 0), and output.
  1. Is this possible?
  2. If so, could anyone please point me to where these phases happen in the source, so that I can place the external calls there? I am guessing this would be the MasterThread file, since that is where all the GiraphTimers are used.
  3. Any general advice would be appreciated.

Thanks and regards,