giraph-user mailing list archives

From Pavan Kumar A <>
Subject RE: input superstep of giraph.
Date Fri, 18 Apr 2014 18:32:40 GMT
Btw, the jobs we run typically run for hours, so for us total time is mostly just the sum of
input + supersteps, since the little extra time is negligible. However, I see your job itself
is so small, so to be more accurate: total time = time between the start of your job (once all
machines were allocated) and the end of your job (when you can see in the logs that all workers
are done with the last superstep).

Subject: RE: input superstep of giraph.
Date: Fri, 18 Apr 2014 23:58:53 +0530

Please take a look at GIRAPH-838. Note that there is a little window between the end of one
superstep and the start of the next. So, this 120 s can be accounted for by that. But what I
meant was that total time is as good as the sum of input + other supersteps (though only
approximately, because of this slight extra time).

Date: Fri, 18 Apr 2014 16:28:48 +0100
Subject: Re: input superstep of giraph.

Hi Pavan, 
I might have misunderstood your explanation. But from the giraph timers I received, Total
does not seem to be:

 sum of input time + sum of time in all supersteps

For example the following timers were outputted after I ran the ConnectedComponents algorithm:

    Giraph Timers
        Initialize (ms)=424
        Input superstep (ms)=1457
        Setup (ms)=85
        Shutdown (ms)=11666
        Superstep 0 ConnectedComponentsComputation (ms)=903
        Superstep 1 ConnectedComponentsComputation
        Superstep 10 ConnectedComponentsComputation (ms)=475
        Superstep 11 ConnectedComponentsComputation
        Superstep 12 ConnectedComponentsComputation (ms)=342
        Superstep 2 ConnectedComponentsComputation
        Superstep 3 ConnectedComponentsComputation (ms)=1399
        Superstep 4 ConnectedComponentsComputation
        Superstep 5 ConnectedComponentsComputation (ms)=591
        Superstep 6 ConnectedComponentsComputation
        Superstep 7 ConnectedComponentsComputation (ms)=458
        Superstep 8 ConnectedComponentsComputation (ms)=483
        Superstep 9 ConnectedComponentsComputation (ms)=458
        Total (ms)=27675
Is Input superstep the same as the sum of input time?
Input superstep + sum of supersteps = 1457 + 14463 = 15920

and the total is: 27675
So there is still 11755 ms unaccounted for?
Or have I misunderstood what the sum of input time should be?

Kind regards, 
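
For anyone following along, the arithmetic in the message above can be checked with a short Python sketch. It only subtracts the timer values quoted in that message; the helper name `unaccounted` is made up for illustration, and whether Giraph's Total actually includes the Setup and Shutdown phases is an assumption here, not something this thread confirms.

```python
# Timer values (ms) copied from the message above.
timers = {
    "Initialize": 424,
    "Setup": 85,
    "Input superstep": 1457,
    "Shutdown": 11666,
    "Total": 27675,
}

# Sum over all supersteps as stated by the poster; several per-superstep
# values are truncated in the archive, so they are not re-added here.
SUPERSTEPS_SUM = 14463


def unaccounted(timers, supersteps_sum, also_subtract=()):
    """Hypothetical helper: Total minus input, supersteps and any extra timers."""
    rest = timers["Total"] - timers["Input superstep"] - supersteps_sum
    for name in also_subtract:
        rest -= timers[name]
    return rest


print(unaccounted(timers, SUPERSTEPS_SUM))                         # 11755
print(unaccounted(timers, SUPERSTEPS_SUM, ["Shutdown"]))           # 89
print(unaccounted(timers, SUPERSTEPS_SUM, ["Shutdown", "Setup"]))  # 4
```

With these particular numbers, the gap of 11755 ms is close to the reported Shutdown value (11666 ms), which may be worth keeping in mind when comparing Total against the per-superstep timers.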


On Fri, Apr 18, 2014 at 3:52 PM, ghufran malik <> wrote:

Thank you for the explanation :) 
It was confusing when reading it. Some of the timers I can understand intuitively; however,
I think it would be beneficial if these explanations were added to the API docs, so that anyone
else who is confused can look up the meanings there.


On Fri, Apr 18, 2014 at 3:25 PM, Pavan Kumar A <> wrote:

I wrote the Initialize counter :) Please tell me if the name seems confusing.
So, Initialize = the time spent by the job waiting for resources. In a shared pool the job you
launch may not get all the machines it needs right away. So, for instance, if you want to run
a job with 200 workers, giraph does not start until all the workers are allocated & have
registered with the master.

Setup = once you have all the machines allocated, how much time it takes before starting to
read input
Shutdown = once you have written your output, how much time it takes to stop, verify that
everything is done, shut down resources & notify the user - for instance, wait for all network
connections to close, all threads to join, etc.

Total = sum of input time + sum of time in all supersteps, i.e., the actual time taken to run
by your application after it got all the resources (does not include the time waiting to get
resources, which is Initialize, or the Shutdown time)
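
To look at these counters side by side, the `Name (ms)=value` lines that giraph prints can be pulled into a dictionary. This is a minimal sketch, assuming the timers appear as plain text in exactly that form; `parse_timers` and `TIMER_RE` are hypothetical names, not part of Giraph, and the sample values are the ones quoted further down in this thread.

```python
import re

# Matches "<name> (ms)=<digits>"; names may contain spaces but not tabs,
# newlines or "=", so entries without a value are simply skipped.
TIMER_RE = re.compile(r"([^\t\n=]+?) \(ms\)=(\d+)")


def parse_timers(text):
    """Hypothetical helper: map each timer name to its value in milliseconds."""
    return {name.strip(): int(ms) for name, ms in TIMER_RE.findall(text)}


sample = (
    "Initialize (ms)=775\n"
    "\tSetup (ms)=105\n"
    "\tShutdown (ms)=12537\n"
    "\tTotal (ms)=27075\n"
)

timers = parse_timers(sample)
for name in ("Initialize", "Setup", "Shutdown", "Total"):
    print(f"{name}: {timers[name]} ms")
```

Once the values are in a dictionary, it is easy to compare Initialize, Setup and Shutdown against Total for a given run.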

Date: Fri, 18 Apr 2014 13:28:47 +0100
Subject: Re: input superstep of giraph.


Could you also explain what the following timers correspond to as well please: 

Giraph Timers
    Initialize (ms)=775
    Setup (ms)=105
    Shutdown (ms)=12537
    Total (ms)=27075


On Thu, Apr 17, 2014 at 9:10 PM, Pavan Kumar A <> wrote:

Input consists of
> reading the input (vertices and/or edges as provided) into memory on individual workers
> assigning vertices to partitions and partitions to workers
> moving all partitions (i.e., vertices & their out-edges) to the worker that owns the partition
> doing some bookkeeping of internal data structures to be used during computation
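
The steps above can be sketched as a toy, single-process Python example. This is not Giraph's code: the modulo rules in `partition_of` and `worker_of`, the function `load_input`, and the shape of the input splits are all assumptions made purely for illustration.

```python
from collections import defaultdict


def partition_of(vertex_id, num_partitions):
    # Assumed toy rule: vertex -> partition by modulo.
    return vertex_id % num_partitions


def worker_of(partition_id, num_workers):
    # Assumed toy rule: partition -> owning worker by modulo.
    return partition_id % num_workers


def load_input(splits, num_partitions, num_workers):
    """splits: one list per reading worker of (vertex_id, out_edges) pairs."""
    # worker -> partition -> vertex -> out-edges
    owned = defaultdict(lambda: defaultdict(dict))
    for split in splits:                      # 1. each worker reads its split
        for vertex_id, out_edges in split:
            p = partition_of(vertex_id, num_partitions)  # 2. assign partition
            w = worker_of(p, num_workers)                #    and its owner
            owned[w][p][vertex_id] = list(out_edges)     # 3. move to the owner
    # 4. bookkeeping: here, just the number of vertices held by each worker
    counts = {w: sum(len(vs) for vs in parts.values())
              for w, parts in owned.items()}
    return owned, counts


splits = [[(0, [1]), (1, [2])], [(2, [0]), (3, [])]]
owned, counts = load_input(splits, num_partitions=4, num_workers=2)
print(counts)  # {0: 2, 1: 2}
```

In a real cluster step 3 involves network transfer between machines, which is one reason the input superstep can take a noticeable amount of time.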

Date: Thu, 17 Apr 2014 10:06:03 -0500

Subject: input superstep of giraph.

  From the screen output of a successful giraph program run, what does the following line mean?

Input superstep (ms)=22884

 Does it mean the time used to load the input graph into memory? Thanks.

  Best Regards,


