giraph-dev mailing list archives

From Avery Ching
Subject Re: How does scaling work in Giraph?
Date Mon, 02 Jul 2012 08:00:52 GMT
Praveen, response inline.  Hope it's helpful.

On 6/30/12 10:47 AM, Praveen Sripati wrote:
> Could someone respond to the below mail please?
> Thanks,
> Praveen
> On Thu, Jun 28, 2012 at 7:04 PM, Praveen Sripati wrote:
>> During the 24th minute of the recent Hadoop Summit Video [1] Avery Ching
>> talks about how Giraph is made scalable. I am interested in Hama which is
>> also based on the BSP model and would like to know more details on how
>> Giraph is made scalable.
>> Basically, at the end of each superstep, the BSP tasks send some metrics
>> to the master, and the master repartitions the data held by the most
>> loaded BSP tasks and uses the free map slots to process it.
>> 1) Where is the code for the above logic? I am new to Giraph.
See BspWorker#finishSuperstep()

>> 2) What is the logic behind the partitioning of the data in the master
>> after the super step? Let's say that the data has been partitioned using
>> Hash partitioning.
See GraphPartitionerFactory
>> 3) Similarly will Giraph also scale down? Will the partitions be merged?

This is totally up to the implementation of GraphPartitionerFactory.
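
To make the two answers above a bit more concrete, here is a rough sketch in plain Java. It is not Giraph code and does not use the GraphPartitionerFactory interface; the class PartitionSketch and the methods partitionFor and assign are made-up names for illustration only. It assumes vertices are hashed into a fixed number of partitions, and that something like the master hands partitions to workers based on reported per-partition sizes, largest partition first, each to the currently least loaded worker.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical illustration only: none of these names come from Giraph.
public class PartitionSketch {

    // Hash partitioning: map a vertex id to a partition in [0, numPartitions).
    static int partitionFor(long vertexId, int numPartitions) {
        // floorMod keeps the result non-negative even for negative hash codes.
        return Math.floorMod(Long.hashCode(vertexId), numPartitions);
    }

    // Greedy balancing: visit partitions from largest to smallest and give
    // each one to the worker with the smallest load so far.
    // Returns owner[p] = index of the worker that holds partition p.
    static int[] assign(long[] partitionSizes, int numWorkers) {
        Integer[] order = new Integer[partitionSizes.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order,
            Comparator.comparingLong((Integer p) -> partitionSizes[p]).reversed());

        long[] load = new long[numWorkers];
        int[] owner = new int[partitionSizes.length];
        for (int p : order) {
            int best = 0;
            for (int w = 1; w < numWorkers; w++) {
                if (load[w] < load[best]) {
                    best = w;
                }
            }
            owner[p] = best;
            load[best] += partitionSizes[p];
        }
        return owner;
    }
}
```

In this sketch, scaling down is the same call with a smaller numWorkers: the partitions themselves are not merged, several of them simply end up on the same worker. Whether a real partitioner merges or splits partitions is, as said, an implementation choice.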

>> Thanks,
>> Praveen
>> [1] -
