giraph-user mailing list archives

From Anirudh Perugu <anirudh.per...@stonybrook.edu>
Subject Re: Running Giraph 1.1 on Hadoop 2.7.2
Date Sat, 05 Mar 2016 15:18:43 GMT
Sure Hassan, thanks for the reply!

On Fri, Mar 4, 2016 at 2:46 PM, Hassan Eslami <hsn.eslami@gmail.com> wrote:

> Anirudh,
>
> 1) AFAIK, load balancing is not implemented in Giraph. The mechanism for
> partition migration is implemented, though, so you may want to use it to
> implement your own load balancer inside the framework. Take a look at
> BspServiceWorker#exchangeVertexPartitions for this purpose.
>
> 2) i. Look at PartitionUtils#computePartitionCount. Generally, if you have
> n machines, the number of partitions will be n*n (each worker gets n
> partitions). You can set the total number of partitions with the flag
> -Dgiraph.userPartitionCount (for instance,
> -Dgiraph.userPartitionCount=100 gives 100 partitions in total).
> ii. The number of partitions generally remains constant throughout the
> computation. It is computed once at the beginning and stays the same for
> the rest of the computation.
> iii. There are statistics (such as how many vertices each partition has
> and how much time it took to process each partition; see, for instance,
> the PartitionStats class), which are mostly used for logging.
>
> Best,
> Hassan
>
> On Thu, Mar 3, 2016 at 1:46 PM, Anirudh Perugu <
> anirudh.perugu@stonybrook.edu> wrote:
>
>> Hi,
>>
>> I am a Giraph newbie and have read how Giraph works, but I have a couple
>> of questions.
>>
>> 1. If a machine has too much work to do, is it possible to migrate work
>> to another machine for faster computation? (Or is this handled by the
>> master through partitions?)
>>
>> (Please see the diagram below.)
>> 2. i. How is the number of partitions decided?
>> ii. What kind of statistics are stored, and how do they help the master
>> choose the number of partitions for the next superstep?
>> iii. These statistics are kept in memory (because they cannot be written
>> to disk), am I correct?
>> [image: Inline image 2]
>>
>> Thanks,
>> Anirudh
>>
>
>
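
For reference, the rule of thumb in Hassan's point 2) i. can be sketched as
below. This is only an illustration of the n*n default and of the
giraph.userPartitionCount override as described in the thread. It is not the
code of PartitionUtils#computePartitionCount, which may apply further
adjustments. The class name, method name, and the convention that a
non-positive value means "not set" are all assumptions made for this example.

```java
// Illustrative sketch only; NOT Giraph's actual implementation.
public class PartitionCountSketch {

    /**
     * Hypothetical helper mirroring the rule of thumb from the thread.
     *
     * @param workers            number of workers (n)
     * @param userPartitionCount value the user passed for
     *                           giraph.userPartitionCount; a value <= 0 is
     *                           treated here as "not set" (an assumption)
     * @return total number of partitions for the whole job
     */
    static int partitionCount(int workers, int userPartitionCount) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        if (userPartitionCount > 0) {
            // An explicit user setting fixes the total partition count.
            return userPartitionCount;
        }
        // Default rule of thumb: n workers, n partitions each, n*n in total.
        return Math.multiplyExact(workers, workers);
    }

    public static void main(String[] args) {
        // 4 workers, no override: 4 * 4 partitions.
        System.out.println(partitionCount(4, 0));
        // 4 workers, user asked for 100 partitions in total.
        System.out.println(partitionCount(4, 100));
    }
}
```

Since the count is computed once at the start of the job (point 2) ii.), a
helper like this would be called a single time, not once per superstep.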
