giraph-dev mailing list archives

From "Maja Kabiljo (JIRA)" <>
Subject [jira] [Commented] (GIRAPH-273) Aggregators shouldn't use Zookeeper
Date Fri, 31 Aug 2012 18:33:08 GMT


Maja Kabiljo commented on GIRAPH-273:

I still don't see how a constant difference of one connection per worker could cause a problem
if the problem wasn't already there. I do agree that most applications don't need the approach
I described, but those are the cases in which even storing aggregators on ZooKeeper worked
fine, and basically whatever we do won't matter much.

I already have an implementation of this. There are just a few smaller bugs I have to fix,
and I also have to wait for RPC to be removed first (I didn't want to leave a big mess with
two completely different code paths). In my implementation there is no need for another barrier
before sending aggregators, since they can be aggregated on the worker that owns them even
before the computation there is done. The owning worker then waits to receive the values from
all the other workers before sending the aggregated result to the master.
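
To make the scheme above concrete, here is a minimal sketch of the owner-side logic. It is
not the actual patch: the class and method names (OwnerAggregationSketch, OwnedSumAggregator,
ownerOf, receivePartial, readyForMaster, finalValue) are hypothetical, and choosing the owner
by hashing the aggregator name, using a sum aggregator, and ignoring duplicate reports are all
assumptions made only for illustration.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch; none of these names come from the Giraph code base. */
class OwnerAggregationSketch {

    /** Picks the worker that owns an aggregator (assumption: hash of its name). */
    static int ownerOf(String aggregatorName, int numWorkers) {
        return Math.floorMod(aggregatorName.hashCode(), numWorkers);
    }

    /** State the owning worker keeps for one sum aggregator during one superstep. */
    static class OwnedSumAggregator {
        private final int numWorkers;
        private final Set<Integer> reported = new HashSet<>();
        private long sum = 0;

        OwnedSumAggregator(int numWorkers) {
            this.numWorkers = numWorkers;
        }

        /** Folds in a partial value as soon as it arrives; no extra barrier first. */
        synchronized void receivePartial(int workerId, long partial) {
            // Assumption: a second report from the same worker is ignored.
            if (reported.add(workerId)) {
                sum += partial;
            }
        }

        /** True once every worker, including the owner itself, has reported. */
        synchronized boolean readyForMaster() {
            return reported.size() == numWorkers;
        }

        /** The value the owner would send to the master once everyone has reported. */
        synchronized long finalValue() {
            if (!readyForMaster()) {
                throw new IllegalStateException("still waiting for some workers");
            }
            return sum;
        }
    }

    public static void main(String[] args) {
        int numWorkers = 3;
        OwnedSumAggregator agg = new OwnedSumAggregator(numWorkers);
        // Partial values may arrive in any order, even while the owner is still computing.
        agg.receivePartial(2, 7L);
        agg.receivePartial(0, 5L);
        agg.receivePartial(1, 1L);
        System.out.println("owner of \"sum\": worker " + ownerOf("sum", numWorkers));
        System.out.println("value for master: " + agg.finalValue());
    }
}
```

The only waiting happens in readyForMaster(), on the owner, so the other workers can send
their partial values as soon as they finish computing.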
> Aggregators shouldn't use Zookeeper
> -----------------------------------
>                 Key: GIRAPH-273
>                 URL:
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Maja Kabiljo
>            Assignee: Maja Kabiljo
> We use ZooKeeper znodes to transfer aggregated values from workers to master and back.
> ZooKeeper is supposed to be used for coordination, and it also has a memory limit which
> prevents users from having aggregators with large value objects. These are the reasons why
> we should implement aggregator gathering and distribution in a different way.
