giraph-user mailing list archives

From "Ing. Alessio Arleo" <ingar...@icloud.com>
Subject Re: "Local-only" aggregators
Date Thu, 26 Mar 2015 08:34:04 GMT
Thanks Claudio :) That’s exactly what I meant. Thanks for the hint!

~~~~~~~~~~~~~~~~~~~

Ing. Alessio Arleo

Dottorando in Ingegneria Industriale e dell’Informazione

Dottore Magistrale in Ingegneria Informatica e dell’Automazione
Dottore in Ingegneria Informatica ed Elettronica

Linkedin: it.linkedin.com/in/IngArleo <http://it.linkedin.com/in/IngArleo>
Skype: Ing. Alessio Arleo

Tel: +39 075 5853920
Cell: +39 349 0575782

~~~~~~~~~~~~~~~~~~~



> On 25 Mar 2015, at 18:25, Claudio Martella <claudio.martella@gmail.com> wrote:
> 
> Hi,
> 
> I'm not sure aggregators necessarily require high traffic. Aggregators are aggregated locally on each worker before they are aggregated on the (corresponding) master worker.
> Anyway, assuming you want to proceed, my understanding is that you want vertices on the same worker to share (aggregated) information. In that case, I'd suggest just using a WorkerContext.
> 
> Hope this helps.
> Claudio
> 
> On Wed, Mar 25, 2015 at 12:47 AM Alessio Arleo <ingarleo@icloud.com> wrote:
> Hello everybody
> 
> I was wondering whether it is possible to extend the concept of an aggregator from a “global” to a “local-only” perspective.
> 
> Normally, aggregators DO cause network traffic because of the cycle: Workers -> AggregatorOwner -> MasterAggregator -> AggregatorOwner -> Workers
> 
> What if I’d like to fetch and aggregate values as I would normally do with aggregators, but without causing this traffic? Let’s assume this situation:
> 
> 1 - Define a custom partitioning class and let it partition the graph. This is the partitioning used to assign vertices to workers.
> 2 - In the computation class, every time the compute method is called on a vertex, the data needed for the computation is stored in the vertex's neighbours but also in non-neighbouring vertices (think of a force-directed layout algorithm, for example: computing the forces requires the distance to both neighbouring and non-neighbouring vertices, with a different kind of force applied to each).
> 
> Given that the compute class is computing on vertex X:
> 	a - I pick information from X's neighbours as I would normally do (iterating over its edges or the incoming messages).
> 	b - When it comes to non-neighbouring vertices, I would like to use data from X's worker only.
> 
> The first thing I tried to understand before asking this question was: does this make any sense? I may be wrong, but I think it does. If I partition my graph to maximize locality, what I am actually trying to do is to reduce the network traffic as much as possible.
> 
> My concern is that if I use aggregators to achieve this result, the network traffic would be heavy, probably losing the advantages of the initial partitioning. What if I could access and modify an aggregator-like local data structure in the same fashion (i.e. “getAggregatedValue”) but without broadcasting it (assuming that I do not need the aggregator to be accessible to every worker)? Or would it be possible to manually assign aggregator owners in order to minimise network traffic (if I need to aggregate all values from vertices in partition 3, and 3 only, I assign ownership of the partition 3 aggregator to the partition 3 worker)?
> 
> I hope this is understandable and that I caught your attention, even if only for a brief moment. Ask me if something is not clear ;)
> 
> Cheers!
> 
> 
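
Editorial sketch of the "local-only" idea from this thread. It is a minimal illustration under stated assumptions, not code from either poster and not part of the Giraph API: the class LocalWorkerState and all of its methods are hypothetical names, and the repulsion formula (k^2 / d, in the style of Fruchterman-Reingold) is swapped in only to give the force-directed example something concrete to compute. In Giraph, state like this would typically live in a subclass of org.apache.giraph.worker.WorkerContext and be reached from the computation via getWorkerContext(); the sketch avoids that dependency so it can run on its own. Thread-safe containers are used on the assumption that several compute threads on one worker may touch the shared state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.DoubleAdder;

/** Hypothetical per-worker shared state; nothing here leaves the worker. */
final class LocalWorkerState {
    // Vertex id -> {x, y}, published by vertices hosted on this worker.
    private final Map<Long, double[]> positions = new ConcurrentHashMap<>();
    // Aggregator-like running sum that is never sent over the network.
    private final DoubleAdder localSum = new DoubleAdder();

    /** Called by a vertex to make its position visible to local vertices. */
    void publishPosition(long vertexId, double x, double y) {
        positions.put(vertexId, new double[] {x, y});
    }

    /** Local counterpart of aggregate(): adds a value to the worker-local sum. */
    void aggregate(double value) {
        localSum.add(value);
    }

    /** Local counterpart of getAggregatedValue(). */
    double getLocalAggregatedValue() {
        return localSum.sum();
    }

    /** Would be called between supersteps to start a fresh aggregation. */
    void resetAggregate() {
        localSum.reset();
    }

    /**
     * Sums the repulsive force on one vertex from every other vertex known
     * to this worker. Magnitude k^2 / d is an assumption for illustration.
     */
    double[] repulsionFromLocalVertices(long vertexId, double k) {
        double[] self = positions.get(vertexId);
        double fx = 0.0;
        double fy = 0.0;
        if (self == null) {
            return new double[] {fx, fy};
        }
        for (Map.Entry<Long, double[]> entry : positions.entrySet()) {
            if (entry.getKey() == vertexId) {
                continue; // skip the vertex itself
            }
            double dx = self[0] - entry.getValue()[0];
            double dy = self[1] - entry.getValue()[1];
            double dist = Math.hypot(dx, dy);
            if (dist == 0.0) {
                continue; // coincident points: direction undefined, ignore
            }
            double force = k * k / dist;
            fx += dx / dist * force;
            fy += dy / dist * force;
        }
        return new double[] {fx, fy};
    }
}
```

In a compute method, step (a) of the original question would still use edges and incoming messages, while step (b) would call something like repulsionFromLocalVertices on the worker-local object. Vertices on other workers are simply not visible, which is the trade-off the question accepts in exchange for avoiding aggregator traffic.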

