giraph-user mailing list archives

From Avery Ching <>
Subject Re: Message processing
Date Fri, 09 Sep 2011 16:03:08 GMT

The model you're describing sounds a bit like a hybrid between BSP and 
asynchronous, stream-based computing.  Definitely would be great to 
experiment with either of these a bit.  I too would prefer to eliminate 
any complicated locking models (i.e. a distributed lock manager for all 
vertices).  But it might come to that if we decide that asynchronous 
remote vertex mutation is important.  I think an asynchronous model 
could provide performance benefits over BSP in some cases.  But 
debugging would be more difficult (less deterministic).  So I believe 
both models will be useful for Giraph.
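A rough sketch of what a message-triggered, asynchronous compute() could look like (all names here, `AsyncVertex`, `deliver`, `voteToHalt`, are illustrative stand-ins under the model discussed below, not actual Giraph API):

```java
// Hypothetical sketch: no supersteps; compute() runs whenever a message
// arrives, and the vertex state is locked only locally for the duration
// of the call.  Not Giraph code -- just an illustration of the idea.
class AsyncVertex {
    private long value = Long.MAX_VALUE;
    private boolean halted = false;

    // A message delivery triggers compute() under the vertex's own lock.
    synchronized void deliver(long message) {
        halted = false;          // new work re-activates the vertex
        compute(message);
    }

    synchronized void compute(long message) {
        if (message < value) {
            value = message;     // e.g. a min-propagation update
            // ...would forward 'value' to neighbors here...
        } else {
            voteToHalt();        // no state change: vote to halt
        }
    }

    synchronized void voteToHalt() { halted = true; }
    synchronized long getValue() { return value; }
    synchronized boolean isHalted() { return halted; }
}
```

The per-vertex lock here is purely local, which is the appeal over a distributed lock manager: no cross-worker coordination is needed unless a vertex mutates remote state.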


On 9/9/11 8:26 AM, Jake Mannix wrote:
> On Fri, Sep 9, 2011 at 8:03 AM, Avery Ching <> wrote:
>     The GraphLab model is more asynchronous than BSP.  They allow you
>     to update your neighbors rather than the BSP model of messaging
>     per superstep.  Rather than one massive barrier in BSP, they
>     implement this with vertex locking.  They also allow a vertex to
>     modify the state of its neighbors.  We could certainly add
>     something similar as an alternative computing model, perhaps
>     without locking.  Here's one idea:
>     1) No explicit supersteps (asynchronous)
> This sounds interesting, esp. for streaming algorithms, although I was 
> thinking something slightly less ambitious to start out: still have 
> supersteps (effectively) which describe when each vertex has had a 
> chance to send all messages it wants for this iteration, and has 
> processed all inbound messages.
>     2) All vertices execute compute() (and may or may not send
>     messages) initially
>     3) Vertices can examine their neighbors or any vertex in the graph
>     (issue RPCs to get their state)
> "or any vertex in the graph" sounds pretty scary, but yes, powerful. 
>  I like it when my seemingly radical ideas are made to look not so scary 
> by comparison! :)
>     4) When messages are received by a vertex, compute() is executed
>     on it (and state is locally locked to compute only)
>     5) Vertices still vote to halt when done, indicating the end of
>     the application.
>     6) Combiners can still be used to reduce the number of messages
>     sent (and the number of times compute is executed).
>     This could be fun.  And it would provide an interesting platform
>     for comparing barrier-based vs. vertex-based synchronization.
> Yeah, I think locking is an implementation detail which might be even 
> avoidable: if Vertices are effectively given a messageQueue which they 
> can process from, we could interpolate between buffering and 
> processing messages synchronously.  The per-mapper threading model 
> could get... interesting!
>   -jake
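
Jake's messageQueue idea might be sketched roughly like this, where the queue's capacity interpolates between the two extremes he mentions: a small bound forces senders to wait for the receiver to drain (close to synchronous processing), while a large bound approximates pure buffering.  `QueuedVertex` and its methods are hypothetical names, not existing Giraph code:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical sketch: each vertex owns a bounded inbox.  Capacity is the
// knob interpolating between buffering (large) and synchronous-style
// back-pressure (small).  Illustration only, not Giraph API.
class QueuedVertex {
    private final BlockingQueue<Long> inbox;
    private long sum = 0;

    QueuedVertex(int capacity) {
        this.inbox = new ArrayBlockingQueue<>(capacity);
    }

    // Sender side: returns false when the inbox is full, so the caller
    // must wait for the receiver to drain (back-pressure).
    boolean send(long message) {
        return inbox.offer(message);
    }

    // Receiver side: drain and process whatever has been buffered.
    void processInbox() {
        Long m;
        while ((m = inbox.poll()) != null) {
            sum += m;            // stand-in for compute(m)
        }
    }

    long getSum() { return sum; }
}
```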
