From: Jake Mannix
Date: Fri, 9 Sep 2011 08:26:13 -0700
Subject: Re: Message processing
To: giraph-user@incubator.apache.org

On Fri, Sep 9, 2011 at 8:03 AM, Avery Ching wrote:

> The GraphLab model is more asynchronous than BSP. They allow you to update
> your neighbors rather than the BSP model of messaging per superstep. Rather
> than one massive barrier in BSP, they implement this with vertex locking.
> They also allow a vertex to modify the state of its neighbors. We could
> certainly add something similar as an alternative computing model, perhaps
> without locking. Here's one idea:
>
> 1) No explicit supersteps (asynchronous)

This sounds interesting, especially for streaming algorithms, although I was thinking of something slightly less ambitious to start out: still have (effective) supersteps, which mark when each vertex has had a chance to send all the messages it wants for this iteration and has processed all inbound messages.

> 2) All vertices execute compute() (and may or may not send messages)
> initially
> 3) Vertices can examine their neighbors or any vertex in the graph (issue
> RPCs to get their state)

"Or any vertex in the graph" sounds pretty scary, but yes, powerful. I like it when my seemingly radical ideas are made to look not so scary by comparison! :)

> 4) When messages are received by a vertex, compute() is executed on it (and
> state is locally locked to compute only)
> 5) Vertices still vote to halt when done, indicating the end of the
> application.
> 6) Combiners can still be used to reduce the number of messages sent (and
> the number of times compute is executed).
>
> This could be fun. And provide an interesting comparison platform:
> barrier-based vs. vertex-based synchronization.

Yeah, I think locking is an implementation detail that might even be avoidable: if vertices are effectively given a messageQueue which they can process from, we could interpolate between buffering and processing messages synchronously. The per-mapper threading model could get... interesting!

  -jake
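To make the "no explicit supersteps" idea above concrete, here is a toy sketch — not real Giraph code; `MiniVertex`, `deliver()`, and `run()` are invented names for illustration. Each vertex owns a message queue, compute() re-runs whenever a message wakes it, and the run loop has no global barrier, only quiescence detection via vote-to-halt:

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Queue;

// Toy model: a vertex with its own message queue, woken by deliveries.
class MiniVertex {
    final int id;
    double value;
    boolean halted = false;
    final List<MiniVertex> neighbors = new ArrayList<>();
    final Queue<Double> messageQueue = new LinkedList<>();

    MiniVertex(int id, double value) { this.id = id; this.value = value; }

    // Delivering a message wakes a halted vertex (Pregel-style semantics).
    void deliver(double msg) { messageQueue.add(msg); halted = false; }

    // Example computation: propagate the minimum value seen so far
    // (connected-components style). Sends only when the value improves.
    void compute() {
        double best = value;
        Double msg;
        while ((msg = messageQueue.poll()) != null) best = Math.min(best, msg);
        if (best < value) {
            value = best;
            for (MiniVertex n : neighbors) n.deliver(best);
        }
        halted = true; // vote to halt; a later message can wake us again
    }
}

public class AsyncSketch {
    // Build a path graph 0 - 1 - 2 with values 3, 1, 5 and run to quiescence.
    static double[] run() {
        MiniVertex[] g = { new MiniVertex(0, 3), new MiniVertex(1, 1), new MiniVertex(2, 5) };
        g[0].neighbors.add(g[1]); g[1].neighbors.add(g[0]);
        g[1].neighbors.add(g[2]); g[2].neighbors.add(g[1]);

        // Seed: every vertex announces its value, then any awake vertex
        // computes until all have voted to halt -- no superstep barrier.
        for (MiniVertex v : g)
            for (MiniVertex n : v.neighbors) n.deliver(v.value);
        boolean active = true;
        while (active) {
            active = false;
            for (MiniVertex v : g)
                if (!v.halted) { v.compute(); active = true; }
        }
        return new double[]{ g[0].value, g[1].value, g[2].value };
    }

    public static void main(String[] args) {
        for (double v : run()) System.out.println(v); // all three end at 1.0
    }
}
```

The run loop here still scans vertices in rounds for simplicity; in a real deployment each awake vertex could be scheduled independently, which is where the per-mapper threading gets interesting.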
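Point 6 can also be sketched: a combiner folds pending messages together at delivery time, so a vertex sees one combined value instead of a full queue, cutting both message volume and compute() invocations. `CombinedInbox` and `MessageCombiner` are invented names here; Giraph's actual combiner API differs.

```java
// A combiner over double messages, e.g. Math::min for min-propagation.
interface MessageCombiner { double combine(double a, double b); }

// Inbox that keeps at most one combined message per vertex.
class CombinedInbox {
    private final MessageCombiner combiner;
    private Double pending = null; // at most one combined message

    CombinedInbox(MessageCombiner c) { this.combiner = c; }

    // Fold each new message into the single pending one.
    synchronized void deliver(double msg) {
        pending = (pending == null) ? msg : combiner.combine(pending, msg);
    }

    // Hand the combined message to compute(), or null if none arrived.
    synchronized Double drain() {
        Double p = pending;
        pending = null;
        return p;
    }
}

public class CombinerSketch {
    public static void main(String[] args) {
        CombinedInbox inbox = new CombinedInbox(Math::min); // min combiner
        inbox.deliver(5.0);
        inbox.deliver(2.0);
        inbox.deliver(7.0);
        System.out.println(inbox.drain()); // prints 2.0
        System.out.println(inbox.drain()); // prints null
    }
}
```

This only works for computations whose message handling is associative and commutative (min, max, sum), which is the usual precondition for combiners.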