Mailing-List: contact giraph-user-help@incubator.apache.org; run by ezmlm
Precedence: bulk
Reply-To: giraph-user@incubator.apache.org
Received-SPF: pass (nike.apache.org: domain of jake.mannix@gmail.com
 designates 209.85.213.175 as permitted sender)
MIME-Version: 1.0
In-Reply-To: 
 <CADYHM8z5BwghJzua_wsZFx4TuD8nsOfN=Rau8LCUz8s1d783-Q@mail.gmail.com>
References: 
 <CADYHM8zOJv0UgK1T+0-ehO5V15CXupcGyOVSDMzBx4Rx5nnBbg@mail.gmail.com>
 <4EE1B704.2000600@apache.org>
 <CADYHM8z5BwghJzua_wsZFx4TuD8nsOfN=Rau8LCUz8s1d783-Q@mail.gmail.com>
From: Jake Mannix <jake.mannix@gmail.com>
Date: Fri, 9 Dec 2011 10:03:32 -0800
Message-ID: 
 <CACYXym98FcxBXOaeV+-LuV46QGtFFcTyDwNdZHAX2Ztna7ouLQ@mail.gmail.com>
Subject: Re: Comparing BSP and MR
To: giraph-user@incubator.apache.org
Content-Type: multipart/alternative; boundary=20cf300fb09f9a623404b3ac9b6a

--20cf300fb09f9a623404b3ac9b6a
Content-Type: text/plain; charset=ISO-8859-1

[hama-user to bcc:]

Let's not crosspost, please; it makes the thread of conversation totally
opaque as to who is talking about what.

On Fri, Dec 9, 2011 at 1:42 AM, Praveen Sripati <praveensripati@gmail.com> wrote:

> Thanks to Thomas and Avery for the response.
>
> > For Giraph you are quite correct, all the stuff is submitted as a MR
> > job. But a full map stage is not a superstep, the whole computation is
> > done in one mapping phase.
>
> So a map task in MR corresponds to a computation phase in a superstep.
> Once the computation phase for a superstep is complete, the vertex output
> is stored using the defined OutputFormat, the message sent (may be) to
> another vertex and the map task is stopped. Once the barrier
> synchronization phase is complete, another set of map tasks are invoked for
> the vertices which have received a message.
>

In Giraph, a superstep does not write anything out through an OutputFormat.
The data all lives in memory from the time the first superstep starts to
the time the final superstep stops (except that for tolerance of failures,
checkpoints are stored to disk at user-specified intervals).  There is only
one set of map tasks for the Giraph job - those long-running map tasks run
possibly many supersteps.
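
To make the "one long-running task, many supersteps" point concrete, here
is a minimal sketch in plain Java. It is not Giraph's API: the class name,
the method name, and the max-value propagation algorithm are all made up
for illustration. It only shows the shape of the loop: vertex values and
messages stay in in-memory maps, and the loop iterates until no messages
are in flight.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, NOT Giraph's API: one long-running worker loops over
// supersteps while vertex values and messages stay in memory.
final class SuperstepLoopSketch {

    /** Spreads the maximum vertex value through the graph; returns supersteps run. */
    static int runMaxValue(Map<Integer, Integer> values,
                           Map<Integer, List<Integer>> edges) {
        Map<Integer, List<Integer>> inbox = new HashMap<>();
        int superstep = 0;
        while (true) {
            Map<Integer, List<Integer>> outbox = new HashMap<>();
            for (Integer v : new ArrayList<>(values.keySet())) {
                List<Integer> msgs =
                        inbox.getOrDefault(v, Collections.<Integer>emptyList());
                // A vertex computes in superstep 0, or when it received messages.
                if (superstep > 0 && msgs.isEmpty()) {
                    continue;
                }
                int old = values.get(v);
                int best = old;
                for (int m : msgs) {
                    best = Math.max(best, m);
                }
                values.put(v, best);
                // Send along out-edges only if there is something new to report.
                if (superstep == 0 || best != old) {
                    for (int n : edges.getOrDefault(v, Collections.<Integer>emptyList())) {
                        outbox.computeIfAbsent(n, k -> new ArrayList<>()).add(best);
                    }
                }
            }
            superstep++;
            // Barrier: with no messages in flight, every vertex is idle, so stop.
            if (outbox.isEmpty()) {
                return superstep;
            }
            inbox = outbox; // delivered at the start of the next superstep
        }
    }

    public static void main(String[] args) {
        Map<Integer, Integer> values = new HashMap<>();
        values.put(1, 3);
        values.put(2, 6);
        values.put(3, 2);
        Map<Integer, List<Integer>> edges = new HashMap<>();
        edges.put(1, Arrays.asList(2));
        edges.put(2, Arrays.asList(1, 3));
        edges.put(3, Arrays.asList(2));
        int steps = runMaxValue(values, edges);
        System.out.println("supersteps=" + steps + " values=" + values);
    }
}
```

Nothing in that loop touches stable storage; in this sketch output would
only be written once, after the loop returns.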


> In a regular MR Job (not Giraph) the number of Map tasks equals the
> number of InputSplits. But, in case of Giraph the total number of maps to
> be launched is usually more than the number of input vertices.
>

Number of maps > number of input vertices?  Not at all.  That would be
insane.  We want to be able to run over multi-billion vertex graphs.  Are we
going to launch multiple billions of mappers?  The splitting of the data
in Giraph is very similar to that in a regular MR job: divide up your input
data among the number of mappers you have, and you're off and running.
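
As a rough illustration of "many vertices, few workers", the sketch below
spreads vertex ids over a fixed number of workers by hashing. The hash
scheme and all names are assumptions for the example; how Giraph actually
assigns vertices to workers is not something this sketch claims to show.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch, NOT Giraph's partitioner: a small, fixed number of
// workers each own a large slice of the vertices.
final class WorkerAssignmentSketch {

    /** Picks a worker index in [0, numWorkers) for a vertex id. */
    static int workerFor(long vertexId, int numWorkers) {
        // floorMod keeps the result non-negative even for negative hashes.
        return Math.floorMod(Long.hashCode(vertexId), numWorkers);
    }

    /** Groups vertex ids by the worker that would own them. */
    static Map<Integer, List<Long>> assign(long[] vertexIds, int numWorkers) {
        Map<Integer, List<Long>> byWorker = new TreeMap<>();
        for (long id : vertexIds) {
            byWorker.computeIfAbsent(workerFor(id, numWorkers),
                    k -> new ArrayList<>()).add(id);
        }
        return byWorker;
    }

    public static void main(String[] args) {
        long[] ids = new long[1_000_000];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }
        // A million vertices, but only four workers (map tasks).
        for (Map.Entry<Integer, List<Long>> e : assign(ids, 4).entrySet()) {
            System.out.println("worker " + e.getKey()
                    + " owns " + e.getValue().size() + " vertices");
        }
    }
}
```

The number of workers is chosen by whoever submits the job; the number of
vertices only changes how much each worker holds.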


>
> > Where are the incoming, outgoing messages and state stored
> > Memory
>
> What happens if a particular node is lost in case of Hama and Giraph? Are
> the messages not persisted somewhere to be fetched later?


If nodes are lost, the system has to roll back to the most recent
checkpoint, where graph state has been persisted to HDFS.  Messages are not
currently persisted, but the graph state that produced those messages is,
so they can be regenerated by recomputing from the checkpoint.
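
A minimal sketch of that recovery idea, under loud assumptions: the
"checkpoint" here is just an in-memory copy of an array rather than a write
to HDFS, the superstep function is an arbitrary deterministic toy, and the
names are invented. It shows why persisting state alone is enough when
supersteps are deterministic: after rolling back, replaying the lost
supersteps gives the same answer as a run with no failure.

```java
import java.util.Arrays;

// Hypothetical sketch, NOT Giraph's checkpointing code. State is saved
// every `interval` supersteps; a simulated failure rolls back and replays.
final class CheckpointSketch {

    /** One deterministic superstep on a ring: add the left neighbour's value. */
    static long[] step(long[] s) {
        int n = s.length;
        long[] next = new long[n];
        for (int i = 0; i < n; i++) {
            next[i] = s[i] + s[(i + n - 1) % n];
        }
        return next;
    }

    /** Runs `supersteps` steps; fails once at superstep `failAt` (-1 = never). */
    static long[] run(long[] initial, int supersteps, int interval, int failAt) {
        long[] state = initial.clone();
        long[] checkpoint = initial.clone();
        int checkpointStep = 0;
        boolean failed = false;
        int s = 0;
        while (s < supersteps) {
            if (s % interval == 0) {
                checkpoint = state.clone(); // stand-in for persisting to disk
                checkpointStep = s;
            }
            if (s == failAt && !failed) {
                failed = true;
                state = checkpoint.clone(); // in-memory state is lost; restore
                s = checkpointStep;         // replay from the checkpoint
                continue;
            }
            state = step(state);
            s++;
        }
        return state;
    }

    public static void main(String[] args) {
        long[] init = {1, 2, 3, 4};
        long[] clean = run(init, 10, 3, -1);
        long[] recovered = run(init, 10, 3, 7);
        System.out.println("same result after recovery: "
                + Arrays.equals(clean, recovered));
    }
}
```

The cost of a failure in this sketch is the replayed supersteps between the
checkpoint and the failure, which is the trade-off behind choosing the
checkpoint interval.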


> > In Giraph, vertices can move around workers between supersteps.  A
> > vertex will run on the worker that it is assigned to.
>
> Is data locality considered while moving vertices around workers in Giraph?
>

Data is all in memory, and typical graph algorithms already send roughly
the size of the entire graph (on the order of the total number of edges)
out over distributed RPC in any given superstep, so shuffling the graph
around by RPC does not add much extra work.


>
> > As you can see, you could write a MapReduce Engine with BSP on top of
> > Apache Hama.
>
> It's being done the other way, BSP is implemented in Giraph using Hadoop.


I'll let the Hama people explain how one would implement MR on top of
Hama.  You are correct that in Giraph, the Hadoop
JobTracker/TaskTracker and HDFS are used as a substrate to help implement
BSP (although I would not say that "MR" is being used to implement BSP, as
there is no MR going on in Giraph).

  -jake


>
>
> Praveen
>
> On Fri, Dec 9, 2011 at 12:51 PM, Avery Ching <aching@apache.org> wrote:
>
>>  Hi Praveen,
>>
>> Answers inline.  Hope that helps!
>>
>> Avery
>>
>> On 12/8/11 10:16 PM, Praveen Sripati wrote:
>>
>> Hi,
>>
>> I know about MapReduce/Hadoop and am trying to get my head around
>> BSP/Hama-Giraph by comparing MR and BSP.
>>
>> - Map Phase in MR is similar to Computation Phase in BSP. BSP allows for
>> processes to exchange data in the communication phase, but there is no
>> communication between the mappers in the Map Phase. Though the data flows
>> from Map tasks to Reducer tasks. Please correct me if I am wrong. Any other
>> significant differences?
>>
>> I suppose you can think of it that way.  I like to compare a BSP
>> superstep to a MapReduce job since it's computation and communication.
>>
>> - After going through the documentation for Hama and Giraph, I noticed that
>> they both use Hadoop as the underlying framework. In both Hama and Giraph
>> an MR Job is submitted. Does each superstep in BSP correspond to a Job in
>> MR? Where are the incoming, outgoing messages and state stored - HDFS or
>> HBase or Local or pluggable?
>>
>>  My understanding of Hama is that they have their own BSP framework.
>> Giraph can be run on a Hadoop installation, it does not have its own
>> computational framework.  A Giraph job is submitted to a Hadoop
>> installation as a Map-only job.  Hama will have its own BSP launching
>> framework.
>>
>> In Giraph, the state is stored all in memory.  Graphs are loaded/stored
>> through VertexInputFormat/VertexOutputFormat (very similar to Hadoop).  You
>> could implement your own VertexInputFormat/VertexOutputFormat to use HDFS,
>> HBase, etc. as your graph stable storage.
>>
>> - If a Vertex is deactivated and again activated after receiving a
>> message, does it run on the same node or a different node in the cluster?
>>
>>  In Giraph, vertices can move around workers between supersteps.  A
>> vertex will run on the worker that it is assigned to.
>>
>> Regards,
>> Praveen
>>
>>
>>
>

--20cf300fb09f9a623404b3ac9b6a--