spark-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Zhaokang Wang <>
Subject Strange behavior of collectNeighbors API in GraphX
Date Fri, 11 Mar 2016 10:11:47 GMT
Hi all,
These days I have        met a problem of GraphX鈥檚 strange behavior on
collectNeighbors        API. It seems that this API has side-effects on the
Pregel API.        It makes Pregel API not work as expected. The following
is a        small code demo to reproduce this strange behavior. You can get       
the whole source code in the attachment.
The steps to        reproduce the side-effects:
Create a little toy graph with a          simple vertex property type:          
 class VertexProperty(val inNeighbor:ArrayBuffer[Long] =
ArrayBuffer[Long]()) extends Serializable { } // Create a data graph.
Vertices:1,2,3; Edges:2 -&gt; 1, 3 -&gt; 1, 2 -&gt; 3....... val purGraph =
Graph(dataVertex, dataEdge).persist()
Call collectNeighbors            method to get both inNeighbor graph and
outNeighbor graph of            the purGraph.            Then outer join the
inNeighborGraph            with purGraph            to get the dataGraph:
// Get inNeighbor and outNeighbor graph from purGraph val inNeighborGraph =
purGraph.collectNeighbors(EdgeDirection.In) // !!May be BUG here!! If we
don't collect outneighbors here, the bug will disappear. val
outNeighborGraph = purGraph.collectNeighbors(EdgeDirection.Out) // Now join
the in neighbor vertex id list to every vertex's property val dataGraph =
purGraph.outerJoinVertices(inNeighborGraph)((vid, property, inNeighborList)
=&gt; {   val inNeighborVertexIds =
inNeighborList.getOrElse(Array[(VertexId, VertexProperty)]()).map(t =&gt;
t._1)   property.inNeighbor ++= inNeighborVertexIds.toBuffer   property })
Conduct a simple Pregel computation on dataGraph.            In the Pregel
vertex program phase, we do nothing but just            print some debug
info. However, in the send message phase,            we find that the
inNeighbor            property of vertex 1 has changed! The inNeighbor             
property values of vertex 1 are inconsistent between the              vertex
program phase and the send message phase!
val result = dataGraph.pregel(Array[Long](), maxIterations = 1,
EdgeDirection.Both)(  // vertex program  (id, property, msg) =&gt;
vertexProgram(id, property, msg),  // send messages  triplet =&gt;
sendMessage(triplet),  // combine messages  (a, b) =&gt; messageCombiner(a,
b))// In the vertex program, we change nothing...def vertexProgram(id:
VertexId, property: VertexProperty, msgSum: Array[Long]):VertexProperty = { 
if(id == 1L)     println("[Vertex Program]Vertex 1's inNeighbor property
length is:" + property.inNeighbor.length)  property}// In the send message
phase, we just check the vertex property of the same vertex. // We should
get the same results in the two phases.def
sendMessage(edge:EdgeTriplet[VertexProperty, Null]):Iterator[(VertexId,
Array[Long])]={// Print vertex 1's inNeighbor lengthif(edge.srcId == 1L) 
println("[Send Message] Vertex 1's inNeighbor property length is:" +
edge.srcAttr.inNeighbor.length)if(edge.dstId == 1L)  println("[Send Message]
Vertex 1's inNeighbor property length is:" +
edge.dstAttr.inNeighbor.length)val sIn = edge.srcAttr.inNeighborval dIn =
edge.dstAttr.inNeighbor//send empty
messageCombiner(a:Array[Long], b:Array[Long]):Array[Long]={
Array.concat(a,b) }
In the program output, we get:
[Vertex Program]Vertex 1's inNeighbor property length is:2[Send Message]
Vertex 1's inNeighbor property length is:0[Send Message] Vertex 1's
inNeighbor property length is:0
More weirdly, if we comment out one of the collectNeighbors           
method call, everything will be OK! As you may notice,            actually
we do not use outNeighborGraph            in our program, so we can comment
the following statement in            the program:
//  val outNeighborGraph = purGraph.collectNeighbors(EdgeDirection.Out)
 If we comment that statement out, you can            find that everything
is Okay now.
[Vertex Program]Vertex 1's inNeighbor property length is:2[Send Message]
Vertex 1's inNeighbor property length is:2[Send Message] Vertex 1's
inNeighbor property length is:2
The behavior of collectNeighbors        is strange. Maybe it is a bug of
GraphX or I call this API        improperly or my vertex property is
improper. Could you please        give some comments on this? Thanks a lot.
Zhaokang Wang
PS. We have tested        the code on Spark 1.5.1 and 1.6.0.
The source code mentioned in the post: Main2.scala

View this message in context:
Sent from the Apache Spark User List mailing list archive at
View raw message