flink-user mailing list archives

From Xingcan Cui <xingc...@gmail.com>
Subject Re: Questions about the V-C Iteration in Gelly
Date Thu, 09 Feb 2017 17:16:35 GMT
Hi Vasia,

thanks for your reply. It helped a lot and I got some new ideas.

a) As you said, I did use the getPreviousIterationAggregate() method in
preSuperstep() of the next superstep.
However, if the (only?) global (aggregate) results cannot be guaranteed to
be consistent, what should we
do with the postSuperstep() method?
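
To make the timing issue in (a) concrete, here is a toy model in plain Java. It is not Gelly code: the class name, the per-task partial sums, and the barrier() method are all assumptions made for illustration. It only shows why a value read before the superstep barrier can differ from task to task, while the value read in the next superstep is stable.

```java
import java.util.Arrays;

/** Toy model of a per-superstep sum aggregate. Plain Java, not the Gelly API. */
final class SuperstepAggregateSketch {
    private final long[] partial; // one partial sum per parallel task (assumed model)
    private long previous;        // merged sum of the last completed superstep

    SuperstepAggregateSketch(int parallelTasks) {
        this.partial = new long[parallelTasks];
    }

    /** A task contributes a value while the superstep is still running. */
    void aggregate(int task, long value) {
        partial[task] += value;
    }

    /** What one task can see before the barrier, e.g. from postSuperstep(). */
    long localView(int task) {
        return partial[task];
    }

    /** Stands in for getPreviousIterationAggregate() in the next superstep. */
    long previousAggregate() {
        return previous;
    }

    /** Superstep barrier: merge the partial sums, publish them, then reset. */
    void barrier() {
        previous = Arrays.stream(partial).sum();
        Arrays.fill(partial, 0L);
    }

    public static void main(String[] args) {
        SuperstepAggregateSketch agg = new SuperstepAggregateSketch(2);
        agg.aggregate(0, 1);
        agg.aggregate(0, 2);
        agg.aggregate(1, 3);
        agg.aggregate(1, 4);
        System.out.println("task 0 sees " + agg.localView(0)); // partial value
        System.out.println("task 1 sees " + agg.localView(1)); // a different partial value
        agg.barrier();
        System.out.println("next superstep sees " + agg.previousAggregate());
    }
}
```

Under this model, anything that needs the global value has to wait for the following superstep, which leaves postSuperstep() useful only for per-task cleanup or for contributing to the aggregate.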

b) Though we can activate vertices through the update method or messages, IMO it may
be more appropriate for users
themselves to decide when to halt a vertex's iteration. Consider a
complex algorithm that contains different
phases inside a vertex-centric iteration. Before moving to the next phase
(which should be synchronized),
there may be some vertices that have already finished their work in the current
phase and are just waiting for the others.
Users may prefer to let the finished vertices idle until the next phase
rather than halt them.
Can we consider adding a voteToHalt() method and some internal variables
to the Vertex/Edge class
(or just creating an "advanced" version of them) to make halting more
controllable?
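
The behaviour proposed in (b) can be sketched as a small simulation in plain Java. Nothing here exists in Gelly: ToyVertex, the State enum, and the phase-switch rule are hypothetical names and assumptions used only to show the difference between idling (waiting for a synchronized phase change) and halting (the equivalent of a voteToHalt() call).

```java
import java.util.List;

/** Toy simulation of phased halting. Plain Java, not the Gelly API. */
final class PhasedHaltingSketch {
    enum State { ACTIVE, IDLE, HALTED }

    /** Hypothetical vertex that needs a fixed number of supersteps per phase. */
    static final class ToyVertex {
        final int[] workPerPhase; // remaining supersteps of work in each phase
        int phase = 0;
        State state = State.ACTIVE;

        ToyVertex(int... workPerPhase) {
            this.workPerPhase = workPerPhase.clone();
        }

        void compute() {
            if (state != State.ACTIVE) {
                return; // idle and halted vertices do no work
            }
            if (workPerPhase[phase] > 0) {
                workPerPhase[phase]--;
            }
            if (workPerPhase[phase] == 0) {
                boolean lastPhase = phase == workPerPhase.length - 1;
                // Last phase done: halt for good. Otherwise idle until the others catch up.
                state = lastPhase ? State.HALTED : State.IDLE;
            }
        }
    }

    /** Runs supersteps until every vertex has halted; returns the superstep count. */
    static int run(List<ToyVertex> vertices) {
        int supersteps = 0;
        while (vertices.stream().anyMatch(v -> v.state != State.HALTED)) {
            for (ToyVertex v : vertices) {
                v.compute();
            }
            supersteps++;
            // Synchronized phase switch: once nobody is active, wake the idle vertices.
            if (vertices.stream().noneMatch(v -> v.state == State.ACTIVE)) {
                for (ToyVertex v : vertices) {
                    if (v.state == State.IDLE) {
                        v.phase++;
                        v.state = State.ACTIVE;
                    }
                }
            }
        }
        return supersteps;
    }

    public static void main(String[] args) {
        List<ToyVertex> vertices = List.of(new ToyVertex(1, 2), new ToyVertex(3, 1));
        System.out.println("supersteps: " + run(vertices));
    }
}
```

In this sketch the first vertex finishes phase one early and idles for two supersteps instead of dropping out, which is the kind of user-controlled waiting described above.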

c) Sorry that I didn't make it clear before. Here the initialization means
a "global" one that executes once
before the iteration. For example, users may want to initialize the
vertices' values by their adjacent edges
before the iteration starts. Maybe we can add an extra coGroupFunction to
the configuration parameters
and apply it before the iteration?
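
As an illustration of the "global" initialization in (c), the following plain-Java sketch sets each vertex value from its adjacent edges once, before any superstep runs. The names ToyEdge and initFromEdges, and the choice of "minimum adjacent edge weight" as the initial value, are assumptions for the example and not part of Gelly.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy one-time vertex initialization from adjacent edges. Plain Java, not the Gelly API. */
final class EdgeBasedInitSketch {
    /** Hypothetical weighted edge, treated as undirected here. */
    static final class ToyEdge {
        final long source;
        final long target;
        final double weight;

        ToyEdge(long source, long target, double weight) {
            this.source = source;
            this.target = target;
            this.weight = weight;
        }
    }

    /**
     * Gives every vertex the minimum weight among its adjacent edges.
     * Vertices without edges keep positive infinity.
     */
    static Map<Long, Double> initFromEdges(List<Long> vertexIds, List<ToyEdge> edges) {
        Map<Long, Double> values = new HashMap<>();
        for (long id : vertexIds) {
            values.put(id, Double.POSITIVE_INFINITY);
        }
        for (ToyEdge e : edges) {
            // Each edge is adjacent to both of its endpoints.
            values.computeIfPresent(e.source, (id, v) -> Math.min(v, e.weight));
            values.computeIfPresent(e.target, (id, v) -> Math.min(v, e.weight));
        }
        return values;
    }

    public static void main(String[] args) {
        List<Long> ids = List.of(1L, 2L, 3L, 4L);
        List<ToyEdge> edges = List.of(new ToyEdge(1, 2, 5.0), new ToyEdge(2, 3, 2.0));
        System.out.println(initFromEdges(ids, edges));
    }
}
```

The point of the sketch is only that this step groups vertices with their edges once, outside the superstep loop, which is different from preSuperstep() running at the start of every superstep.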

What do you think?

(BTW, I opened a PR for FLINK-1526 (MST Lib & Example). Given its
complexity, the example is not
included.)

I really appreciate all your help.

Best,
Xingcan

On Thu, Feb 9, 2017 at 5:36 PM, Vasiliki Kalavri <vasilikikalavri@gmail.com>
wrote:

> Hi Xingcan,
>
> On 7 February 2017 at 10:10, Xingcan Cui <xingcanc@gmail.com> wrote:
>
>> Hi all,
>>
>> I have some questions about the vertex-centric iteration in Gelly.
>>
>> a)  It seems the postSuperstep method is called before the superstep
>> barrier (I got different aggregate values for the same superstep in this
>> method). Is this a bug, or is it by design?
>>
>
> The postSuperstep() method is called inside the close() method of a
> RichCoGroupFunction that wraps the ComputeFunction. The close() method
> is called after the last call to coGroup() in each iteration
> superstep.
> The aggregate values are not guaranteed to be consistent during the same
> superstep when they are computed. To retrieve an aggregate value for
> superstep i, you should use the getPreviousIterationAggregate() method in
> superstep i+1.
>
>
>>
>> b) There is no setHalt method for vertices. When no message is received, a
>> vertex just drops out of the next iteration. Should I manually send messages (like
>> heartbeats) to keep the vertices active?
>>
>
> That's because vertex halting is implicitly controlled by the underlying
> delta iterations of Flink. A vertex will remain active as long as it
> receives a message or it updates its value, otherwise it will become
> inactive. The documentation on Gelly iterations [1] and DataSet iterations
> [2] might be helpful.
>
>
>
>>
>> c) I think we may need an initialization method in the ComputeFunction.
>>
>
>
> There exists a preSuperstep() method for initialization. This one will be
> executed once per superstep before the compute function is invoked for
> every vertex. Would this work for you?
>
>
>
>>
>> Any opinions? Thanks.
>>
>> Best,
>> Xingcan
>>
>>
>>
> I hope this helps,
> -Vasia.
>
>
> [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/libs/gelly/iterative_graph_processing.html#vertex-centric-iterations
> [2]: https://ci.apache.org/projects/flink/flink-docs-release-1.2/dev/batch/iterations.html
>
>
