flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: [DISCUSS] Gelly iteration abstractions
Date Wed, 01 Apr 2015 08:48:38 GMT
Wow, Vasia, a GC limit error in the Optimizer (pre-flight), not the runtime.

That is cannot have been easy to produce ;-) The program does not really
look that tricky, I need to take some closer look at this...

BTW: Before this algorithm runs well, we really need pinning intermediate
results in memory. Otherwise each step of the loop will execute its entire
history every time again. A way to mitigate this for now is to manually
break persist and re-read the intermediate graph. The ML library does that
in some points to break programs into shorter, more feasible portions.

Also: Do you cache the vertex count in the graph? May be worth doing,
otherwise all counts are computed twice (for current graph and for the
previous graph)

Stephan


On Mon, Mar 30, 2015 at 10:33 PM, Vasiliki Kalavri <
vasilikikalavri@gmail.com> wrote:

> Hi,
>
> I'm back looking into this and I tried out the for-loop approach that was
> suggested above.
>
> I implemented a simple algorithm, k-core, which computes the k-core of a
> graph by iteratively filtering out vertices with degree less than k.
> You can find the code in [1].
>
> Unfortunately, this is giving me the following error, with an example graph
> of 10 vertices:
>
> Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit
> exceeded
> at
>
> org.apache.flink.optimizer.plan.PlanNode.mergeBranchPlanMaps(PlanNode.java:106)
> at
>
> org.apache.flink.optimizer.plan.PlanNode.mergeBranchPlanMaps(PlanNode.java:99)
> at
>
> org.apache.flink.optimizer.plan.DualInputPlanNode.<init>(DualInputPlanNode.java:90)
> at
>
> org.apache.flink.optimizer.plan.DualInputPlanNode.<init>(DualInputPlanNode.java:69)
> at
>
> org.apache.flink.optimizer.operators.HashJoinBuildSecondProperties.instantiate(HashJoinBuildSecondProperties.java:81)
> at
>
> org.apache.flink.optimizer.dag.TwoInputNode.instantiate(TwoInputNode.java:607)
> at
>
> org.apache.flink.optimizer.dag.TwoInputNode.addLocalCandidates(TwoInputNode.java:557)
> at
>
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:478)
> at
>
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:309)
> at
>
> org.apache.flink.optimizer.dag.TwoInputNode.getAlternativePlans(TwoInputNode.java:309)
> at
>
> org.apache.flink.optimizer.dag.SingleInputNode.getAlternativePlans(SingleInputNode.java:249)
> at
>
> org.apache.flink.optimizer.dag.SingleInputNode.getAlternativePlans(SingleInputNode.java:249)
> at
>
> org.apache.flink.optimizer.dag.DataSinkNode.getAlternativePlans(DataSinkNode.java:204)
> at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:501)
> at org.apache.flink.optimizer.Optimizer.compile(Optimizer.java:403)
> at
> org.apache.flink.client.LocalExecutor.executePlan(LocalExecutor.java:162)
> at
>
> org.apache.flink.api.java.LocalEnvironment.execute(LocalEnvironment.java:51)
> at
>
> org.apache.flink.api.java.ExecutionEnvironment.execute(ExecutionEnvironment.java:797)
> at org.apache.flink.api.java.DataSet.count(DataSet.java:397)
> at org.apache.flink.graph.Graph.numberOfVertices(Graph.java:913)
> at org.apache.flink.graph.example.KCoreExample.main(KCoreExample.java:91)
>
> Basically, the first 3 iterations are executed normally, then the 4th gets
> stuck and eventually the program fails with the above error message.
> Any help / suggestion would be greatly appreciated ^^
>
> Cheers,
> -Vasia.
>
> [1]:
>
> https://github.com/vasia/flink/commit/62b065b69746e25b74bee84d210d5c5b4ee761bd
>
> On 23 February 2015 at 18:05, Vasiliki Kalavri <vasilikikalavri@gmail.com>
> wrote:
>
> > I see, thanks a lot for the answers!
> > To rephrase my original question, would it make sense to define a
> > closed-loop iteration where the state is the whole graph?
> >
> > If you want to take a look at the current implementation of DMST using
> > delta iteration, Andra has made a PR [1].
> > On a high-level, this algorithm does (more or less) the following:
> >
> > while there are more than one vertices in the graph
> >    for every vertex
> >       select the min-weight edge
> >       add all selected edges to the MST
> >    collapse vertices with the same root into one vertex (iterative
> > connected components-like step)
> >
> > What we are thinking is that we could express this with -maybe- a
> > bulk-like iteration on the Graph, inside which we can use Gelly methods.
> > Would it make sense or shall we go for a for-loop implementation instead?
> >
> > Thanks!
> > -V.
> >
> > [1]: http://github.com/apache/flink/pull/434
> >
> > On 23 February 2015 at 16:18, Stephan Ewen <sewen@apache.org> wrote:
> >
> >> Closed-loop iterations are much more efficient right now. Long for loops
> >> suffer from memory fragmentation (an issue that is in the list to fix).
> >>
> >> Also, closed loops can be stateful (delta iterations) and do not require
> >> task re-deployment.
> >>
> >> On Mon, Feb 23, 2015 at 4:15 PM, Vasiliki Kalavri <
> >> vasilikikalavri@gmail.com
> >> > wrote:
> >>
> >> > I see that's cool :-)
> >> > So, what is the advantage of closed-loop versus for-loop iterations?
> >> > Custom convergence criteria / aggregators and more efficient execution
> >> > plans?
> >> >
> >> > On 23 February 2015 at 15:01, Stephan Ewen <sewen@apache.org> wrote:
> >> >
> >> > > For loops are basically rolled out - they yield long execution
> plans.
> >> > >
> >> > > On Mon, Feb 23, 2015 at 2:44 PM, Vasiliki Kalavri <
> >> > > vasilikikalavri@gmail.com
> >> > > > wrote:
> >> > >
> >> > > > for-loop iterations could cover some cases, I guess, when the
> >> number of
> >> > > > iterations is known beforehand.
> >> > > > Are there currently any restrictions on what can be used inside
a
> >> > > for-loop?
> >> > > > How are they translated into execution plans?
> >> > > >
> >> > > > On 23 February 2015 at 13:08, Stephan Ewen <sewen@apache.org>
> >> wrote:
> >> > > >
> >> > > > > Some things may not work well as "closed-loop" iterations.
> >> > > > >
> >> > > > > Is it possible to express those as for-loop iterations?
> >> > > > >
> >> > > > > On Mon, Feb 23, 2015 at 1:03 PM, Vasiliki Kalavri <
> >> > > > > vasilikikalavri@gmail.com
> >> > > > > > wrote:
> >> > > > >
> >> > > > > > Hi Stephan,
> >> > > > > >
> >> > > > > > yes, this would work for the cases where an algorithm
only
> >> updates
> >> > > the
> >> > > > > > vertex values or only updates the edge values.
> >> > > > > >
> >> > > > > > What we would like to also support is
> >> > > > > > (a) algorithms where both vertices and edges are updated
in
> one
> >> > > > iteration
> >> > > > > > (b) algorithms where the graph structure changes from
one
> >> iteration
> >> > > to
> >> > > > > the
> >> > > > > > next and
> >> > > > > > (c) branching inside an iteration, i.e. executing a
different
> >> > > > "iteration
> >> > > > > > body" based on some condition.
> >> > > > > >
> >> > > > > > We can still currently implement those with regular
Flink
> >> iteration
> >> > > > > > operators, but the resulting code is not that nice
or
> efficient.
> >> > > > > > For example, if we want to update both edges and vertex
> values,
> >> we
> >> > > can
> >> > > > > > still create a solution set where the vertex values
are
> >> attached to
> >> > > > each
> >> > > > > > edge.
> >> > > > > > Regarding branching inside an iteration, we can use
an
> >> aggregator
> >> > > that
> >> > > > > > tracks the iteration phase, but then we need to somehow
make
> the
> >> > > > > different
> >> > > > > > phases to consist of the same operators and also check
the
> >> > branching
> >> > > > > > condition inside each UDF.
> >> > > > > >
> >> > > > > > Cheers,
> >> > > > > > V.
> >> > > > > >
> >> > > > > >
> >> > > > > > On 23 February 2015 at 11:05, Stephan Ewen <sewen@apache.org>
> >> > wrote:
> >> > > > > >
> >> > > > > > > As a workaround, it should always work to get
the Edge and
> >> Vertex
> >> > > > data
> >> > > > > > set
> >> > > > > > > from the graph and use the regular Fink iteration
operators?
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Sun, Feb 22, 2015 at 4:53 PM, Vasiliki Kalavri
<
> >> > > > > > > vasilikikalavri@gmail.com
> >> > > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Hi,
> >> > > > > > > >
> >> > > > > > > > yes, I was referring to the parallel Boruvka
algorithm.
> >> There
> >> > are
> >> > > > > > several
> >> > > > > > > > ways to implement this one in Flink and I
believe that the
> >> one
> >> > > > > > described
> >> > > > > > > in
> >> > > > > > > > the paper (vertex-centric) is not the most
elegant one :)
> >> > > > > > > >
> >> > > > > > > > Andra is now working on an idea that uses
the delta
> >> iteration
> >> > > > > > abstraction
> >> > > > > > > > and we believe that it will be both more
efficient and
> >> easier
> >> > to
> >> > > > > > > > understand. It has the edges in the solution
set and the
> >> > vertices
> >> > > > in
> >> > > > > > the
> >> > > > > > > > workset, so it follows the pattern I describe
in (2) in my
> >> > > previous
> >> > > > > > > e-mail.
> >> > > > > > > > As a next step, we would like to see how
having an
> iteration
> >> > > > operator
> >> > > > > > > that
> >> > > > > > > > could update the whole graph -what I describe
as (3)-
> would
> >> > make
> >> > > > this
> >> > > > > > > even
> >> > > > > > > > nicer.
> >> > > > > > > >
> >> > > > > > > > Any ideas are highly welcome!
> >> > > > > > > >
> >> > > > > > > > Cheers,
> >> > > > > > > > V.
> >> > > > > > > >
> >> > > > > > > > On 22 February 2015 at 16:32, Andra Lungu
<
> >> > lungu.andra@gmail.com
> >> > > >
> >> > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Alex,
> >> > > > > > > > >
> >> > > > > > > > > Vasia is talking about the second version(presented
> >> Friday)
> >> > of
> >> > > > > > Parallel
> >> > > > > > > > > Boruvka, which can be found here:
> >> > > > > > > > > https://github.com/TU-Berlin-DIMA/IMPRO-3.WS14/pull/59
> >> > > > > > > > >
> >> > > > > > > > > I will propose the third, non-Pregel
like approach
> >> directly
> >> > to
> >> > > > > Gelly
> >> > > > > > > > soon.
> >> > > > > > > > >
> >> > > > > > > > > If you have additional questions, I
will be happy to
> >> answer
> >> > > them.
> >> > > > > > > > >
> >> > > > > > > > > Andra
> >> > > > > > > > >
> >> > > > > > > > > On Sun, Feb 22, 2015 at 4:23 PM, Alexander
Alexandrov <
> >> > > > > > > > > alexander.s.alexandrov@gmail.com>
wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi Vasia,
> >> > > > > > > > > >
> >> > > > > > > > > > I am trying to look at the problem
in more detail.
> Which
> >> > > > version
> >> > > > > of
> >> > > > > > > the
> >> > > > > > > > > MST
> >> > > > > > > > > > are you talking about?
> >> > > > > > > > > >
> >> > > > > > > > > > Right now in the Gelly repository
I can only find the
> >> SSSP
> >> > > > > example
> >> > > > > > > > > > (parallel Bellman-Ford) from Section
4.2 in [1].
> >> > > > > > > > > >
> >> > > > > > > > > > However, it seems that the issues
encountered by Andra
> >> are
> >> > > > > related
> >> > > > > > to
> >> > > > > > > > the
> >> > > > > > > > > > implementation of Parallel Boruvka
(Section 3.2 in
> >> [2]). Is
> >> > > > that
> >> > > > > > > > correct?
> >> > > > > > > > > >
> >> > > > > > > > > > Regards,
> >> > > > > > > > > > A.
> >> > > > > > > > > >
> >> > > > > > > > > > [1] http://www.vldb.org/pvldb/vol7/p1047-han.pdf
> >> > > > > > > > > > [2] http://www.vldb.org/pvldb/vol7/p577-salihoglu.pdf
> >> > > > > > > > > >
> >> > > > > > > > > > 2015-02-19 21:03 GMT+01:00 Vasiliki
Kalavri <
> >> > > > > > > vasilikikalavri@gmail.com
> >> > > > > > > > >:
> >> > > > > > > > > >
> >> > > > > > > > > > > Hello beautiful Flink people,
> >> > > > > > > > > > >
> >> > > > > > > > > > > during the past few days,
Andra and I have been
> >> > discussing
> >> > > > > about
> >> > > > > > > how
> >> > > > > > > > to
> >> > > > > > > > > > > extend Gelly's iteration methods.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Alexander's course (and his
awesome students) has
> >> made it
> >> > > > > obvious
> >> > > > > > > > that
> >> > > > > > > > > > > vertex-centric iterations
are not the best fit for
> >> > > algorithms
> >> > > > > > which
> >> > > > > > > > > don't
> >> > > > > > > > > > > follow the common "propagate-update"
pattern. For
> >> > example,
> >> > > > > Andra
> >> > > > > > is
> >> > > > > > > > > > working
> >> > > > > > > > > > > on an implementation of Minimum
Spanning Tree, which
> >> > > requires
> >> > > > > > > > branching
> >> > > > > > > > > > > inside an iteration and also
requires a convergence
> >> check
> >> > > of
> >> > > > an
> >> > > > > > > > > internal
> >> > > > > > > > > > > iteration. Others also reported
similar issues [1,
> 2].
> >> > > Trying
> >> > > > > to
> >> > > > > > > fit
> >> > > > > > > > > such
> >> > > > > > > > > > > algorithms to the vertex-centric
model leads to long
> >> and
> >> > > ugly
> >> > > > > > code,
> >> > > > > > > > > e.g.
> >> > > > > > > > > > > aggregators to keep track
of algorithm phases,
> >> > duplicating
> >> > > > > data,
> >> > > > > > > etc.
> >> > > > > > > > > > >
> >> > > > > > > > > > > One limitation of the vertex-centric
and the
> upcoming
> >> GAS
> >> > > > model
> >> > > > > > is
> >> > > > > > > > that
> >> > > > > > > > > > > they both only allow the vertex
values to be updated
> >> in
> >> > > each
> >> > > > > > > > iteration.
> >> > > > > > > > > > > However, for some algorithms
we need to update the
> >> edge
> >> > > > values
> >> > > > > > and
> >> > > > > > > in
> >> > > > > > > > > > > others we need to update both.
In even more complex
> >> > > > situations
> >> > > > > > > (like
> >> > > > > > > > > > > Andra's MST) in some iterations
we need to update
> the
> >> > > vertex
> >> > > > > > values
> >> > > > > > > > and
> >> > > > > > > > > > in
> >> > > > > > > > > > > some iterations we need to
update the edge values.
> >> > > > > > > > > > > Another problem is that we
currently don't have a
> way
> >> to
> >> > > > allow
> >> > > > > > > > > different
> >> > > > > > > > > > > computational phases inside
an iteration. This is
> >> > something
> >> > > > > that
> >> > > > > > > > Giraph
> >> > > > > > > > > > > solves with master compute,
a function that is
> >> executed
> >> > > once
> >> > > > > > before
> >> > > > > > > > > each
> >> > > > > > > > > > > superstep and sets the computation
function.
> >> > > > > > > > > > >
> >> > > > > > > > > > > All that said, I believe that
we can solve most of
> >> these
> >> > > > issues
> >> > > > > > if
> >> > > > > > > we
> >> > > > > > > > > > > nicely expose Flink's iteration
operators in Gelly.
> I
> >> can
> >> > > see
> >> > > > > the
> >> > > > > > > > > > following
> >> > > > > > > > > > > cases:
> >> > > > > > > > > > >
> >> > > > > > > > > > > 1. Bulk & delta iterations
where the solution set is
> >> the
> >> > > > vertex
> >> > > > > > > > > dataset:
> >> > > > > > > > > > > this will be similar to vertex-centric
and GAS, but
> >> will
> >> > > > allow
> >> > > > > > more
> >> > > > > > > > > > > flexible dataflows inside
the iteration.
> >> > > > > > > > > > > 2. Bulk & delta iterations
where the solution set is
> >> the
> >> > > edge
> >> > > > > > > > dataset:
> >> > > > > > > > > > for
> >> > > > > > > > > > > the cases where we need to
update edge values.
> >> > > > > > > > > > > 3. Bulk & delta iterations
where the solution set is
> >> the
> >> > > > Graph:
> >> > > > > > > this
> >> > > > > > > > > will
> >> > > > > > > > > > > cover more complex cases,
where the algorithm
> updates
> >> > both
> >> > > > > > vertices
> >> > > > > > > > and
> >> > > > > > > > > > > edges or even adds/removes
vertices/edges, i.e.
> >> updates
> >> > the
> >> > > > > whole
> >> > > > > > > > > Graph.
> >> > > > > > > > > > >
> >> > > > > > > > > > > What do you think? I can see
1 & 2 being very easy
> to
> >> > > > > implement,
> >> > > > > > > but
> >> > > > > > > > I
> >> > > > > > > > > > > suspect 3 won't be that easy
(but so awesome to have
> >> ^^).
> >> > > > > > > > > > > Would it work the way a Graph
is represented now,
> i.e.
> >> > > with 2
> >> > > > > > > > DataSets?
> >> > > > > > > > > > >
> >> > > > > > > > > > > Any comment, idea, pointer
would be much
> appreciated!
> >> > Thank
> >> > > > you
> >> > > > > > ^^
> >> > > > > > > > > > >
> >> > > > > > > > > > > Cheers,
> >> > > > > > > > > > > -V.
> >> > > > > > > > > > >
> >> > > > > > > > > > > [1]:
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> http://apache-flink-incubator-user-mailing-list-archive.2336050.n4.nabble.com/Can-a-master-class-control-the-superstep-in-Flink-Spargel-td733.html
> >> > > > > > > > > > > [2]:
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> http://issues.apache.org/jira/browse/FLINK-1552?focusedCommentId=14325769&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14325769
> >> > > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message