flink-user mailing list archives

From Greg Hogan <c...@greghogan.com>
Subject Re: Looping over a DataSet and accesing another DataSet
Date Tue, 01 Nov 2016 16:13:40 GMT
By 'loop' do you refer to an iteration? The output of a bulk iteration is
processed as the input of the following iteration. Values updated in an
iteration are available in the next iteration just as values updated by an
operator are available to the following operator.
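
To make this concrete, here is a minimal sketch in plain Python (deliberately not the Flink API) of how values can carry over between iterations using the edges and vertex values from the example quoted below. The names `join`, `step`, and `bulk_iterate` and the update rule (set A[u] to "d") are assumptions made for illustration only. The idea is that the step function never mutates the joined copies; it emits (vertex, new value) records and builds a new value set, which becomes the input of the next iteration.

```python
# Plain-Python sketch of bulk-iteration semantics; this is NOT Flink code.
edges = [(1, 2), (1, 3)]
values = {1: "a", 2: "b", 3: "c"}  # A[vertex]


def join(values, edges):
    """Join each edge (u, v) with the current A[u] and A[v].
    Every row carries its own copy of the values."""
    return [(u, v, values[u], values[v]) for u, v in edges]


def step(values, edges):
    """One iteration: join, emit updates, build a NEW value set."""
    joined = join(values, edges)
    # Hypothetical update rule: set A[u] to "d" for every edge.
    # Updates are emitted as records instead of mutating joined rows.
    updates = [(u, "d") for u, _v, _a_u, _a_v in joined]
    new_values = dict(values)  # the input of the step is left untouched
    for vertex, new_value in updates:
        new_values[vertex] = new_value
    return new_values


def bulk_iterate(values, edges, max_iterations):
    """The output of each step is the input of the next one."""
    for _ in range(max_iterations):
        values = step(values, edges)
    return values


after_one = step(values, edges)
# The join in the following iteration sees the updated value on every row.
second_join = join(after_one, edges)
result = bulk_iterate(values, edges, 2)
```

Here `second_join` contains the updated A[1] on both rows, because the join is recomputed from the value set produced by the previous step rather than reusing the old joined rows.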

Your chosen algorithm may not be a good fit for distributed processing
frameworks like Flink, Spark, and Hadoop. You may need to recast your
problem into an appropriate, scalable algorithm. Both the Gelly and Machine
Learning libraries have good examples of efficient, scalable algorithms
(Flink's bundled "examples", by contrast, mainly demonstrate specific
functionality).


On Mon, Oct 31, 2016 at 8:52 AM, otherwise777 <wouter@onzichtbaar.net> wrote:

> Thank you for your reply; this is new information for me.
> Regarding the algorithm, I took a closer look and I don't think it will
> work with joining. When looping over the edge set (u,v) we need to be able
> to read and write A[u] and A[v]. If I join them, the join creates a new
> instance of each value, so changing the value in one instance has no
> effect on the others.
> For example, I have the following edges:
>  u v
>  1 2
>  1 3
> With vertices and values:
>  1 a
>  2 b
>  3 c
> If I join them I get:
>  u v u' v'
>  1 2 a b
>  1 3 a c
> If I loop over the joined set and change the u' value of the first instance
> to "d", then in my next loop step it will still be "a".
