flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stavros Kontopoulos <st.kontopou...@gmail.com>
Subject Re: withBroadcastSet for a DataStream missing?
Date Fri, 01 Apr 2016 19:51:29 GMT
Ok thnx Till i will give it a shot!

On Thu, Mar 31, 2016 at 11:25 AM, Till Rohrmann <trohrmann@apache.org>
wrote:

> Hi Stavros,
>
> you might be able to solve your problem using a CoFlatMap operation with
> iterations. You would use one of the inputs for the iteration on which you
> broadcast the model updates to every operator. On the other input you would
> receive the data points which you want to cluster. As output you would emit
> the clustered points and model updates. Here you have to use the split
> and select function to split the output stream into model updates and
> output elements. It’s important to broadcast the model updates, otherwise
> not all operators have the same clustering model.
>
> Cheers,
> Till
> ​
>
> On Tue, Mar 29, 2016 at 7:23 PM, Stavros Kontopoulos <
> st.kontopoulos@gmail.com> wrote:
>
>> H i am new here...
>>
>> I am trying to implement online k-means as here
>> https://databricks.com/blog/2015/01/28/introducing-streaming-k-means-in-spark-1-2.html
>> with flink.
>> I dont see anywhere a withBroadcastSet call to save intermediate results
>> is this currently supported?
>>
>> Is intermediate results state saved somewhere like in this example a
>> viable alternative:
>>
>> https://github.com/StephanEwen/flink-demos/blob/master/streaming-state-machine/src/main/scala/com/dataartisans/flink/example/eventpattern/StreamingDemo.scala
>>
>> Thnx,
>> Stavros
>>
>
>

Mime
View raw message