flink-user mailing list archives

From Till Rohrmann <trohrm...@apache.org>
Subject Re: withBroadcastSet for a DataStream missing?
Date Thu, 31 Mar 2016 08:25:25 GMT
Hi Stavros,

you might be able to solve your problem using a CoFlatMap operation with
iterations. You would use one of the inputs for the iteration on which you
broadcast the model updates to every operator. On the other input you would
receive the data points which you want to cluster. As output you would emit
the clustered points and model updates. Here you have to use the split and
select functions to split the output stream into model updates and output
elements. It’s important to broadcast the model updates; otherwise not all
parallel operator instances will have the same clustering model.
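
A minimal sketch of the operator logic described above, written as plain
Java without any Flink dependency so that it runs on its own. It is only a
simulation under stated assumptions: the names OnlineKMeansSketch, Instance,
Update, onPoint, onUpdate and run are hypothetical, points are
one-dimensional, and the centroid update is a simple running mean rather
than any particular streaming k-means variant. In an actual job, onPoint
would roughly correspond to the data input of the CoFlatMap and onUpdate to
the broadcast feedback input.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OnlineKMeansSketch {

    /** A model update (hypothetical type): move centroid `cluster` toward `point`. */
    static final class Update {
        final int cluster;
        final double point;

        Update(int cluster, double point) {
            this.cluster = cluster;
            this.point = point;
        }
    }

    /** One parallel operator instance, holding its own copy of the model. */
    static final class Instance {
        final double[] centroids;
        final long[] counts;

        Instance(double[] initial) {
            centroids = initial.clone();
            counts = new long[initial.length];
        }

        /** Data input: pick the nearest centroid and emit an update for the feedback stream. */
        int onPoint(double x, List<Update> feedback) {
            int best = 0;
            for (int i = 1; i < centroids.length; i++) {
                if (Math.abs(x - centroids[i]) < Math.abs(x - centroids[best])) {
                    best = i;
                }
            }
            feedback.add(new Update(best, x));
            return best; // the cluster assignment is the regular output
        }

        /** Feedback input: apply a broadcast update as a running mean. */
        void onUpdate(Update u) {
            counts[u.cluster]++;
            centroids[u.cluster] += (u.point - centroids[u.cluster]) / counts[u.cluster];
        }
    }

    /** Sends points round-robin to the instances and broadcasts every update to all of them. */
    static Instance[] run(double[] initial, double[] points, int parallelism) {
        Instance[] instances = new Instance[parallelism];
        for (int i = 0; i < parallelism; i++) {
            instances[i] = new Instance(initial);
        }
        for (int p = 0; p < points.length; p++) {
            List<Update> feedback = new ArrayList<>();
            instances[p % parallelism].onPoint(points[p], feedback);
            for (Update u : feedback) {
                for (Instance inst : instances) {
                    inst.onUpdate(u); // broadcast: every instance sees every update
                }
            }
        }
        return instances;
    }

    public static void main(String[] args) {
        Instance[] result = run(new double[] {0.0, 10.0}, new double[] {1.0, 9.0, 2.0, 11.0}, 2);
        for (Instance inst : result) {
            System.out.println(Arrays.toString(inst.centroids));
        }
    }
}
```

Because every update reaches every instance, both instances end up with the
same centroids. If the updates were only forwarded to one instance instead
of broadcast, the copies of the model would drift apart.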


On Tue, Mar 29, 2016 at 7:23 PM, Stavros Kontopoulos <
st.kontopoulos@gmail.com> wrote:

> Hi, I am new here...
> I am trying to implement online k-means with Flink, as described here:
> https://databricks.com/blog/2015/01/28/introducing-streaming-k-means-in-spark-1-2.html
> I don't see a withBroadcastSet call anywhere to save intermediate results.
> Is this currently supported?
> Is saving the intermediate results as state, as in this example, a
> viable alternative?
> https://github.com/StephanEwen/flink-demos/blob/master/streaming-state-machine/src/main/scala/com/dataartisans/flink/example/eventpattern/StreamingDemo.scala
> Thanks,
> Stavros
