flink-user mailing list archives

From Stephan Ewen <se...@apache.org>
Subject Re: Broadcasting sets in Flink Streaming
Date Tue, 25 Aug 2015 14:19:17 GMT
You can do something very similar to broadcast sets like this:

Use a Co-Map function: connect your main data set to one input with regular
("forward") partitioning, and connect your broadcast set to the other input
via "broadcast" partitioning. You can then handle the data from each input
separately in the function's two map methods.
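
To make the two-callback idea concrete, here is a minimal sketch in plain Java
with no Flink dependency, so it runs on its own. The class name LookupCoMap and
the "key=value" entry format are made up for illustration. In a real job the
class would implement Flink's CoMapFunction and be wired up roughly as shown in
the comment; treat that wiring as an outline rather than tested code.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for the two callbacks of a Co-Map function.
// In a Flink job this class would implement
// org.apache.flink.streaming.api.functions.co.CoMapFunction<String, String, String>
// and be wired roughly as (assumed stream names):
//   mainStream.connect(lookupStream.broadcast()).map(new LookupCoMap());
class LookupCoMap {
    // Each parallel instance keeps its own copy of the broadcast data.
    private final Map<String, String> lookup = new HashMap<>();

    // Input 1: elements of the main data set ("forward" partitioning).
    public String map1(String key) {
        String value = lookup.get(key);
        return key + " -> " + (value == null ? "<not yet received>" : value);
    }

    // Input 2: broadcast elements, in the hypothetical form "key=value".
    public String map2(String entry) {
        String[] kv = entry.split("=", 2);
        lookup.put(kv[0], kv.length > 1 ? kv[1] : "");
        return "loaded " + kv[0];
    }

    public static void main(String[] args) {
        LookupCoMap f = new LookupCoMap();
        System.out.println(f.map1("a"));   // main element seen before any broadcast data
        System.out.println(f.map2("a=1")); // broadcast element arrives
        System.out.println(f.map1("a"));   // same main element, now resolvable
    }
}
```

Note that both methods return a value, because a Co-Map function emits exactly
one output per input element on either side.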

This approach does not guarantee that the broadcast data arrives fully before
the non-broadcast data (you may receive events from the main data set
before all broadcast data has been received), but maybe you can work around
that...
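
One possible workaround, sketched here under assumptions that are not part of
the suggestion above: if the broadcast side can signal completeness with an
end marker, the function can buffer main events until that marker arrives.
The sentinel value, the class name BufferingCoFlatMap, and the process() logic
are all hypothetical. Because buffering means zero or many outputs per input,
the shape follows Flink's CoFlatMapFunction (flatMap1/flatMap2 with a
collector) instead of Co-Map; a plain List stands in for the collector so the
sketch runs without Flink.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java stand-in for a two-input flat-map that holds back main events
// until the broadcast data is known to be complete.
class BufferingCoFlatMap {
    // Hypothetical sentinel that the broadcast side sends as its last element.
    static final String END_OF_BROADCAST = "__END__";

    private final Set<String> known = new HashSet<>();      // broadcast data
    private final List<String> pending = new ArrayList<>(); // held-back main events
    private boolean complete = false;

    // Main input: emit right away once the broadcast data is complete, else buffer.
    public void flatMap1(String event, List<String> out) {
        if (complete) {
            out.add(process(event));
        } else {
            pending.add(event);
        }
    }

    // Broadcast input: collect elements; on the sentinel, flush the buffer in order.
    public void flatMap2(String item, List<String> out) {
        if (END_OF_BROADCAST.equals(item)) {
            complete = true;
            for (String event : pending) {
                out.add(process(event));
            }
            pending.clear();
        } else {
            known.add(item);
        }
    }

    // Illustrative processing: tag each event by whether it is in the broadcast data.
    private String process(String event) {
        return event + (known.contains(event) ? ":known" : ":unknown");
    }
}
```

The buffer in this sketch is unbounded and is not part of any checkpointed
state, and a single sentinel only works if the broadcast data comes from one
source instance, so this is a starting point rather than a finished solution.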

On Tue, Aug 25, 2015 at 2:45 PM, Till Rohrmann <trohrmann@apache.org> wrote:

> Hi Tamara,
>
> I think this is not officially supported by Flink yet. However, I think
> that Gyula once had an example where he did something comparable. Maybe he
> can chime in here.
>
> Cheers,
> Till
>
> On Tue, Aug 25, 2015 at 11:15 AM, Tamara Mendt <tammymendt@gmail.com>
> wrote:
>
>> Hello,
>>
>> I have been trying to use the function withBroadcastSet on a
>> SingleOutputStreamOperator (map) the same way I would on a MapOperator for
>> a DataSet. From what I see, this cannot be done. I wonder if there is some
>> way to broadcast a DataSet to the tasks that are performing transformations
>> on a DataStream?
>>
>> I am basically pre-calculating some things with Flink which I later need
>> for the transformations on the incoming data from the stream. So I want to
>> broadcast the resulting datasets from the pre-calculations.
>>
>> Any ideas on how to best approach this?
>>
>> Thanks, cheers
>>
>> Tamara.
>>
>
>
