flink-user mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: Flink streaming. Broadcast reference data map across nodes
Date Sun, 26 Feb 2017 13:55:48 GMT
Hi,
What Ufuk said is valid. In addition, you can make your function a
RichFunction and load the static data in the open() method, which is
called once per parallel instance before any records are processed.
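
A minimal sketch of that pattern, written in plain Java so it runs without
Flink on the classpath. The class name, the loadReferenceData() helper and
the sample records are hypothetical; in an actual job the class would extend
Flink's RichMapFunction<String, String> and override open(Configuration)
instead of the no-argument open() used here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Flink RichMapFunction<String, String>.
// In a real job: extend RichMapFunction and override open(Configuration).
class EnrichWithReference {
    // transient: the map is rebuilt in open() on each worker,
    // so it is not serialized together with the function object.
    private transient Map<String, String> reference;

    // Lifecycle hook: runs once per parallel instance, before any record.
    public void open() {
        reference = loadReferenceData();
    }

    // Per-record logic: look up the reference entry for the line's key.
    // Assumes (for illustration) that the key is the first comma-separated field.
    public String map(String logLine) {
        String key = logLine.split(",", 2)[0];
        return logLine + "," + reference.getOrDefault(key, "UNKNOWN");
    }

    // Hypothetical loader. A real one would read a file, database or
    // service that every task manager can reach.
    private static Map<String, String> loadReferenceData() {
        Map<String, String> data = new HashMap<>();
        data.put("42", "campaign-a");
        data.put("43", "campaign-b");
        return data;
    }

    public static void main(String[] args) {
        EnrichWithReference fn = new EnrichWithReference();
        fn.open(); // Flink would call this for you
        System.out.println(fn.map("42,click"));
        System.out.println(fn.map("7,view"));
    }
}
```

With this approach every parallel instance holds its own copy of the data, so
the dictionary has to fit into the memory of each task.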

In the future, you might be able to handle this use case with a feature
called side inputs that we're currently working on:
https://docs.google.com/document/d/1hIgxi2Zchww_5fWUHLoYiXwSBXjv-M5eOv-MKQYN3m4/edit

Best,
Aljoscha

On Tue, 21 Feb 2017 at 15:50 Ufuk Celebi <uce@apache.org> wrote:

> On Tue, Feb 21, 2017 at 2:35 PM, Vadim Vararu <vadim.vararu@adswizz.com>
> wrote:
> > Basically, I have a big dictionary of reference data that has to be
> > accessible from all the nodes (in order to join each log line with
> > its reference line).
>
> If the dictionary is small, you can make it part of the closures that
> are sent to the task managers. Just make it part of your function.
>
> If it is large, I'm not sure what the best way to do it is right
> now. I've CC'd Aljoscha, who can probably help...
>
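
For the small-dictionary case described in the quoted reply, a rough sketch
in plain Java: the dictionary is an ordinary serializable field of the
function, so it travels with the function object when that object is
serialized. The class name and the shipToWorker() helper are hypothetical;
the helper only imitates shipping with a Java serialization round trip and
is not a Flink API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

// Hypothetical function object that carries a small dictionary as a field.
class SmallDictionaryFunction implements Serializable {
    private static final long serialVersionUID = 1L;

    // Non-transient and serializable, so it is copied along with the function.
    private final HashMap<String, String> dictionary;

    SmallDictionaryFunction(Map<String, String> dictionary) {
        this.dictionary = new HashMap<>(dictionary);
    }

    String map(String key) {
        return dictionary.getOrDefault(key, "UNKNOWN");
    }

    // Illustration only: serialize and deserialize the function, roughly
    // what happens when a function object is sent to a remote worker.
    static SmallDictionaryFunction shipToWorker(SmallDictionaryFunction fn) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(fn);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (SmallDictionaryFunction) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }
}
```

The size of the serialized function grows with the dictionary, which is why
this only suits small reference data.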
