flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Stephan Ewen <se...@apache.org>
Subject Re: load broadcast set into searchable set
Date Mon, 06 Oct 2014 11:13:32 GMT
You actually can check directly in the broadcast set. But since it is a
list, searching in is slower than in a hash set. That is all...

On Mon, Oct 6, 2014 at 1:09 PM, Martin Neumann <mneumann@spotify.com> wrote:

> Yes, thanks.
>
> I was wondering if I can check directly in the broadcast set but since I
> have to get it local anyway it should be not to much overhead.
>
> cheers Martin
>
> On Mon, Oct 6, 2014 at 12:34 PM, Stephan Ewen <sewen@apache.org> wrote:
>
> > Hej!
> >
> > Yes, the "getRuntimeEnvironment().getBroadcastVariable() returns a list,
> > which you can add to set:
> >
> >
> > // in the function:
> >
> > private Set<T> specials;
> >
> > public void open(Configuration conf) {
> >     List<T> bc =
> > getRuntimeContect().getBroadcastVariable("the-bc-var-name");
> >    specials = new HashSet<T>(bc);
> > }
> >
> >
> >
> > Is that what you had in mind?
> >
> > Stephan
> >
> >
> > On Mon, Oct 6, 2014 at 11:57 AM, Martin Neumann <mneumann@spotify.com>
> > wrote:
> >
> > > Hej,
> > >
> > > I have a Flink job with with a filter step. I now have a list of
> > exceptions
> > > where I need to do some extra work (300k data). I thought I just use a
> > > boradcast set and then for each like compare if its in the exception
> set.
> > >
> > > What is the best way to implement this in Flink? Is there an efficient
> > way
> > > of checking if a certain element is in the Broadcastset? Or can I
> somehow
> > > dump the Broadcastset into a set?
> > >
> > > cheers Martin
> > >
> >
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message