flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: static/dynamic lookups in flink streaming
Date Wed, 21 Dec 2016 17:52:26 GMT
You could read the map from a file in the open() method of a RichMapFunction.
open() is called before the first record is processed, so it can put the
data into the operator state.

The downside of this approach is that the data is replicated: each parallel
instance of the operator holds a full copy of the map.
On the other hand, you do not need to shuffle the data, because each
parallel task can do the lookup locally.
If your <id, name> map is small, this would be the preferred approach.
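
A minimal sketch of the load-in-open pattern described above, written in plain Java so that it runs without the Flink dependency. The class name IdToNameLookup, the lookupPath field, and the "id,name" line format are assumptions made for illustration, not something from this thread. In an actual job the class would extend RichMapFunction<String, String> and open would take Flink's Configuration argument.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.Map;

// Plain-Java stand-in for a RichMapFunction<String, String>.
// Hypothetical names: IdToNameLookup, lookupPath, and the "id,name" file format.
class IdToNameLookup {
    // Path to the lookup file; a String so the function object stays serializable.
    private final String lookupPath;

    // Filled once in open(); every parallel instance would hold its own full copy.
    private Map<String, String> names;

    IdToNameLookup(String lookupPath) {
        this.lookupPath = lookupPath;
    }

    // Mirrors open(): called once, before the first record is processed.
    public void open() throws IOException {
        Map<String, String> loaded = new HashMap<>();
        for (String line : Files.readAllLines(Paths.get(lookupPath))) {
            String[] parts = line.split(",", 2);
            if (parts.length == 2) {
                loaded.put(parts[0].trim(), parts[1].trim());
            }
        }
        names = loaded;
    }

    // Mirrors map(): replaces an id with its name, or passes the id through
    // unchanged when the map has no entry for it (an assumed fallback policy).
    public String map(String id) {
        return names.getOrDefault(id, id);
    }
}
```

Because the lookup is a local HashMap access, no keyBy or shuffle is needed before the map step. The cost is one copy of the map per parallel instance.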

Best, Fabian

2016-12-21 18:46 GMT+01:00 Meghashyam Sandeep V <vr1meghashyam@gmail.com>:

> As a follow up question, can we populate the operator state from an
> external source?
> My use case is as follows: I have a flink streaming process with Kafka as
> a source. I only have ids coming from kafka messages. My lookups
> (<id,name>), which are a static map, come from a different source. I would
> like to use those lookups while applying operators on the stream from Kafka.
> Thanks,
> Sandeep
> On Wed, Dec 21, 2016 at 6:17 AM, Fabian Hueske <fhueske@gmail.com> wrote:
>> OK, I see. Yes, you can do that with Flink. It's actually a very common
>> use case.
>> You can store the names in operator state and Flink takes care of
>> checkpointing the state and restoring it in case of a failure.
>> In fact, the operator state is persisted in the state backends you
>> mentioned before.
>> Best, Fabian
>> 2016-12-21 15:02 GMT+01:00 Meghashyam Sandeep V <vr1meghashyam@gmail.com>:
>>> Hi Fabian,
>>> I meant lookups like IDs to names. For example, I have IDs coming
>>> through the stream and want to replace them with the corresponding names
>>> stored in a cache or somewhere within Flink.
>>> Thanks,
>>> Sandeep
>>> On Dec 21, 2016 12:35 AM, "Fabian Hueske" <fhueske@gmail.com> wrote:
>>>> Hi Sandeep,
>>>> I'm sorry but I think I do not understand your question.
>>>> What do you mean by static or dynamic look ups? Do you want to access
>>>> an external data store and cache data?
>>>> Can you give a bit more detail about your use case?
>>>> Best, Fabian
>>>> 2016-12-20 23:07 GMT+01:00 Meghashyam Sandeep V <vr1meghashyam@gmail.com>:
>>>>> Hi there,
>>>>> I know that there are various state backends to persist state. Is
>>>>> there a similar way to persist static/dynamic look ups and use them while
>>>>> streaming the data in Flink?
>>>>> Thanks,
>>>>> Sandeep
