flink-user mailing list archives

From Fabian Hueske <fhue...@gmail.com>
Subject Re: Broadcast state
Date Wed, 02 Oct 2019 09:29:27 GMT
Hi,

State is always associated with a single task in Flink.
The state of a task cannot be accessed by other tasks of the same operator
or tasks of other operators.
This is true for every type of state, including broadcast state.
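
To illustrate, here is a toy model in plain Java. It is only a sketch and does
not use any Flink API: the names BroadcastModel, Task, broadcast and
onBroadcastElement are made up for this example. Each task keeps its own map,
and a broadcast just delivers the same element to every task that consumes the
stream, so every consumer ends up holding its own copy.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: none of these classes are Flink APIs.
final class BroadcastModel {

    /** One parallel task of an operator; its state map is private to it. */
    static final class Task {
        final String name;
        private final Map<String, String> localState = new HashMap<>();

        Task(String name) {
            this.name = name;
        }

        // Analogous to handling one broadcast element: update this task's own copy.
        void onBroadcastElement(String key, String value) {
            localState.put(key, value);
        }

        // Analogous to handling a regular element: read only this task's own copy.
        String lookup(String key) {
            return localState.get(key);
        }

        int stateSize() {
            return localState.size();
        }
    }

    /** Delivers a broadcast element to every task that consumes the stream. */
    static void broadcast(List<Task> tasks, String key, String value) {
        for (Task task : tasks) {
            task.onBroadcastElement(key, value);
        }
    }

    public static void main(String[] args) {
        List<Task> tasks = new ArrayList<>();
        tasks.add(new Task("enrich-0"));      // two tasks of one operator
        tasks.add(new Task("enrich-1"));
        tasks.add(new Task("downstream-0"));  // a task of another operator

        broadcast(tasks, "device-42", "tenant-A");

        // Every task answers from its own copy; no task reads another task's map.
        for (Task task : tasks) {
            System.out.println(task.name + " -> " + task.lookup("device-42")
                    + " (entries: " + task.stateSize() + ")");
        }
    }
}
```

In this model the memory cost grows with the number of consuming tasks, which
is the duplication discussed further down in this thread.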

Best, Fabian


On Tue, Oct 1, 2019 at 08:22, Navneeth Krishnan <reachnavneeth2@gmail.com> wrote:

> Hi,
>
> I can use Redis, but I’m still having a hard time figuring out how I can
> eliminate duplicate data. Today, without broadcast state in 1.4, I’m using a
> cache to lazy-load the data. I thought broadcast state would be similar
> to that of Kafka Streams, where I have read access to the state across the
> pipeline. That would indeed solve a lot of problems. Is there some way I can
> do the same with Flink?
>
> Thanks!
>
> On Mon, Sep 30, 2019 at 10:36 PM Congxian Qiu <qcx978132955@gmail.com>
> wrote:
>
>> Hi,
>>
>> Could you use a cache system such as HBase or Redis to store this
>> data, and query the cache when needed?
>>
>> Best,
>> Congxian
>>
>>
>> Navneeth Krishnan <reachnavneeth2@gmail.com> wrote on Tue, Oct 1, 2019 at 10:15 AM:
>>
>>> Thanks Oytun. The problem with doing that is that the same data will have
>>> to be stored multiple times, wasting memory. In my case there will be around
>>> a million entries, which need to be used by at least two operators for now.
>>>
>>> Thanks
>>>
>>> On Mon, Sep 30, 2019 at 5:42 PM Oytun Tez <oytun@motaword.com> wrote:
>>>
>>>> This is how we currently use broadcast state. Our states are re-usable
>>>> (code-wise): every operator that wants to consume the data keeps the same
>>>> state descriptor locally and fills its own local state in processBroadcastElement.
>>>>
>>>> I am open to suggestions. Is this a hard drawback of dataflow
>>>> programming in general, or of the Flink framework?
>>>>
>>>>
>>>>
>>>> ---
>>>> Oytun Tez
>>>>
>>>> *M O T A W O R D*
>>>> The World's Fastest Human Translation Platform.
>>>> oytun@motaword.com — www.motaword.com
>>>>
>>>>
>>>> On Mon, Sep 30, 2019 at 8:40 PM Oytun Tez <oytun@motaword.com> wrote:
>>>>
>>>>> You can re-use the broadcasted state (along with its descriptor) that
>>>>> comes into your KeyedBroadcastProcessFunction in another operator
>>>>> downstream. That basically duplicates the broadcasted state in every
>>>>> operator that wants to use it.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Sep 30, 2019 at 8:29 PM Navneeth Krishnan <
>>>>> reachnavneeth2@gmail.com> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> Is it possible to access a broadcast state across the pipeline? For
>>>>>> example, say I have a KeyedBroadcastProcessFunction which adds the incoming
>>>>>> data to state, and I have a downstream operator where I need the same state as
>>>>>> well. Would I be able to just read the broadcast state with a read-only
>>>>>> view? I know this is possible in Kafka Streams.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>
