flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gyula Fóra <gyula.f...@gmail.com>
Subject Re: Keep Model in Operator instance up to date
Date Wed, 19 Aug 2015 07:37:40 GMT
Hey,

One input of your co-flatmap would be model updates and the other input
would be events to check against the model if I understand correctly.

This means that if your model updates come from more than one stream you
need to union them into a single stream before connecting them with the
event stream and applying the coatmap.

DataStream updates1 = ....
DataStream updates2 = ....
DataStream events = ....

events.connect(updates1.union(updates2).broadcast()).flatMap(...)

Does this answer your question?

Gyula


On Wednesday, August 19, 2015, Welly Tambunan <if05041@gmail.com> wrote:

> Hi Gyula,
>
> Thanks for your response.
>
> However the model can received multiple event for update. How can we do
> that with co-flatmap as i can see the connect API only received single
> datastream ?
>
>
> > ... while external model updates would be tricky to keep consistent.
> Is that still the case if the Operator treat the external model as
> read-only ? We create another stream that will update the external model
> separately.
>
> Cheers
>
> On Wed, Aug 19, 2015 at 2:05 PM, Gyula Fóra <gyfora@apache.org
> <javascript:_e(%7B%7D,'cvml','gyfora@apache.org');>> wrote:
>
>> Hey!
>>
>> I think it is safe to say that the best approach in this case is creating
>> a co-flatmap that will receive updates on one input. The events should
>> probably be broadcasted in this case so you can check in parallel.
>>
>> This approach can be used effectively with Flink's checkpoint mechanism,
>> while external model updates would be tricky to keep consistent.
>>
>> Cheers,
>> Gyula
>>
>>
>>
>>
>> On Wed, Aug 19, 2015 at 8:44 AM Welly Tambunan <if05041@gmail.com
>> <javascript:_e(%7B%7D,'cvml','if05041@gmail.com');>> wrote:
>>
>>> Hi All,
>>>
>>> We have a streaming computation that required to validate the data
>>> stream against the model provided by the user.
>>>
>>> Right now what I have done is to load the model into flink operator and
>>> then validate against it. However the model can be updated and changed
>>> frequently. Fortunately we always publish this event to RabbitMQ.
>>>
>>> I think we can
>>>
>>>
>>>    1. Create RabbitMq listener for model changed event from inside the
>>>    operator, then update the model if event arrived.
>>>
>>>    But i think this will create race condition if not handle correctly
>>>    and it seems odd to keep this
>>>
>>>    2. We can move the model into external in external memory cache
>>>    storage and keep the model up to date using flink. So the operator will
>>>    retrieve that from memory cache
>>>
>>>    3. Create two stream and using co operator for managing the shared
>>>    state.
>>>
>>>
>>> What is your suggestion on keeping the state up to date from external
>>> event ? Is there some kind of best practice for maintaining model up to
>>> date on streaming operator ?
>>>
>>> Thanks a lot
>>>
>>>
>>> Cheers
>>>
>>>
>>> --
>>> Welly Tambunan
>>> Triplelands
>>>
>>> http://weltam.wordpress.com
>>> http://www.triplelands.com <http://www.triplelands.com/blog/>
>>>
>>
>
>
> --
> Welly Tambunan
> Triplelands
>
> http://weltam.wordpress.com
> http://www.triplelands.com <http://www.triplelands.com/blog/>
>

Mime
View raw message