flink-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Ladhari Sadok <laadhari.sa...@gmail.com>
Subject Re: Broadcast to all the other operators
Date Thu, 09 Nov 2017 10:22:34 GMT
Ok thanks Tony, your answer is very helpful.

2017-11-09 11:09 GMT+01:00 Tony Wei <tony19920430@gmail.com>:

> Hi Sadok,
>
> The sample code is just an example to show you how to broadcast the rules
> to all subtasks, but the output from CoFlatMap is not necessary to be
> Tuple2<Rule, Record>. It depends on what you actually need in your Rule
> Engine project.
> For example, if you can apply rule on each record directly, you can emit
> processed records to keyed operator.
> IMHO, the scenario in the article you mentioned is having serval
> well-prepared rules to enrich data, and using DSL files to decide what
> rules that incoming event needs. After enriching, the features for the
> particular event will be grouped by its random id and be calculated by the
> models.
> I think this approach might be close to the solution in that article, but
> it could have some difference according to different use cases.
>
> Best Regards,
> Tony Wei
>
>
> 2017-11-09 17:27 GMT+08:00 Ladhari Sadok <laadhari.sadok@gmail.com>:
>
>>
>> ---------- Forwarded message ----------
>> From: Ladhari Sadok <laadhari.sadok@gmail.com>
>> Date: 2017-11-09 10:26 GMT+01:00
>> Subject: Re: Broadcast to all the other operators
>> To: Tony Wei <tony19920430@gmail.com>
>>
>>
>> Thanks Tony for your very fast answer ,
>>
>> Yes it resolves my problem that way, but with flatMap I will get
>> Tuple2<Rule, Record> always in the processing function (<NULL ,Record>
in
>> case of no rules update available and <newRule,Record> in the other case ).
>> There is no optimization of this solution ? Do you think it is the same
>> solution in this picture : https://data-artisans.com/wp-c
>> ontent/uploads/2017/10/streaming-in-definitions.png ?
>>
>> Best regards,
>> Sadok
>>
>>
>> Le 9 nov. 2017 9:21 AM, "Tony Wei" <tony19920430@gmail.com> a écrit :
>>
>> Hi Sadok,
>>
>> What I mean is to keep the rules in the operator state. The event in Rule
>> Stream is just the change log about rules.
>> For more specific, you can fetch the rules from Redis in the open step of
>> CoFlatMap and keep them in the operator state, then use Rule Stream to
>> notify the CoFlatMap to 1. update some rules or 2. refetch all rules from
>> Redis.
>> Is that what you want?
>>
>> Best Regards,
>> Tony Wei
>>
>> 2017-11-09 15:52 GMT+08:00 Ladhari Sadok <laadhari.sadok@gmail.com>:
>>
>>> Thank you for the answer, I know that solution, but I don't want to
>>> stream the rules all time.
>>> In my case I have the rules in Redis and at startup of flink they are
>>> loaded.
>>>
>>> I want to broadcast changes just when it occurs.
>>>
>>> Thanks.
>>>
>>> Le 9 nov. 2017 7:51 AM, "Tony Wei" <tony19920430@gmail.com> a écrit :
>>>
>>>> Hi Sadok,
>>>>
>>>> Since you want to broadcast Rule Stream to all subtasks, it seems that
>>>> it is not necessary to use KeyedStream.
>>>> How about use broadcast partitioner, connect two streams to attach the
>>>> rule on each record or imply rule on them directly, and do the key operator
>>>> after that?
>>>> If you need to do key operator and apply the rules, it should work by
>>>> changing the order.
>>>>
>>>> The code might be something like this, and you can change the rules'
>>>> state in the CoFlatMapFunction.
>>>>
>>>> DataStream<Rule> rules = ...;
>>>> DataStream<Record> records = ...;
>>>> DataStream<Tuple2<Rule, Record>> recordWithRule =
>>>> rules.broadcast().connect(records).flatMap(...);
>>>> dataWithRule.keyBy(...).process(...);
>>>>
>>>> Hope this will make sense to you.
>>>>
>>>> Best Regards,
>>>> Tony Wei
>>>>
>>>> 2017-11-09 6:25 GMT+08:00 Ladhari Sadok <laadhari.sadok@gmail.com>:
>>>>
>>>>> Hello,
>>>>>
>>>>> I'm working on Rules Engine project with Flink 1.3, in this project I
>>>>> want to update some keyed operator state when external event occurred.
>>>>>
>>>>> I have a Datastream of updates (from kafka) I want to broadcast the
>>>>> data contained in this stream to all keyed operator so I can change the
>>>>> state in all operators.
>>>>>
>>>>> It is like this use case :
>>>>> Image : https://data-artisans.com/wp-content/uploads/2017/10/streami
>>>>> ng-in-definitions.png
>>>>> All article : https://data-artisans.com/blog
>>>>> /real-time-fraud-detection-ing-bank-apache-flink
>>>>>
>>>>> I founded it in the DataSet API but not in the DataStream API !
>>>>>
>>>>> https://ci.apache.org/projects/flink/flink-docs-release-1.3/
>>>>> dev/batch/index.html#broadcast-variables
>>>>>
>>>>> Can some one explain to me who to solve this problem ?
>>>>>
>>>>> Thanks a lot.
>>>>>
>>>>> Flinkly regards,
>>>>> Sadok
>>>>>
>>>>
>>>>
>>
>>
>>
>

Mime
View raw message