flink-user mailing list archives

From Dian Fu <dian0511...@gmail.com>
Subject Re: Parallel CEP
Date Thu, 24 Jan 2019 07:12:56 GMT
I'm afraid you cannot do that. Inputs that share the same key must be processed by the same
CEP operator instance. Otherwise the results will be nondeterministic and incorrect.

Regards,
Dian
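
To illustrate the point above, here is a minimal sketch of why a stream whose records all share one key ends up on a single parallel subtask. It assumes a simplified routing rule (key hash, then key group, then subtask index). Flink's real assignment applies an additional hash function to the key's hash code, so the exact subtask numbers will differ, but the property that equal keys always reach the same subtask is the same. The class name KeyRoutingSketch, the method names, and the sample keys are hypothetical and not taken from the thread.

```java
import java.util.Map;
import java.util.TreeMap;

public class KeyRoutingSketch {

    // Simplified stand-in for key-based routing (an assumption, not Flink's exact code):
    // map the key's hash to a key group, then map the key group to a subtask index.
    static int subtaskFor(Object key, int maxParallelism, int parallelism) {
        int keyGroup = Math.floorMod(key.hashCode(), maxParallelism);
        return keyGroup * parallelism / maxParallelism;
    }

    // Count how many records each subtask would receive for the given keys.
    static Map<Integer, Integer> countPerSubtask(String[] keys, int maxParallelism, int parallelism) {
        Map<Integer, Integer> counts = new TreeMap<>();
        for (String key : keys) {
            counts.merge(subtaskFor(key, maxParallelism, parallelism), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        int maxParallelism = 128; // hypothetical number of key groups
        int parallelism = 4;      // hypothetical number of parallel CEP subtasks

        // Every record carries the same key, as in the performance test described in the thread.
        String[] sameKey = new String[1000];
        java.util.Arrays.fill(sameKey, "trigger-1");
        System.out.println("same key:     " + countPerSubtask(sameKey, maxParallelism, parallelism));

        // Records spread over many distinct keys can be distributed across subtasks.
        String[] manyKeys = new String[1000];
        for (int i = 0; i < manyKeys.length; i++) {
            manyKeys[i] = "trigger-" + i;
        }
        System.out.println("many keys:    " + countPerSubtask(manyKeys, maxParallelism, parallelism));
    }
}
```

With a single key, the resulting map has exactly one entry, no matter how high the parallelism is set. Spreading the load therefore requires a key with more distinct values, which is only valid if the patterns being matched are meant to be detected per key.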

> On Jan 24, 2019, at 2:56 PM, dhanuka ranasinghe <dhanuka.priyanath@gmail.com> wrote:
> 
> In this example the key will be the same. I am using 1 million messages with the same key for performance
> testing, but I still want to process them in parallel. Can't I use the split function and get a SplitStream
> for that purpose?
> 
> On Thu, Jan 24, 2019 at 2:49 PM Dian Fu <dian0511.fu@gmail.com> wrote:
> Hi Dhanuka,
> 
> Does the KeySelector Event::getTriggerID generate the same key for all the inputs,
> or does it generate only a few distinct key values that happen to be hashed to the same
> downstream operator? You can print the results of Event::getTriggerID to check whether that is
> the case.
> 
> Regards,
> Dian
> 
>> On Jan 24, 2019, at 2:08 PM, dhanuka ranasinghe <dhanuka.priyanath@gmail.com> wrote:
>> 
>> Hi Dian,
>> 
>> Thanks for the explanation. Please find attached the screenshot and source code for the
>> above-mentioned use case. The main issue is that although I use a KeyedStream, parallelism is not applied properly:
>> only one host is processing messages.
>> 
>> <Flink-CEP-Keyby.png>
>> 
>> Cheers,
>> Dhanuka
>> 
>> On Thu, Jan 24, 2019 at 1:40 PM Dian Fu <dian0511.fu@gmail.com> wrote:
>> Whether to use a KeyedStream depends on the logic of your job, i.e., whether you are
>> looking for patterns within individual partitions, e.g., patterns for a particular user. If so, you
>> should partition the input data before the CEP operator. Otherwise, the input data should
>> not be partitioned.
>> 
>> Regards,
>> Dian 
>> 
>>> On Jan 24, 2019, at 12:37 PM, dhanuka ranasinghe <dhanuka.priyanath@gmail.com> wrote:
>>> 
>>> Hi Dian,
>>> 
>>> I tried that, but then the Kafka producer only produces to a single partition and only
>>> a single Flink host is working, while the rest do not contribute to processing. I will share the code
>>> and a screenshot.
>>> 
>>> Cheers 
>>> Dhanuka
>>> 
>>> On Thu, 24 Jan 2019, 12:31 Dian Fu <dian0511.fu@gmail.com> wrote:
>>> Hi Dhanuka,
>>> 
>>> For the CEP operator to run in parallel, the input stream should be a
>>> KeyedStream. You can refer to [1] for detailed information.
>>> 
>>> Regards,
>>> Dian
>>> 
>>> [1]: https://ci.apache.org/projects/flink/flink-docs-master/dev/libs/cep.html#detecting-patterns
>>> 
>>> > On Jan 24, 2019, at 10:18 AM, dhanuka ranasinghe <dhanuka.priyanath@gmail.com> wrote:
>>> > 
>>> > Hi All,
>>> > 
>>> > Is there a way to run a CEP function in parallel? Currently CEP runs only sequentially.
>>> > 
>>> > <flink-CEP.png>
>>> > 
>>> > Cheers,
>>> > Dhanuka
>>> > 
>>> > -- 
>>> > Nothing Impossible,Creativity is more important than knowledge.
>>> 
>> 
>> 
>> 
>> -- 
>> Nothing Impossible,Creativity is more important than knowledge.
>> <FlinkCEP.java>
> 
> 
> 
> -- 
> Nothing Impossible,Creativity is more important than knowledge.

