flink-user mailing list archives

From Kostas Kloudas <k.klou...@data-artisans.com>
Subject Re: CEP and KeyedStreams doubt
Date Thu, 26 Jan 2017 14:19:51 GMT
Hi Oriol,

The number of keys is related to the number of data structures (NFAs) that Flink is going to
create and keep. Given this, it may make sense to try to reduce your key space (or your
KeyedStreams). Other than that, Flink has no issue handling large numbers of keys.
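
To make the key-space point concrete, here is a minimal plain-Python sketch (not Flink code). It assumes, as a simplification, that one state object is kept per distinct key, and it counts how many such objects two key choices would produce. The sample events and the helper `count_keyed_states` are hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical sample events: (customer_id, user_id, action)
events = [
    ("acme", "u1", "login"),
    ("acme", "u2", "login"),
    ("acme", "u1", "purchase"),
    ("globex", "u1", "login"),
    ("globex", "u3", "logout"),
]


def count_keyed_states(events, key_selector):
    """Return the number of distinct keys, i.e. how many per-key
    state objects would be kept under the one-state-per-key assumption."""
    states = defaultdict(list)  # stand-in for the per-key state (e.g. an NFA)
    for event in events:
        states[key_selector(event)].append(event)
    return len(states)


# Key by (customer, user): one state object per customer's user.
per_user = count_keyed_states(events, lambda e: (e[0], e[1]))

# Key by customer only: fewer state objects, but each one sees
# the events of all of that customer's users mixed together.
per_customer = count_keyed_states(events, lambda e: e[0])

print(per_user, per_customer)
```

Keying by customer alone shrinks the key space, but the events of all of that customer's users then land under the same key, so any per-user logic would have to be handled separately.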

Now, for the issue you mentioned, we hope to get it fixed soon, but there is no concrete
timeline yet.

Hope this helps!

Let us know if you have any issues,
Kostas

> On Jan 26, 2017, at 1:04 PM, Oriol <orioloq@gmail.com> wrote:
> 
> Hello everyone, 
>  
> I'm using the CEP library for event stream processing. 
>  
> I'm splitting the DataStream into different KeyedStreams using keyBy(). In the keyBy(),
> I'm using a tuple of two elements as the key, which means I may have several million
> KeyedStreams, as I need to monitor all of our customers' users.
>  
> Is this the preferred way to use Flink, or should I find a way to reduce the number of
> KeyedStreams, for example by having one per customer instead of one per customer's user
> (and then finding a way later to process each user separately)?
>  
> Also, is the bug reported in https://issues.apache.org/jira/browse/FLINK-5174
> related to the keys of the KeyedStreams? I'm not sure what kind of keys it is related to.
> If so, is it going to be addressed soon?
> 
> Thanks,
> Oriol.

