From: M Singh
Reply-To: M Singh
To: "user@flink.apache.org"
Date: Wed, 10 Aug 2016 18:42:41 +0000 (UTC)
Message-ID: <1525102186.12343098.1470854561508.JavaMail.yahoo@mail.yahoo.com>
Subject: Re: Flink : CEP processing
Mailing-List: contact user-help@flink.apache.org; run by ezmlm
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="----=_Part_12343097_1901334837.1470854561499"
archived-at: Wed, 10 Aug 2016 18:43:00 -0000

------=_Part_12343097_1901334837.1470854561499
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable

Thanks for the pointers, Sameer.

The reason I wanted to find out about snapshotting with CEP is that I
thought CEP state might also be snapshotted for recovery. If that is the
case, then events in the CEP might be in two snapshots.

Mans

On Tuesday, August 9, 2016 1:15 PM, Sameer W wrote:

In one of the earlier threads Till explained this to me
(http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/CEP-and-Within-Clause-td8159.html).

1. Within does not use time windows. It sort of uses session windows,
where the session begins when the first event of the pattern is
identified. The timer starts when the "first" event in the pattern
fires. If the pattern completes "within" the designated time (meaning
the "next" and "followed by" events fire as well "within" the time
specified) you have a match; otherwise the window is removed. I don't
know how it is implemented, but I doubt it stores all the events in
memory for the "within" window (there is no need to). It will only store
the relevant events (first, next, followed by, etc.). So memory would
not be an issue here. If two "first" type events are identified, I think
two "within" sessions are created.

2. Snapshotting (I don't know much in this area so I cannot answer). Why
should it be different, though? You are using operators and state. It
should work the same way. But I am not too familiar with that.

3. The "Within" window is not an issue. Even the window preceding it
should not be, unless you are using a WindowFunction by itself with a
really large window (a more memory-friendly alternative is described at
https://ci.apache.org/projects/flink/flink-docs-master/apis/streaming/windows.html#window-functions).

4. The way I am using it, it is working fine. Some of the limitations I
have seen are related to this paper not being fully implemented
(https://people.cs.umass.edu/~yanlei/publications/sase-sigmod08.pdf).
I don't know how to support negation in an event stream, but I don't
need it for now.

Thanks,
Sameer

On Tue, Aug 9, 2016 at 3:45 PM, M Singh wrote:

Hi Sameer:

If we use a within window for event series -

1. Does it interfere with the default time windows?
2. How does it affect snapshotting?
3. If the window is too large, are the events stored in a "processor"
   until the window expires?
4. Are there any other known limitations and best practices of using CEP
   with Flink?

Thanks again for your help.

On Tuesday, August 9, 2016 11:29 AM, Sameer Wadkar wrote:

In that case you need to get them into one stream somehow (keyBy a dummy
value, for example). There is always some logical key to keyBy on when
data is arriving from multiple sources (e.g. some portion of the
timestamp).

You are looking for patterns within something (events happening around
the same time but arriving from multiple devices). That something should
be the key. That's how I am using it.

Sameer

On Aug 9, 2016, at 1:40 PM, M Singh wrote:

Thanks Sameer.

So does that mean that if the event keys are not the same we cannot use
the CEP pattern match? What if events are coming from different sources
and need to be correlated?

Mans

On Tuesday, August 9, 2016 9:40 AM, Sameer W wrote:

Hi,

You will need to use the keyBy operation first to get all the events you
need monitored in a pattern on the same node. Only then can you apply
Pattern, because it depends on the order of the events (first, next,
followed by). I even had to make sure that the events were correctly
sorted by timestamp to ensure that first, next and followed by work
correctly.

Sameer

On Tue, Aug 9, 2016 at 12:17 PM, M Singh wrote:

Hey Folks:

I have a question about CEP processing in Flink - how does Flink
processing work when we have multiple partitions in which the events
used in the pattern sequence might be scattered across multiple
partitions on multiple nodes?

Thanks for your insight.

Mans

------=_Part_12343097_1901334837.1470854561499
Content-Type: text/html; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
------=_Part_12343097_1901334837.1470854561499--
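
The behaviour Sameer describes in the thread (key the stream first, sort
each key's events by timestamp, and let every "first" event open its own
"within" session) can be sketched in plain Python. This is only a toy
simulation under stated assumptions, not the Flink CEP API: the names
Event and match_within are hypothetical, the pattern is a simple
"followed by" sequence of event kinds, and treating the "within" bound
as inclusive is an arbitrary choice for the sketch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    key: str   # logical key (hypothetical), e.g. a device group or time bucket
    kind: str  # event type the pattern refers to
    ts: int    # event timestamp, in seconds


def match_within(events, pattern, within):
    """Toy model of a keyed 'followed by' pattern with a 'within' bound.

    Returns {key: [match, ...]} where each match is the list of events that
    completed `pattern` no later than `within` seconds after its first event.
    """
    # Step 1: the "keyBy" step. Events with different keys never meet.
    by_key = defaultdict(list)
    for event in events:
        by_key[event.key].append(event)

    matches = {}
    for key, group in by_key.items():
        # Step 2: order events by timestamp so first/next/followed-by make sense.
        group.sort(key=lambda e: e.ts)

        open_sessions = []  # partial matches; each stores only relevant events
        found = []
        for event in group:
            # Drop sessions whose timer (started by their first event) expired.
            alive = [p for p in open_sessions if event.ts - p[0].ts <= within]

            # Advance every session that is waiting for this kind of event.
            advanced = []
            for partial in alive:
                if event.kind == pattern[len(partial)]:
                    partial = partial + [event]
                advanced.append(partial)

            # Each "first" event opens its own session.
            if event.kind == pattern[0]:
                advanced.append([event])

            # Completed sessions become matches; the rest stay open.
            open_sessions = []
            for partial in advanced:
                if len(partial) == len(pattern):
                    found.append(partial)
                else:
                    open_sessions.append(partial)

        if found:
            matches[key] = found
    return matches


# Usage: events arrive out of order and interleaved across two keys.
events = [
    Event("dev-2", "start", 1),
    Event("dev-1", "end", 5),
    Event("dev-1", "start", 0),
    Event("dev-2", "end", 30),   # arrives too late for the 10 s bound
    Event("dev-1", "middle", 3),
    Event("dev-2", "middle", 4),
]
result = match_within(events, ["start", "middle", "end"], within=10)
print(sorted(result))
```

Only "dev-1" produces a match here; the "dev-2" session is discarded
because its last event falls outside the 10-second bound. Note that each
open session keeps just the events that advanced the pattern, which is
the reason given in the thread for memory not being a concern.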