flink-dev mailing list archives

From Jark Wu <j...@apache.org>
Subject Re: [DISCUSS]: Integrating Flink Table API & SQL with CEP
Date Thu, 08 Jun 2017 06:57:52 GMT
Hi @Kostas, @Fabian, thank you for your support.

@Fabian, I totally agree with you that we should focus on SQL first. Let's
keep Table API in mind and discuss that later.

Regarding the orderBy() clause, I'm not sure about that. I think it
makes sense to make it required in streaming mode (ordering by either rowtime
or proctime). But CEP also works in batch mode, where it is not necessary
to order by some column. Nevertheless, we can support CEP on batch SQL

We are evaluating how to implement MATCH_RECOGNIZE with the CEP library
(its NFA and CEP operator), and we will share a detailed doc and a
prototype in the next few days.
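
To illustrate why the ordering matters for pattern matching, here is a
minimal sketch in plain Python. It is not the Flink CEP API or the planned
implementation: the Event type, the column names rowtime and price, and the
D/U pattern variables are all hypothetical, and a regular expression stands
in for the NFA. It looks for "price falls, then rises" runs, roughly what
PATTERN (D+ U+) would express.

```python
import re
from typing import List, NamedTuple, Optional, Tuple


class Event(NamedTuple):
    rowtime: int    # hypothetical event-time column
    price: float    # hypothetical measure column


def classify(prev: Optional[Event], cur: Event) -> str:
    """Assign a pattern variable by comparing a row with the previous row."""
    if prev is None:
        return "S"  # first row, nothing to compare against
    if cur.price < prev.price:
        return "D"  # price went down
    if cur.price > prev.price:
        return "U"  # price went up
    return "X"      # unchanged


def match_down_up(events: List[Event], order_by_rowtime: bool = True) -> List[Tuple[int, int]]:
    """Return (first rowtime, last rowtime) of every D+ U+ run."""
    # The ORDER BY step: without it, rows are matched in arrival order.
    rows = sorted(events, key=lambda e: e.rowtime) if order_by_rowtime else list(events)
    symbols = "".join(classify(p, c) for p, c in zip([None] + rows[:-1], rows))
    # A regex over the symbol string stands in for the NFA (assumption).
    return [(rows[m.start()].rowtime, rows[m.end() - 1].rowtime)
            for m in re.finditer(r"D+U+", symbols)]


# The same six rows, arriving out of order.
arrived = [Event(3, 7), Event(1, 10), Event(2, 8),
           Event(6, 11), Event(5, 11), Event(4, 9)]

with_order = match_down_up(arrived)                               # ordered by rowtime
without_order = match_down_up(arrived, order_by_rowtime=False)    # arrival order
```

With ordering, the match covers rowtime 2 to 5. In arrival order the same
rows yield a different, spurious match, which is the reason to make the
order explicit on a stream.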

Jark Wu

2017-06-07 21:40 GMT+08:00 Fabian Hueske <fhueske@gmail.com>:

> Thanks Dian and Jark for this proposal!
> As you wrote, Till and I (and Kostas) have been thinking about this for
> some time but haven't had time to work on this feature.
> I think it would be a great addition and value add for Flink's SQL support
> and Table API.
> I read the proposal and think it is very good. We might need to add a bit
> more details, esp. when planning the concrete steps of the implementation.
> A few comments to the proposal:
> - IMO, the development should start focusing on SQL and its semantics.
> Pattern support for the Table API should be added later. We followed that
> approach for the OVER windows and I think it worked quite well.
> - We probably want to reuse as much as possible from the CEP library. That
> means we need to check if the semantics of the CEP library and Oracle's
> PATTERN syntax are aligned (or how we can express the PATTERN semantics
> with the CEP library). This should be one of the first steps, IMO.
> - I would make the orderBy() clause required. In regular SQL rows have no
> order, so we need to make that explicit (this would also be consistent with
> the OVER windows).
> Let me know what you think.
> Best, Fabian
> 2017-06-07 11:41 GMT+02:00 Kostas Kloudas <k.kloudas@data-artisans.com>:
>> Thanks a lot for opening the discussion!
>> This is a really interesting idea that has been in our heads
>> since the first implementation of the CEP library.
>> A big +1 for moving forward with this.
>> And as for the design document, I will definitely have a look
>> and comment there.
>> Kostas
>> On Jun 7, 2017, at 10:05 AM, Jark Wu <jark@apache.org> wrote:
>> Sorry, I forgot to cc you guys @Fabian, @Timo, @Till, @Kostas
>> 2017-06-07 15:42 GMT+08:00 Jark Wu <jark@apache.org>:
>>> Hi devs,
>>> Dian, our teammates, and I have investigated this for a long time. We
>>> think consolidating Flink SQL and CEP is an exciting thing for Flink. It'll
>>> make SQL more powerful and give users the ability to easily and quickly
>>> build CEP applications. I also found that the Flink community has talked
>>> about this idea before, for example on the mailing list [1] [2] and in
>>> Fabian & Till's talk at Flink Forward 2016 [3].
>>> I think now is the right time to bring up this topic again: we already
>>> have a pattern matching foundation in the Flink CEP library, Stream SQL
>>> is ready now, and Calcite already partially supports the pattern matching
>>> syntax! We also drafted a design doc about how to integrate SQL and CEP,
>>> and how to support CEP on the Table API:
>>> https://docs.google.com/document/d/1HaaO5eYI1VZjyhtVPZOi3jVzikU7iK15H0YbniTnN30/edit?usp=sharing
>>> @Fabian, @Timo, @Till, @Kostas I am including you in this discussion; it
>>> would be great to hear your response.
>>> What do others think?
>>> [1] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Add-CEP-library-to-Flink-td9743.html#a9787
>>> [2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/Effort-to-add-SQL-StreamSQL-to-Flink-td9727.html#a9790
>>> [3] https://www.slideshare.net/tillrohrmann/streaming-analytics-cep-two-sides-of-the-same-coin
>>> Regards,
>>> Jark Wu
>>> 2017-06-07 13:50 GMT+08:00 Dian Fu <dianfu@apache.org>:
>>>> Hi everyone,
>>>> Flink's CEP library is a great library for complex event processing, and
>>>> more and more customers are expressing interest in it. But it also has
>>>> a limitation: users usually have to write a lot of code even for a very
>>>> simple pattern-matching use case, as it currently only supports the Java
>>>> API.
>>>> We have investigated some popular CEP products such as Esper [1] and
>>>> Siddhi [2] and found that most of them support SQL-like expression
>>>> languages such as EPL to describe the match pattern. But these solutions
>>>> also have drawbacks: their pattern-matching languages are not standard
>>>> SQL, the learning curve is steep for users, and it is impossible to
>>>> integrate them into the Flink Table API & SQL.
>>>> We find that Oracle's CEP solution CQL [3] supports match_recognize, a
>>>> pattern recognition clause proposed in this paper [4]. The paper proposes
>>>> new syntax for defining match patterns in SQL expressions. Calcite
>>>> already supports part of this standard [5]. I think it would be of great
>>>> value to support pattern recognition through the match_recognize clause
>>>> by integrating it with the Flink Table API & SQL and the Flink CEP
>>>> library. Any thoughts?
>>>> [1] http://www.espertech.com
>>>> [2] https://github.com/wso2/siddhi
>>>> [3] https://docs.oracle.com/middleware/1213/eventprocessing/cql-reference/GUID-34D4968E-C55A-4BC7-B1CE-C84B202217BD.htm#CQLLR1531
>>>> [4] http://web.cs.ucla.edu/classes/winter17/cs240B/notes/row-pattern-recogniton-11.pdf
>>>> [5] https://issues.apache.org/jira/browse/CALCITE-1570
>>>> Best Regards,
>>>> Dian
