flink-dev mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: [DISCUSS] FLIP-9: Trigger DSL
Date Wed, 17 Aug 2016 09:57:33 GMT
Regarding Repeat.forever() and the default being to not repeat: the simple
reason is that Beam (née Google Dataflow) provides basically the same thing
with its trigger DSL, and its triggers behave like this. I think it would
not be beneficial to have the same feature in two systems in that space
with opposite behavior. That would be confusing for users.

On the implementation side, I think in most cases you need a way of telling
whether triggers are finished or not anyway. There could be a central
component in the TriggerRunner that has a finished bit for every trigger in
the tree. In most cases the whole set would fit in a single byte. Triggers
could set and query this finished bit. In cases where you know that triggers
can never finish, you could have a dummy implementation of the finished set
that does not store any state and always returns false when queried.
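
To make the finished-set idea a bit more concrete, here is a minimal sketch.
All names in it (FinishedSet, BitFinishedSet, NeverFinishedSet, and the idea
of addressing triggers by an integer index) are assumptions for illustration
only; they are not part of the FLIP or of any existing Flink API.

```java
import java.util.BitSet;

/** Hypothetical interface: tracks which triggers in a trigger tree have finished. */
interface FinishedSet {
    boolean isFinished(int triggerIndex);
    void setFinished(int triggerIndex, boolean finished);
}

/** Keeps one finished bit per trigger in the tree. */
final class BitFinishedSet implements FinishedSet {
    private final BitSet bits;

    BitFinishedSet(int numTriggers) {
        this.bits = new BitSet(numTriggers);
    }

    @Override
    public boolean isFinished(int triggerIndex) {
        return bits.get(triggerIndex);
    }

    @Override
    public void setFinished(int triggerIndex, boolean finished) {
        bits.set(triggerIndex, finished);
    }

    /** Serialized form; at most one byte for trees with up to 8 triggers. */
    byte[] toBytes() {
        return bits.toByteArray();
    }
}

/** Dummy for trees whose triggers can never finish: stores no state. */
final class NeverFinishedSet implements FinishedSet {
    @Override
    public boolean isFinished(int triggerIndex) {
        return false;
    }

    @Override
    public void setFinished(int triggerIndex, boolean finished) {
        // Intentionally ignored: nothing in this tree is allowed to finish.
    }
}
```

The runner would pick the dummy when it can tell up front that no trigger in
the tree can ever finish, so no state has to be written for that window.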

On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <aljoscha@apache.org> wrote:

> Kostas already nicely explained this!
>
> I just want to give some theoretical background. I see the underlying idea
> of triggers as similar to predicates, i.e.
> "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)"
> translates to a predicate "(E and ET) or WT" (where E is a predicate that
> is true when we are in early phase, ET is the early trigger and WT is the
> watermark trigger). The other trigger translates to "(!E and LT) or WT",
> i.e. it triggers if we're not early and LT is true or if the watermark
> trigger is true. If we combine the two we get:
>
> ((E and ET) or WT) and ((!E and LT) or WT)
>
> Now we can eliminate the two parts with E and !E: they can never be true
> at the same time, and WT is common to both "or" branches, so what remains is:
>
> WT and WT
>
> which yields just "WT".
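
The simplification can also be checked by brute force over all truth
assignments. This is only a sketch of the Boolean argument; the class and
method names are made up for illustration and have nothing to do with the
actual trigger classes.

```java
/** Brute-force check that ((E and ET) or WT) and ((!E and LT) or WT) equals WT. */
final class TriggerPredicateCheck {

    /** The combined predicate of the two triggers under All(...). */
    static boolean combined(boolean e, boolean et, boolean lt, boolean wt) {
        boolean earlyBranch = (e && et) || wt;
        boolean lateBranch = (!e && lt) || wt;
        return earlyBranch && lateBranch;
    }

    /** Returns true if the combined predicate equals WT for all 16 assignments. */
    static boolean equivalentToWatermarkTrigger() {
        for (int mask = 0; mask < 16; mask++) {
            boolean e = (mask & 1) != 0;
            boolean et = (mask & 2) != 0;
            boolean lt = (mask & 4) != 0;
            boolean wt = (mask & 8) != 0;
            if (combined(e, et, lt, wt) != wt) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentToWatermarkTrigger()); // prints "true"
    }
}
```
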
>
> Hope that makes sense to you.
>
> Cheers,
> Aljoscha
>
>
> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas <k.kloudas@data-artisans.com> wrote:
>
>> Hello Jark Wu,
>>
>> Both of them will work in the new DSL. The idea is that there should be no
>> restrictions on the combinations one can do.
>>
>> As for what the early and the late triggers do: the early trigger is
>> responsible for specifying when the trigger should fire in the period
>> between the beginning of the window and the time when the watermark passes
>> the end of the window. The late trigger takes over after the watermark
>> passes the end of the window, and specifies when the trigger should fire in
>> the period between endOfWindow and endOfWindow + allowedLateness.
>>
>> So in the case of:
>>         All(EventTimeTrigger.afterEndOfWindow()
>>                                 .withEarlyTrigger(earlyFiringTrigger),
>>                  EventTimeTrigger.afterEndOfWindow()
>>                                 .withLateTrigger(lateFiringTrigger))
>>
>> The trigger will only fire at the end of the window, as this is the only
>> time both triggers will say FIRE.
>>
>> Although the above will work, the example that you gave is a nice one, as
>> it degenerates to:
>>
>>         EventTimeTrigger.afterEndOfWindow()
>>
>> Detecting this and giving the simplest trigger for the job can lead to
>> further optimizations, as it can for example reduce the amount of state the
>> trigger has to keep.
>>
>> That would actually be a very nice addition to have, as in some cases it
>> can lead to performance improvements.
>>
>> Thanks for the feedback!
>>
>> Kostas
>>
>> > On Aug 17, 2016, at 4:36 AM, Jark Wu <wuchong.wc@alibaba-inc.com> wrote:
>> >
>> > Hi,
>> >
>> > It’s a cool design, I really like it! I have two questions here.
>> >
>> > The first is whether we support complex composite triggers, i.e.
>> > nested All and Any, such as:
>> >
>> > Any(
>> >   All(trigger1, trigger2),
>> >   Any(trigger3, trigger4)
>> > )
>> >
>> > Can the above code work?
>> >
>> > Another question: in composite triggers, what’s the behavior of
>> > withEarlyTrigger and withLateTrigger? For example,
>> >
>> > All(EventTimeTrigger.afterEndOfWindow()
>> >                                 .withEarlyTrigger(earlyFiringTrigger),
>> >     EventTimeTrigger.afterEndOfWindow()
>> >                                 .withLateTrigger(lateFiringTrigger))
>> >
>> > Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both work?
>> >
>> >
>> > - Jark Wu
>> >
>> >> On Aug 17, 2016, at 12:24 AM, Kostas Kloudas <k.kloudas@data-artisans.com> wrote:
>> >>
>> >> Hi Aljoscha,
>> >>
>> >> Thanks for the feedback!
>> >>
>> >> It is a nice feature to have. The reason it is not included in the FLIP
>> >> is that I have not seen anybody asking for something similar on the
>> >> mailing list.
>> >>
>> >> A point that I have to add is that it seems (from the user ML) that
>> >> most of the time users expect the “Repeated.forever” behavior to
>> >> be the default.
>> >>
>> >> Given this, I would say that we should make this the default and
>> >> add something like “Repeat.Once” option which will just let the trigger
>> >> fire once, e.g. the first time the counter reaches 5 in your example,
>> >> and then stop.
>> >>
>> >> Otherwise, the trigger specification may become too verbose,
>> >> as the user will have to write “Repeat.forever” for all child triggers.
>> >>
>> >> What do you think?
>> >>
>> >> Kostas
>> >>
>> >>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek <aljoscha@apache.org> wrote:
>> >>>
>> >>> Ah, I just read the document again and noticed that it might be good to
>> >>> differentiate between repeatable triggers and non-repeating triggers. I'm
>> >>> proposing to make most triggers non-repeating with the addition of a
>> >>> trigger that makes other triggers repeatable.
>> >>>
>> >>> Example Non-Repeating:
>> >>> EventTimeTrigger.pastEndOfWindow()
>> >>> .withEarlyFiring(CountTrigger.of(5))
>> >>>
>> >>> this gives me an early firing once I got 5 elements and then an on-time
>> >>> firing once the watermark passes the end of the window.
>> >>>
>> >>> Example with Repeating:
>> >>> EventTimeTrigger.pastEndOfWindow()
>> >>> .withEarlyFiring(Repeated.forever(CountTrigger.of(5)))
>> >>>
>> >>> this gives me early firings whenever I see 5 new elements plus the
>> >>> watermark firing.
>> >>>
>> >>> What do you think?
>> >>>
>> >>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas <k.kloudas@data-artisans.com>
>> >>> wrote:
>> >>>
>> >>>> Thanks Till!
>> >>>>
>> >>>> Kostas
>> >>>>
>> >>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann <trohrmann@apache.org> wrote:
>> >>>>>
>> >>>>> Cool design doc Klou. It's well described with a lot of details. I like
>> >>>>> it a lot :-) +1 for implementing the trigger DSL.
>> >>>>>
>> >>>>> Cheers,
>> >>>>> Till
>> >>>>>
>> >>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas <k.kloudas@data-artisans.com> wrote:
>> >>>>>
>> >>>>>> Thanks for the feedback Ufuk!
>> >>>>>> I will do that.
>> >>>>>>
>> >>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <uce@apache.org> wrote:
>> >>>>>>>
>> >>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes sense
>> >>>>>>> to merge the two documents by moving the Google doc contents to the
>> >>>>>>> Wiki. I think they form one unit.
>> >>>>>>>
>> >>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
>> >>>>>>> <k.kloudas@data-artisans.com> wrote:
>> >>>>>>>> Hi all!
>> >>>>>>>>
>> >>>>>>>> I've created a FLIP for the trigger DSL. These are the triggers
>> >>>>>>>> that we want Apache Flink to support out-of-the-box. This proposal
>> >>>>>>>> builds on various discussions on the mailing list and aims at
>> >>>>>>>> serving as a base for further ones.
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL
>> >>>>>>>>
>> >>>>>>>> FLIP-9 provides a description of the triggers Flink already offers,
>> >>>>>>>> the new ones that we think should be added, what the APIs could look
>> >>>>>>>> like, some discussion of the implementation implications, and some
>> >>>>>>>> ideas on how to implement them.
>> >>>>>>>>
>> >>>>>>>> There is also a shared document giving a bit more insight into the
>> >>>>>>>> implementation implications. Feel free to read it, but please keep
>> >>>>>>>> the discussion on the mailing list.
>> >>>>>>>>
>> >>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing
>> >>>>>>>>
>> >>>>>>>> I would like to start working on the implementation next week.
>> >>>>>>>>
>> >>>>>>>> Let the discussion begin!
>> >>>>>>>>
>> >>>>>>>> Kostas
>> >>>>>>>>
>> >>>>>>>>
>> >>>>>>
>> >>>>>>
>> >>>>
>> >>>>
>> >
>>
>>
