flink-dev mailing list archives

From Aljoscha Krettek <aljos...@apache.org>
Subject Re: [DISCUSS] FLIP-9: Trigger DSL
Date Wed, 17 Aug 2016 10:12:43 GMT
Hi,
I think that would blow up state, since there can be several triggers that
need this kind of state; Any and All come to mind, possibly. If each of
those keeps its own state, that's at least a byte per trigger. If the finished
state were kept centrally by the TriggerRunner, it would be just one byte for
everything in most cases.

As I said, in some cases keeping that extra bit can be avoided. For
example, if you have Repeat.forever(Some.trigger()) you know that the
finished bit will always be false and so you don't keep any state in the
TriggerRunner. If every trigger manually does that bookkeeping you remove
that possibility while increasing complexity in each Trigger implementation.
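
To make the idea concrete, here is a minimal Java sketch of such a central
finished set. FinishedSet, BitSetFinishedSet, NeverFinishedSet and forTree are
hypothetical names made up for illustration, not part of Flink or the FLIP. The
sketch only assumes that every trigger in the tree has an index.

```java
import java.util.BitSet;

final class FinishedSetSketch {

    /** Hypothetical interface: one "finished" flag per trigger in the tree. */
    public interface FinishedSet {
        boolean isFinished(int triggerIndex);
        void setFinished(int triggerIndex, boolean finished);
        int stateSizeInBytes();
    }

    /** Central bookkeeping: all flags packed together, one bit per trigger. */
    public static final class BitSetFinishedSet implements FinishedSet {
        private final int numTriggers;
        private final BitSet bits;

        BitSetFinishedSet(int numTriggers) {
            this.numTriggers = numTriggers;
            this.bits = new BitSet(numTriggers);
        }

        public boolean isFinished(int triggerIndex) {
            return bits.get(triggerIndex);
        }

        public void setFinished(int triggerIndex, boolean finished) {
            bits.set(triggerIndex, finished);
        }

        public int stateSizeInBytes() {
            // Up to 8 triggers fit into a single byte.
            return (numTriggers + 7) / 8;
        }
    }

    /** Dummy for trees that can never finish: stores nothing, always false. */
    public static final class NeverFinishedSet implements FinishedSet {
        public boolean isFinished(int triggerIndex) {
            return false;
        }

        public void setFinished(int triggerIndex, boolean finished) {
            // Intentionally keeps no state.
        }

        public int stateSizeInBytes() {
            return 0;
        }
    }

    /** Assumption: the runner knows whether the root is wrapped in Repeat.forever(). */
    public static FinishedSet forTree(int numTriggers, boolean rootRepeatsForever) {
        return rootRepeatsForever
                ? new NeverFinishedSet()
                : new BitSetFinishedSet(numTriggers);
    }

    public static void main(String[] args) {
        FinishedSet central = forTree(5, false);
        central.setFinished(3, true);
        System.out.println(central.isFinished(3) + " " + central.stateSizeInBytes());

        FinishedSet dummy = forTree(5, true);
        dummy.setFinished(3, true);
        System.out.println(dummy.isFinished(3) + " " + dummy.stateSizeInBytes());
    }
}
```

With five triggers the central variant needs a single byte, while the dummy
variant needs none, which is the saving that per-trigger bookkeeping would
rule out.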

Cheers,
Aljoscha

On Wed, 17 Aug 2016 at 12:05 Kostas Kloudas <k.kloudas@data-artisans.com>
wrote:

> Hi Aljoscha,
>
> On the Repeat.? addition, I think that each trigger will have to have
> its own implementation, e.g. the CountTrigger should just set a dummy
> value in the counter in order to know if it should fire again or not.
>
> Otherwise, we will have to add more state, and this can lead to significant
> performance degradation, as in most cases this state has to be checked on
> every element.
>
> Another potential solution, which I am not sure covers all cases, could
> be to have a State abstraction like CompositeState, apart from the
> Value, List, Reduce, Fold, which can fetch more than one type of state
> with one round trip to the backend. Imagine having the "counter" and the
> "canceled" states in the same entry in the backend and always fetching them
> together. This can lead to zero additional cost for the extra state.
>
> What do you think?
>
> Kostas
>
> > On Aug 17, 2016, at 11:57 AM, Aljoscha Krettek <aljoscha@apache.org> wrote:
> >
> > Regarding Repeat.forever() and the default being to not repeat. The simple
> > reason is that Beam (née Google Dataflow) provides basically the same thing
> > with their trigger DSL and that their triggers behave like this. I think it
> > would not be beneficial to have the same feature in two systems in that
> > space where the behavior is the opposite. That would make it confusing for
> > users.
> >
> > On the implementation side, I think in most cases you need to have a way of
> > telling when triggers are finished or not anyways. There could be a central
> > component in the TriggerRunner that has a finished bit for every trigger in
> > the tree. In most cases this would be a simple byte. Triggers could set and
> > query this finished bit. In some cases, where you know that triggers can
> > never finish you could have a dummy implementation of the finished set that
> > does not store any state and always returns false when queried.
> >
> > On Wed, 17 Aug 2016 at 11:52 Aljoscha Krettek <aljoscha@apache.org> wrote:
> >
> >> Kostas already nicely explained this!
> >>
> >> I just want to give some theoretical background. I see the underlying idea
> >> of triggers as similar to predicates, i.e.
> >> "EventTimeTrigger.afterEndOfWindow().withEarlyTrigger(earlyFiringTrigger)"
> >> translates to a predicate "(E and ET) or WT" (where E is a predicate that
> >> is true when we are in early phase, ET is the early trigger and WT is the
> >> watermark trigger). The other trigger translates to "(!E and LT) or WT",
> >> i.e. it triggers if we're not early and LT is true or if the watermark
> >> trigger is true. If we combine the two we get:
> >>
> >> ((E and ET) or WT) and ((!E and LT) or WT)
> >>
> >> now we can eliminate the two parts with E and !E because they can never be
> >> true at the same time and each of them is in an "or" with WT:
> >>
> >> WT and WT
> >>
> >> which yields just "WT".
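
The simplification can be checked exhaustively. The following Java sketch is
only an illustration: the class and method names are made up, and the four
booleans stand for E, ET, LT and WT from the predicates above. It evaluates the
combined predicate for all 16 assignments and compares it with WT.

```java
final class TriggerPredicateCheck {

    /** (E and ET) or WT: the trigger with an early firing. */
    static boolean earlyVariant(boolean e, boolean et, boolean wt) {
        return (e && et) || wt;
    }

    /** (!E and LT) or WT: the trigger with a late firing. */
    static boolean lateVariant(boolean e, boolean lt, boolean wt) {
        return (!e && lt) || wt;
    }

    /** All(...) of the two triggers: both predicates must hold. */
    static boolean combined(boolean e, boolean et, boolean lt, boolean wt) {
        return earlyVariant(e, et, wt) && lateVariant(e, lt, wt);
    }

    /** Returns true if the combined predicate equals WT for every assignment. */
    static boolean equivalentToWatermarkTrigger() {
        for (int mask = 0; mask < 16; mask++) {
            boolean e = (mask & 1) != 0;
            boolean et = (mask & 2) != 0;
            boolean lt = (mask & 4) != 0;
            boolean wt = (mask & 8) != 0;
            if (combined(e, et, lt, wt) != wt) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(equivalentToWatermarkTrigger()); // prints "true"
    }
}
```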
> >>
> >> Hope that makes sense to you.
> >>
> >> Cheers,
> >> Aljoscha
> >>
> >>
> >> On Wed, 17 Aug 2016 at 10:47 Kostas Kloudas <k.kloudas@data-artisans.com>
> >> wrote:
> >>
> >>> Hello Jark Wu,
> >>>
> >>> Both of them will work in the new DSL. The idea is that there should be no
> >>> restrictions on the combinations one can do.
> >>>
> >>> As for what the early and the late triggers do: the early trigger will
> >>> be responsible for specifying when the trigger should fire in the period
> >>> between the beginning of the window and the time when the watermark passes
> >>> the end of the window. The late trigger takes over after the watermark
> >>> passes the end of the window, and specifies when the trigger should fire
> >>> in the period between the endOfWindow and endOfWindow + allowedLateness.
> >>>
> >>> So in the case of the:
> >>>        All(EventTimeTrigger.afterEndOfWindow()
> >>>                                .withEarlyTrigger(earlyFiringTrigger),
> >>>                 EventTimeTrigger.afterEndOfWindow()
> >>>                                .withLateTrigger(lateFiringTrigger))
> >>>
> >>> The trigger will only fire at the end of the window, as this is the only
> >>> time both triggers will say FIRE.
> >>>
> >>> Although the above will work, the example that you gave is a nice one as it
> >>> degenerates to:
> >>>
> >>>        EventTimeTrigger.afterEndOfWindow()
> >>>
> >>> Detecting this and giving the simplest trigger for the job can lead to
> >>> further optimizations, as it can for example reduce the amount of state
> >>> the trigger has to keep.
> >>>
> >>> That would actually be a very nice addition to have, as in some cases it
> >>> can lead to performance improvements.
> >>>
> >>> Thanks for the feedback!
> >>>
> >>> Kostas
> >>>
> >>>> On Aug 17, 2016, at 4:36 AM, Jark Wu <wuchong.wc@alibaba-inc.com> wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>> It’s a cool design, I really like it! I have two questions here.
> >>>>
> >>>> The first is whether we have complex composite triggers, i.e.
> >>>> nested All and Any, such as:
> >>>>
> >>>> Any(
> >>>>  All(trigger1, trigger2),
> >>>>  Any(trigger3, trigger4)
> >>>> )
> >>>>
> >>>> Can the above code work?
> >>>>
> >>>> Another question is: in composite triggers, what’s the behavior of
> >>>> withEarlyTrigger and withLateTrigger? For example,
> >>>>
> >>>> All(EventTimeTrigger.afterEndOfWindow()
> >>>>                                .withEarlyTrigger(earlyFiringTrigger),
> >>>>    EventTimeTrigger.afterEndOfWindow()
> >>>>                                .withLateTrigger(lateFiringTrigger))
> >>>>
> >>>> Is it legal? Will the earlyFiringTrigger and lateFiringTrigger both work?
> >>>>
> >>>>
> >>>> - Jark Wu
> >>>>
> >>>>> On Aug 17, 2016, at 12:24 AM, Kostas Kloudas <k.kloudas@data-artisans.com> wrote:
> >>>>>
> >>>>> Hi Aljoscha,
> >>>>>
> >>>>> Thanks for the feedback!
> >>>>>
> >>>>> It is a nice feature to have. The reason it is not included in the FLIP
> >>>>> is that I have not seen anybody asking for something similar on the
> >>>>> mailing list.
> >>>>>
> >>>>> A point that I have to add is that it seems (from the user ML) that
> >>>>> most of the time users expect the “Repeated.forever” behavior to
> >>>>> be the default.
> >>>>>
> >>>>> Given this, I would say that we should make this the default and
> >>>>> add something like a “Repeat.Once” option which will just let the trigger
> >>>>> fire once, e.g. the first time the counter reaches 5 in your example,
> >>>>> and then stop.
> >>>>>
> >>>>> Otherwise, the trigger specification may become too verbose,
> >>>>> as the user will have to write the “Repeat.forever” for all child triggers.
> >>>>>
> >>>>> What do you think?
> >>>>>
> >>>>> Kostas
> >>>>>
> >>>>>> On Aug 16, 2016, at 4:38 PM, Aljoscha Krettek <aljoscha@apache.org> wrote:
> >>>>>>
> >>>>>> Ah, I just read the document again and noticed that it might be good to
> >>>>>> differentiate between repeatable triggers and non-repeating triggers. I'm
> >>>>>> proposing to make most triggers non-repeating with the addition of a
> >>>>>> trigger that makes other triggers repeatable.
> >>>>>>
> >>>>>> Example Non-Repeating:
> >>>>>> EventTimeTrigger.pastEndOfWindow()
> >>>>>> .withEarlyFiring(CountTrigger.of(5))
> >>>>>>
> >>>>>> this gives me an early firing once I got 5 elements and then an on-time
> >>>>>> firing once the watermark passes the end of the window.
> >>>>>>
> >>>>>> Example with Repeating:
> >>>>>> EventTimeTrigger.pastEndOfWindow()
> >>>>>> .withEarlyFiring(Repeated.forever(CountTrigger.of(5)))
> >>>>>>
> >>>>>> this gives me early firings whenever I see 5 new elements plus the
> >>>>>> watermark firing.
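
The difference between the two examples can be sketched in plain Java. This is
a toy model under stated assumptions, not Flink's Trigger API: SimpleTrigger,
CountOnce and RepeatForever are hypothetical names, and the watermark firing is
left out so that only the early count firings are visible.

```java
import java.util.ArrayList;
import java.util.List;

final class RepeatSketch {

    /** Hypothetical minimal trigger: reports whether it fires for an element. */
    public interface SimpleTrigger {
        boolean onElement();
        boolean isFinished();
        void reset();
    }

    /** Non-repeating count trigger: fires once at the limit, then is finished. */
    public static final class CountOnce implements SimpleTrigger {
        private final int limit;
        private int count;
        private boolean finished;

        CountOnce(int limit) {
            this.limit = limit;
        }

        public boolean onElement() {
            if (finished) {
                return false;
            }
            count++;
            if (count >= limit) {
                finished = true;
                return true;
            }
            return false;
        }

        public boolean isFinished() {
            return finished;
        }

        public void reset() {
            count = 0;
            finished = false;
        }
    }

    /** Wrapper that resets the inner trigger whenever it finishes. */
    public static final class RepeatForever implements SimpleTrigger {
        private final SimpleTrigger inner;

        RepeatForever(SimpleTrigger inner) {
            this.inner = inner;
        }

        public boolean onElement() {
            boolean fire = inner.onElement();
            if (inner.isFinished()) {
                inner.reset();
            }
            return fire;
        }

        public boolean isFinished() {
            return false; // never finishes, so no finished bit is needed
        }

        public void reset() {
            inner.reset();
        }
    }

    /** Feeds numElements elements and returns the 1-based positions that fired. */
    public static List<Integer> firingPositions(SimpleTrigger trigger, int numElements) {
        List<Integer> fired = new ArrayList<>();
        for (int i = 1; i <= numElements; i++) {
            if (trigger.onElement()) {
                fired.add(i);
            }
        }
        return fired;
    }

    public static void main(String[] args) {
        System.out.println(firingPositions(new CountOnce(5), 12));                    // [5]
        System.out.println(firingPositions(new RepeatForever(new CountOnce(5)), 12)); // [5, 10]
    }
}
```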
> >>>>>>
> >>>>>> What do you think?
> >>>>>>
> >>>>>> On Tue, 16 Aug 2016 at 15:31 Kostas Kloudas <k.kloudas@data-artisans.com>
> >>>>>> wrote:
> >>>>>>
> >>>>>>> Thanks Till!
> >>>>>>>
> >>>>>>> Kostas
> >>>>>>>
> >>>>>>>> On Aug 16, 2016, at 3:30 PM, Till Rohrmann <trohrmann@apache.org> wrote:
> >>>>>>>>
> >>>>>>>> Cool design doc Klou. It's well described with a lot of details. I like
> >>>>>>>> it a lot :-) +1 for implementing the trigger DSL.
> >>>>>>>>
> >>>>>>>> Cheers,
> >>>>>>>> Till
> >>>>>>>>
> >>>>>>>> On Tue, Aug 16, 2016 at 3:18 PM, Kostas Kloudas <k.kloudas@data-artisans.com>
> >>>>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Thanks for the feedback Ufuk!
> >>>>>>>>> I will do that.
> >>>>>>>>>
> >>>>>>>>>> On Aug 16, 2016, at 1:41 PM, Ufuk Celebi <uce@apache.org> wrote:
> >>>>>>>>>>
> >>>>>>>>>> Hey Kostas! Thanks for sharing the documents. I think it makes sense
> >>>>>>>>>> to merge the two documents by moving the Google doc contents to the
> >>>>>>>>>> Wiki. I think they form one unit.
> >>>>>>>>>>
> >>>>>>>>>> On Tue, Aug 16, 2016 at 12:34 PM, Kostas Kloudas
> >>>>>>>>>> <k.kloudas@data-artisans.com> wrote:
> >>>>>>>>>>> Hi all!
> >>>>>>>>>>>
> >>>>>>>>>>> I've created a FLIP for the trigger DSL. These are the triggers
> >>>>>>>>>>> that we want Apache Flink to support out-of-the-box. This proposal
> >>>>>>>>>>> builds on various discussions on the mailing list and aims at
> >>>>>>>>>>> serving as a base for further ones.
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-9%3A+Trigger+DSL
> >>>>>>>>>>>
> >>>>>>>>>>> FLIP-9 provides a description of the triggers Flink already offers,
> >>>>>>>>>>> the new ones that we think should be added, what the APIs could look like,
> >>>>>>>>>>> some discussion on the implementation implications and some ideas
> >>>>>>>>>>> on how to implement them.
> >>>>>>>>>>>
> >>>>>>>>>>> There is also a shared document giving a bit more insight on the
> >>>>>>>>>>> implementation implications. Feel free to read but please keep the
> >>>>>>>>>>> discussion in the mailing list.
> >>>>>>>>>>>
> >>>>>>>>>>> https://docs.google.com/a/data-artisans.com/document/d/1vESGQ913oR-DnE1jmFiihvLBU6_UDo-1DRgoHtSgu30/edit?usp=sharing
> >>>>>>>>>>>
> >>>>>>>>>>> I would like to start working on the implementation next week.
> >>>>>>>>>>>
> >>>>>>>>>>> Let the discussion begin!
> >>>>>>>>>>>
> >>>>>>>>>>> Kostas
