calcite-dev mailing list archives

From Milinda Pathirage <mpath...@umail.iu.edu>
Subject Re: Triggering emits for streaming window aggregates
Date Fri, 03 Jul 2015 15:54:24 GMT
Hi Julian,

After debugging, I found two possible causes for the collation trait being
missing from LogicalAggregate.

- RelMdCollation#project only handles projects of the type *RexInputRef*
and doesn't handle *RexCall*. Because of this we lose the ordering
information carried by function expressions.
- When creating LogicalAggregate (lines 93 to 95), we don't take the traits
of the input into account.
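
To illustrate the first point, here is a minimal sketch in Python. It is not
Calcite code: InputRef, Call, MONOTONIC_OPS and project_collation are
hypothetical stand-ins for RexInputRef, RexCall and the collation derivation,
and treating FLOOR_TO_HOUR as monotonic is an assumption of the sketch. It
only shows how ordering is dropped when function expressions are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class InputRef:
    """Reference to input column `index` (hypothetical stand-in for RexInputRef)."""
    index: int


@dataclass(frozen=True)
class Call:
    """Single-operand function call (hypothetical stand-in for RexCall)."""
    op: str
    operand: "Expr"


Expr = Union[InputRef, Call]

# Assumption: these functions are monotonically non-decreasing in their operand.
MONOTONIC_OPS = {"FLOOR_TO_HOUR"}


def sorted_source(expr: Expr) -> Optional[int]:
    """Return the input column whose ascending order `expr` preserves, or None."""
    if isinstance(expr, InputRef):
        return expr.index
    if isinstance(expr, Call) and expr.op in MONOTONIC_OPS:
        return sorted_source(expr.operand)
    return None


def project_collation(input_collation: List[int],
                      projects: List[Expr],
                      handle_calls: bool) -> List[int]:
    """Derive the output sort keys (output column indexes) of a projection.

    `input_collation` lists the input columns the input is sorted on,
    major key first. With handle_calls=False only plain column references
    are considered, which mirrors the behaviour described above.
    """
    out: List[int] = []
    for key in input_collation:
        direct = [i for i, e in enumerate(projects)
                  if isinstance(e, InputRef) and e.index == key]
        if direct:
            out.append(direct[0])
            continue  # an exact copy of the key keeps later keys meaningful
        if handle_calls:
            via_call = [i for i, e in enumerate(projects)
                        if isinstance(e, Call) and sorted_source(e) == key]
            if via_call:
                out.append(via_call[0])
        # A monotonic call can merge distinct key values, so later keys are
        # no longer sorted within it; a missing key ends the prefix as well.
        break
    return out


# Input sorted on column 0 (say, rowtime); the project computes a function of it.
projects = [Call("FLOOR_TO_HOUR", InputRef(0)), InputRef(1)]
print(project_collation([0], projects, handle_calls=False))  # ordering is lost
print(project_collation([0], projects, handle_calls=True))   # ordering is kept
```

With handle_calls=False the derived collation is empty, so a downstream
aggregate has no ordering to rely on; with handle_calls=True the first output
column is still reported as sorted.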

I created two JIRA tickets to track the progress.

- https://issues.apache.org/jira/browse/CALCITE-784
- https://issues.apache.org/jira/browse/CALCITE-783

Thanks
Milinda

On Mon, Jun 29, 2015 at 4:37 PM, Ted Dunning <ted.dunning@gmail.com> wrote:

> Another complicating issue is the practical requirement that often comes up
> that aggregates for late arriving data be kept separate from the aggregates
> for the same window that arrived on time.  This allows late arriving
> aggregates to be reported separately.  This is a fundamental change in the
> meaning of windowed aggregates, of course, but it is also a common requirement.
>
>
>
> On Mon, Jun 29, 2015 at 7:25 AM, Milinda Pathirage <mpathira@umail.iu.edu>
> wrote:
>
> > Hi Ted,
> >
> > We have discussed most of the complexities related to window handling in a
> > different thread [1]. My bad that I didn't provide those additional details
> > when I started this thread. We have a window store (implemented on top of
> > Samza's local storage) to keep track of old windows to trigger new results
> > for late arrivals. Document [2] discusses most of the things related to
> > the window store's design.
> >
> > Thanks
> > Milinda
> >
> > [1] https://issues.apache.org/jira/browse/SAMZA-552
> > [2] https://issues.apache.org/jira/secure/attachment/12708934/DESIGN-SAMZA-552-7.pdf
> >
> > On Sun, Jun 28, 2015 at 2:21 AM, Ted Dunning <ted.dunning@gmail.com>
> > wrote:
> >
> > > Here is the biggest recent thread on this.  You might also ask directly
> > > what they think about the algebraic issue as you see it.
> > >
> > >
> > >
> > >
> >
> https://mail-archives.apache.org/mod_mbox/flink-dev/201506.mbox/%3CCANMXwW3bOgaJhG_syH2%3D0x5BcdukyTOF0dU3dM4_3yQK2UHoyw%40mail.gmail.com%3E
> > >
> > > Here are some thoughts that mostly deal with implementation, but also
> > > discuss a few theoretical aspects.  These then link into concepts such as
> > > data types (Flink recognized sortedness in type information, for instance),
> > > the snapshot algorithms (because window triggers are very similar to the
> > > Chandy/Lamport algorithms used for snapshots and state handling), the
> > > optimizer (only a side comment in this regard) and other aspects.
> > >
> > >
> > >
> >
> https://docs.google.com/document/d/1rSoHyhUhm2IE30o5tkR8GEetjFvMRMNxvsCfoPsW6_4/edit#heading=h.faju7vv5ilgm
> > >
> > > On Sun, Jun 28, 2015 at 12:48 AM, Julian Hyde <jhyde@apache.org>
> wrote:
> > >
> > > > Ted,
> > > >
> > > > Do you have a link to a pertinent email thread from the Flink list?
> > > >
> > > > > I can see how shifting from monotonic to k-sorted or punctuation could
> > > > > make a big impact to the runtime of a streaming system like Flink. But I
> > > > > don’t think the impact on the algebra is as big, and that’s what we’re
> > > > > concerned with in Calcite.
> > > >
> > > > Julian
> > > >
> > > >
> > > > > On Jun 26, 2015, at 11:18 PM, Ted Dunning <ted.dunning@gmail.com>
> > > wrote:
> > > > >
> > > > > On Sat, Jun 27, 2015 at 1:13 AM, Julian Hyde <jhyde@apache.org>
> > wrote:
> > > > >
> > > > > >> Algebraic reasoning based on monotonicity can be extended to the other
> > > > > >> models. If we start with the more complex models we'd soon be up to
> > > > > >> our hubcaps in theoretical mud.
> > > > >>
> > > > >
> > > > > > As you like.  Flink has just had to rip up and repair a bunch of stuff
> > > > > > precisely because they started with an assumption of monotonicity and had
> > > > > > to move to a looser model.  The practical impact was pretty substantial and
> > > > > > substantially larger than the comments here would imply.
> > > >
> > > >
> > >
> >
> >
> >
> > --
> > Milinda Pathirage
> >
> > PhD Student | Research Assistant
> > School of Informatics and Computing | Data to Insight Center
> > Indiana University
> >
> > twitter: milindalakmal
> > skype: milinda.pathirage
> > blog: http://milinda.pathirage.org
> >
>



-- 
Milinda Pathirage

PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org
