flink-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Shaoxuan Wang <wshaox...@gmail.com>
Subject Re: [Discuss] Retraction for Flink Streaming
Date Thu, 16 Mar 2017 08:37:41 GMT
The doc gets lots of attention. I appreciate everyone for the valuable

Hi Fabian,
Thanks for your comments.

I agree with you that we should ensure that all operators are running in
the same mode (either turning on retraction globally or not). From my point
of view, I did not see any problem to support the retraction for all the
operators we have so far in Flink master (assuming we keep the window
aggregate in a "without early firing mode" as it is now, and not to handle
the late arrival after window is materialized as we agreed via the
commenting discussions in the google doc).

Can you please create a feature branch. We have a complete design on top of
Flink master. We would like to submit the design to the feature branch
asap, then everyone will get a deeper inside of the design with more


On Thu, Mar 16, 2017 at 12:54 AM, Fabian Hueske <fhueske@gmail.com> wrote:

> Hi Shaoxuan,
> thanks a lot for this proposal!
> Support for retractions is a super nice and important feature and will
> enable many more use cases for the Table API / SQL.
> I'm really excited to see this happening. I made a first pass over your
> proposal and added a few comments. I'll do another pass soon.
> Since it is only 6 weeks left until the feature freeze for Flink 1.3, I
> propose to develop the retraction support in a feature branch.
> IMO, we must make sure that either all operators support retraction or
> none. Otherwise, the behavior of the Table API / SQL will not be
> predictable.
> I also think that we should define which operators we want to support in
> Flink 1.3 in order to coordinate the development of retraction support.
> What do others think?
> Cheers, Fabian
> 2017-03-14 16:53 GMT+01:00 Shaoxuan Wang <wshaoxuan@gmail.com>:
> > Hello everyone,
> >
> > Flink is widely used in Alibaba Group, especially in our Search and
> > Recommendation Infra. Retraction is one of the most important features
> that
> > we needed. We have spent lots of efforts to try to solve this problem,
> and
> > gladly at the end we develop an approach which can address most of
> > retraction problems in our production scenarios. Same as usual, we
> (Alibaba
> > search-data infra team) would like to share our retraction solution to
> the
> > entire Flink community. If you like this proposal, I would also like to
> > make it as one of the FLIPs. I am attaching the design doc of "Retraction
> > for Flink Streaming" as well as the introduction section below. I have
> also
> > created a master jira (FLINK-6047) to track the discussion and design of
> > the Flink retraction. All suggestions and comments are welcome.
> >
> >
> > *Design doc:*
> > https://docs.google.com/document/d/18XlGPcfsGbnPSApRipJDLPg5IFNGT
> > Qjnz7emkVpZlkw
> >
> > *Introduction:*
> >
> > "Retraction" is an important building block for data streaming to refine
> > the early fired results in streaming. “Early firing” are very common and
> > widely used in many streaming scenarios, for instance “window-less” or
> > unbounded aggregate and stream-stream inner join, windowed (with early
> > firing) aggregate and stream-stream inner join. As described in Streaming
> > 102, there are mainly two cases that require retractions: 1) update on
> the
> > keyed table (the key is either a primaryKey (PK) on source table, or a
> > groupKey/partitionKey in an aggregate); 2) When dynamic windows (e.g.,
> > session window) are in use, the new value may be replacing more than one
> > previous window due to window merging.
> >
> > To the best of our knowledge, the retraction for the early fired
> streaming
> > results has never been practically solved before. In this proposal, we
> > develop a retraction solution and explain how it works for the problem of
> > “update on the keyed table”. The same solution can be easily extended for
> > the dynamic windows merging, as the key component of retraction - how to
> > refine an early fired results - is the same across different problems.
> >
> > *Master Jira: *
> > https://issues.apache.org/jira/browse/FLINK-6047
> >
> >
> > Regards,
> > Shaoxuan
> >

  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message