beam-github mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [beam] boyuanzz commented on pull request #14268: [BEAM-12011] Eliminate WindowFn.getOutputTime method
Date Mon, 05 Apr 2021 17:00:40 GMT

boyuanzz commented on pull request #14268:
URL: https://github.com/apache/beam/pull/14268#issuecomment-813504955


   > > A high-level question regarding:
   > > > it was added in response to a real user problem: sliding windows + EARLIEST
   > > > timestamp combiner delays downstream aggregations a lot
   > > > but when that problem occurred, timestamp combiner EARLIEST was the default;
   > > > with the change to how lateness is defined we changed the default to END_OF_WINDOW
   > > 
   > > 
   > > With this PR, what will happen to the real user problem mentioned above?
   > 
   > Short answer: the real user problem will be back but not as bad.
   > 
   > Longer answer: EARLIEST used to be the default when `getOutputTime` was introduced
   > to solve the problem. Now the default is END_OF_WINDOW so the problem is not as bad. To use
   > sliding windows (or other overlapping windows) with EARLIEST, the user has some choices:
   > 
   > * Just be OK with the delay of downstream GBK (already true for Python)
   > * Set up a non-default trigger (this will free watermark holds and allow progress
   >   downstream)
   > * Manually move element timestamps forward with a `ParDo` before GBK, achieving identical
   >   behavior
   > 
   > I think further discussion should continue on the dev@ thread instead of the PR though.
   
   Thanks for the explanation. If this is likely to be a common issue, we should document
   it in the release notes or somewhere else.
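
For anyone reading this later, the third option quoted above (moving element timestamps forward before GBK) can be sketched roughly as follows. This is plain Python, not Beam API. It assumes that the behavior being removed clamps each element's timestamp so that it does not precede the last `period` of the sliding window it lands in, and it models EARLIEST as the minimum element timestamp per window. The names `sliding_windows`, `shift_forward` and `earliest_output_times` are hypothetical and exist only for this illustration.

```python
from typing import Dict, Iterable, List, Tuple

# A window is a half-open interval [start, end) in arbitrary integer time units.
Window = Tuple[int, int]


def sliding_windows(ts: int, size: int, period: int) -> List[Window]:
    """Return every sliding window of length `size`, starting every `period`, that contains ts."""
    start = ts - (ts % period)  # latest window start that is <= ts
    windows = []
    while start > ts - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


def shift_forward(ts: int, window: Window, period: int) -> int:
    """Assumed adjustment: never emit a timestamp earlier than the last `period` of the window."""
    _, end = window
    return max(ts, end - period)


def earliest_output_times(
    timestamps: Iterable[int], size: int, period: int, shift: bool
) -> Dict[Window, int]:
    """Model the EARLIEST combiner: per window, the minimum (optionally shifted) timestamp."""
    out: Dict[Window, int] = {}
    for ts in timestamps:
        for window in sliding_windows(ts, size, period):
            t = shift_forward(ts, window, period) if shift else ts
            out[window] = min(out.get(window, t), t)
    return out


if __name__ == "__main__":
    # One element at t=12, windows of length 30 sliding every 10.
    print(earliest_output_times([12], size=30, period=10, shift=False))
    print(earliest_output_times([12], size=30, period=10, shift=True))
```

Without the shift, all three windows containing the element keep an output timestamp of 12, so in this model the hold stays at 12 until the last of them closes at 40. With the shift, the later windows carry output timestamps of 20 and 30, so the hold can advance as each earlier window closes.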


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


