samza-dev mailing list archives

From xinyuiscool <...@git.apache.org>
Subject [GitHub] samza pull request #456: SAMZA-1627: Watermark broadcast enhancements
Date Mon, 26 Mar 2018 16:59:52 GMT
GitHub user xinyuiscool opened a pull request:

    https://github.com/apache/samza/pull/456

    SAMZA-1627: Watermark broadcast enhancements

    Currently, each upstream task must broadcast its watermark to every partition of the intermediate
streams so that the consumers can aggregate watermarks. A better approach is to have
a single downstream consumer do the aggregation and then broadcast the result to all the partitions.
This is safe because we can prove that the broadcast watermark message comes after all the upstream tasks
have finished producing the events whose event time precedes this watermark. This reduces
the total message count from O(n^2) to O(n).
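
    To make the message-count claim concrete, here is a minimal sketch. It is not code from
this pull request: the function names are hypothetical, and it assumes n upstream tasks, p
intermediate-stream partitions, and that the aggregated watermark is the minimum of the latest
watermarks reported by the upstream tasks.

```python
def messages_all_to_all(num_upstream_tasks, num_partitions):
    """Old scheme: every upstream task writes its watermark to every partition."""
    return num_upstream_tasks * num_partitions


def messages_via_aggregator(num_upstream_tasks, num_partitions):
    """New scheme: each upstream task sends one message to a single aggregating
    consumer, which then broadcasts one message to every partition."""
    return num_upstream_tasks + num_partitions


def aggregate_watermark(latest_by_task, num_upstream_tasks):
    """Return the aggregated watermark, or None until every upstream task has
    reported. Taking the minimum is an assumption made for this sketch."""
    if len(latest_by_task) < num_upstream_tasks:
        return None
    return min(latest_by_task.values())


if __name__ == "__main__":
    n = p = 8
    print(messages_all_to_all(n, p))      # n * p messages per watermark
    print(messages_via_aggregator(n, p))  # n + p messages per watermark
    print(aggregate_watermark({"task-0": 5, "task-1": 3}, 2))
```

    With n equal to p, the first count grows as n^2 and the second as 2n, which matches the
O(n^2) to O(n) reduction described above.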

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/xinyuiscool/samza SAMZA-1627

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/samza/pull/456.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #456
    
----
commit 9c43008c2cc3c79d0659dec0e608d7e6f5a8f63a
Author: xinyuiscool <xiliu@...>
Date:   2018-03-22T21:32:37Z

    SAMZA-1627: Watermark broadcast enhancements

----


---
