kafka-dev mailing list archives

From "Guozhang Wang (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-4281) Should be able to forward aggregation values immediately
Date Fri, 11 Nov 2016 20:50:58 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Guozhang Wang updated KAFKA-4281:
    Status: Patch Available  (was: Open)

> Should be able to forward aggregation values immediately
> --------------------------------------------------------
>                 Key: KAFKA-4281
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4281
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions:
>            Reporter: Greg Fodor
>            Assignee: Greg Fodor
> KIP-63 introduced changes to the behavior of aggregations such that the result of an aggregation
> does not appear to subsequent processors until a state store flush occurs. This is problematic
> for latency-sensitive aggregations, since flushes generally occur every commit.interval.ms, which
> is usually a few seconds. With several aggregations chained together, this can add several seconds
> of latency through a topology for steps that depend on aggregations.
> Two potential solutions:
> - Allow finer control over the state store flushing intervals
> - Allow users to change the behavior so that certain aggregations will immediately forward
>   records to the next step (as was the case pre-KIP-63)
> A PR is attached that takes the second approach. Unfortunately, adding this required touching
> a large number of files, and it effectively doubles the number of method signatures
> around grouping on KTable and KStream. I tried an alternative approach that let the user opt in
> to immediate forwarding via an additional builder method on KGroupedStream/Table, but this
> didn't work as expected: for the latency to go away, the KTableImpl itself
> must also mark its source as forwarding immediately (otherwise we will still see latency, because
> the materialization of the KTableSource still relies on state store flushes to propagate).
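
For comparison with the first option above (finer control over flushing), here is a rough sketch of the application-wide settings that influence how quickly aggregation results are forwarded. It is not the approach taken in the attached PR, and it applies to the whole application rather than to individual aggregations. The class name, application id, broker address, and the 100 ms value are hypothetical; the two config keys are written as plain strings so that the snippet only needs the JDK.

```java
import java.util.Properties;

public class LowLatencyStreamsConfig {

    // Builds a Properties object of the kind passed to a Kafka Streams application.
    // Assumption: the caller accepts more downstream traffic in exchange for lower latency.
    public static Properties build(long commitIntervalMs, long cacheBytes) {
        Properties props = new Properties();
        // Hypothetical identifiers, for illustration only.
        props.put("application.id", "low-latency-aggregation-demo");
        props.put("bootstrap.servers", "localhost:9092");
        // How often state stores are flushed and cached results are forwarded.
        props.put("commit.interval.ms", Long.toString(commitIntervalMs));
        // Size of the record cache added by KIP-63; 0 disables caching, so every
        // aggregation update is forwarded downstream without waiting for a flush.
        props.put("cache.max.bytes.buffering", Long.toString(cacheBytes));
        return props;
    }

    public static void main(String[] args) {
        Properties props = build(100L, 0L);
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

Disabling the cache for the whole application gives up the deduplication that KIP-63 introduced for every aggregation in the topology, which is presumably why a per-aggregation switch is being proposed here instead.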

This message was sent by Atlassian JIRA
