kafka-jira mailing list archives

From "Matthias J. Sax (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (KAFKA-4609) KTable/KTable join followed by groupBy and aggregate/count can result in duplicated results
Date Fri, 26 Jan 2018 21:28:00 GMT

     [ https://issues.apache.org/jira/browse/KAFKA-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthias J. Sax updated KAFKA-4609:
-----------------------------------
    Summary: KTable/KTable join followed by groupBy and aggregate/count can result in duplicated results
             (was: KTable/KTable join followed by groupBy and aggregate/count can result in incorrect results)

> KTable/KTable join followed by groupBy and aggregate/count can result in duplicated results
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4609
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4609
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1, 0.10.2.0
>            Reporter: Damian Guy
>            Assignee: Damian Guy
>            Priority: Major
>              Labels: architecture
>
> When caching is enabled, KTable/KTable joins can result in duplicate values being emitted.
> This occurs when the same key has been updated in both tables: each table is flushed
> independently, and each flush triggers the join, so two results are emitted for the same key.

> If we subsequently perform a groupBy followed by an aggregate operation, these duplicates
> are processed as well, resulting in incorrect aggregated values. For example, a count will
> be double the value it should be.
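
To make the failure mode above concrete, here is a minimal sketch in plain Java. It is a toy model and does not use the Kafka Streams API: `CachedTable`, `flush`, `joinResults`, and `naiveCount` are hypothetical names invented for illustration, and the assumption that a flush simply looks up the other table's latest value and that the downstream count adds one per received result is a simplification of the description, not a statement of how Kafka Streams is implemented.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DuplicateJoinSketch {

    /** Hypothetical stand-in for a cached table: latest values plus keys awaiting flush. */
    static final class CachedTable {
        final Map<String, String> latest = new HashMap<>();
        final List<String> dirtyKeys = new ArrayList<>();

        /** Buffer an update; nothing is forwarded downstream until flush() is called. */
        void put(String key, String value) {
            latest.put(key, value);
            if (!dirtyKeys.contains(key)) {
                dirtyKeys.add(key);
            }
        }

        /** Flush: for each dirty key, join against the other table and emit one result. */
        void flush(CachedTable other, boolean thisIsLeft, List<String[]> out) {
            for (String key : dirtyKeys) {
                String otherValue = other.latest.get(key);
                if (otherValue != null) {
                    String mine = latest.get(key);
                    String joined = thisIsLeft ? mine + "+" + otherValue : otherValue + "+" + mine;
                    out.add(new String[] {key, joined});
                }
            }
            dirtyKeys.clear();
        }
    }

    /** The same key is updated in both tables, then each table is flushed independently. */
    static List<String[]> joinResults() {
        CachedTable left = new CachedTable();
        CachedTable right = new CachedTable();
        left.put("k1", "a");
        right.put("k1", "b");

        List<String[]> out = new ArrayList<>();
        left.flush(right, true, out);   // emits k1 -> a+b
        right.flush(left, false, out);  // emits k1 -> a+b again
        return out;
    }

    /** Assumed downstream step: group by the joined value and add one per received result. */
    static Map<String, Long> naiveCount(List<String[]> results) {
        Map<String, Long> counts = new HashMap<>();
        for (String[] result : results) {
            counts.merge(result[1], 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String[]> results = joinResults();
        for (String[] result : results) {
            System.out.println(result[0] + " -> " + result[1]);
        }
        System.out.println(naiveCount(results)); // prints {a+b=2}
    }
}
```

In this model the single logical join result for `k1` is emitted once per flush, so the count for the group `a+b` comes out as 2 where 1 would be expected.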



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
