kafka-jira mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6560) Use single-point queries than range queries for windowed aggregation operators
Date Tue, 03 Apr 2018 01:18:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16423341#comment-16423341 ]

ASF GitHub Bot commented on KAFKA-6560:

guozhangwang opened a new pull request #4814: KAFKA-6560: Use single query for getters as
URL: https://github.com/apache/kafka/pull/4814
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)

> Use single-point queries than range queries for windowed aggregation operators
> ------------------------------------------------------------------------------
>                 Key: KAFKA-6560
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6560
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>            Priority: Critical
>              Labels: needs-kip
>             Fix For: 1.2.0
> Today, windowed aggregations in the Streams DSL use the fetch(key, from, to) API to
> retrieve all the windows that a single record needs to update. This is very inefficient:
> a significant amount of CPU time is spent iterating over window stores. However, because
> the operator implementation itself has full knowledge of the window specs, it can translate
> this operation into multiple single-point queries, each using the exact window start
> timestamp, which would substantially reduce the overhead.
> The proposed approach is to add a single-point fetch API to the WindowedStore and use it
> in the KStreamWindowedAggregate / KStreamWindowedReduce operators.
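
The idea in the description above can be illustrated with a minimal, self-contained sketch. It does not use the Kafka Streams classes: ToyWindowStore, windowStartsFor and countRecord are hypothetical names invented for this example, the two-argument fetch(key, windowStart) is only an assumed shape for the proposed single-point API, and the hopping-window alignment (starts are multiples of the advance, clamped at 0) is likewise an assumption. The operator computes the exact window start timestamps for a record and issues one point lookup per window, rather than iterating over a range.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WindowFetchSketch {

    /** Toy in-memory store: key -> (window start -> count). Hypothetical, not the Kafka Streams store. */
    static final class ToyWindowStore {
        private final Map<String, TreeMap<Long, Long>> data = new HashMap<>();

        void put(String key, long windowStart, long value) {
            data.computeIfAbsent(key, k -> new TreeMap<>()).put(windowStart, value);
        }

        /** Range query: all windows of the key whose start lies in [from, to]. */
        Map<Long, Long> fetch(String key, long from, long to) {
            TreeMap<Long, Long> windows = data.get(key);
            if (windows == null) {
                return new TreeMap<>();
            }
            return new TreeMap<>(windows.subMap(from, true, to, true));
        }

        /** Single-point query (assumed signature): the window starting exactly at windowStart, or null. */
        Long fetch(String key, long windowStart) {
            TreeMap<Long, Long> windows = data.get(key);
            return windows == null ? null : windows.get(windowStart);
        }
    }

    /** Start timestamps of the hopping windows [start, start + sizeMs) that contain the timestamp. */
    static List<Long> windowStartsFor(long timestamp, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long first = Math.max(0, timestamp - sizeMs + advanceMs) / advanceMs * advanceMs;
        for (long start = first; start <= timestamp; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    /** Counts one record by doing one point lookup per affected window instead of a range scan. */
    static void countRecord(ToyWindowStore store, String key, long timestamp, long sizeMs, long advanceMs) {
        for (long start : windowStartsFor(timestamp, sizeMs, advanceMs)) {
            Long old = store.fetch(key, start);
            store.put(key, start, old == null ? 1L : old + 1L);
        }
    }

    public static void main(String[] args) {
        ToyWindowStore store = new ToyWindowStore();
        long sizeMs = 20;
        long advanceMs = 10;
        for (long ts : new long[] {5, 25, 25}) {
            countRecord(store, "a", ts, sizeMs, advanceMs);
        }
        System.out.println("range fetch [0, 20]: " + store.fetch("a", 0L, 20L));
        System.out.println("point fetch at 20:   " + store.fetch("a", 20L));
    }
}
```

In this sketch the number of lookups per record is bounded by the number of windows that can contain it (roughly size / advance), and no iterator over the store is opened. Whether the real stores benefit to the same degree depends on their implementation, which this toy does not model.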

This message was sent by Atlassian JIRA
