kafka-jira mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6560) Use single-point queries than range queries for windowed aggregation operators
Date Thu, 15 Feb 2018 23:07:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16366392#comment-16366392 ]

ASF GitHub Bot commented on KAFKA-6560:

guozhangwang opened a new pull request #4578: KAFKA-6560: Replace range query with newly added
single point query in Windowed Aggregation [WIP]
URL: https://github.com/apache/kafka/pull/4578
   *More detailed description of your change,
   if necessary. The PR title and PR message become
   the squashed commit message, so use a separate
   comment to ping reviewers.*
   *Summary of testing strategy (including rationale)
   for the feature or bug fix. Unit and/or integration
   tests are expected for any behaviour change and
   system tests should be considered for larger changes.*
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)

This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:

> Use single-point queries than range queries for windowed aggregation operators
> ------------------------------------------------------------------------------
>                 Key: KAFKA-6560
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6560
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>            Priority: Critical
>              Labels: needs-kip
> Today, windowed aggregations in the Streams DSL use the fetch(key, from, to) API to
> retrieve all the windows that a single record needs to update. This is very inefficient:
> a significant amount of CPU time is spent iterating over window stores. However, since the
> operator implementation itself has full knowledge of the window specs, it can translate
> this operation into multiple single-point queries with the exact window start timestamps,
> which would substantially reduce the overhead.
> The proposed approach is to add a single-point fetch API to the WindowedStore and use it
> in the KStreamWindowedAggregate / KStreamWindowedReduce operators.
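
The idea in the description above can be sketched in plain Java. This is a minimal illustration, not the Kafka Streams implementation: `ToyWindowStore`, `fetchRange`, `fetchSingle`, `windowStartsFor` and the two update methods are hypothetical names invented for this example, and a `TreeMap` stands in for the real window store. It assumes non-negative timestamps and hopping windows whose advance is no larger than their size. The operator derives the exact window start timestamps for a record from the window spec, then either reads them back through one range query or through one point lookup per window.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: ToyWindowStore and its methods are hypothetical names,
// not the Kafka Streams store API.
final class WindowQuerySketch {

    static final class ToyWindowStore {
        // key -> (window start timestamp -> aggregate value)
        private final Map<String, TreeMap<Long, Long>> data = new HashMap<>();

        void put(String key, long windowStart, long value) {
            data.computeIfAbsent(key, k -> new TreeMap<>()).put(windowStart, value);
        }

        // Range query: returns every stored window whose start lies in [from, to].
        Map<Long, Long> fetchRange(String key, long from, long to) {
            TreeMap<Long, Long> windows = data.get(key);
            if (windows == null) {
                return new TreeMap<>();
            }
            return new TreeMap<>(windows.subMap(from, true, to, true));
        }

        // Single-point query: looks up exactly one window by its start timestamp.
        Long fetchSingle(String key, long windowStart) {
            TreeMap<Long, Long> windows = data.get(key);
            return windows == null ? null : windows.get(windowStart);
        }

        Map<Long, Long> snapshot(String key) {
            return fetchRange(key, Long.MIN_VALUE, Long.MAX_VALUE);
        }
    }

    // Start timestamps of all windows [start, start + sizeMs) that contain the
    // given timestamp, for windows that begin at multiples of advanceMs.
    static List<Long> windowStartsFor(long timestamp, long sizeMs, long advanceMs) {
        List<Long> starts = new ArrayList<>();
        long start = (Math.max(0L, timestamp - sizeMs + advanceMs) / advanceMs) * advanceMs;
        for (; start <= timestamp; start += advanceMs) {
            starts.add(start);
        }
        return starts;
    }

    // Count update driven by one range query spanning the first and last window start.
    static void updateWithRange(ToyWindowStore store, String key, long ts, long sizeMs, long advanceMs) {
        List<Long> starts = windowStartsFor(ts, sizeMs, advanceMs);
        Map<Long, Long> existing = store.fetchRange(key, starts.get(0), starts.get(starts.size() - 1));
        for (long start : starts) {
            store.put(key, start, existing.getOrDefault(start, 0L) + 1);
        }
    }

    // Count update driven by one point lookup per known window start.
    static void updateWithPoints(ToyWindowStore store, String key, long ts, long sizeMs, long advanceMs) {
        for (long start : windowStartsFor(ts, sizeMs, advanceMs)) {
            Long current = store.fetchSingle(key, start);
            store.put(key, start, (current == null ? 0L : current) + 1);
        }
    }

    public static void main(String[] args) {
        ToyWindowStore byRange = new ToyWindowStore();
        ToyWindowStore byPoints = new ToyWindowStore();
        for (long ts : new long[] {3L, 12L, 25L}) {
            updateWithRange(byRange, "k", ts, 10L, 5L);
            updateWithPoints(byPoints, "k", ts, 10L, 5L);
        }
        System.out.println("range-based:  " + byRange.snapshot("k"));
        System.out.println("point-based:  " + byPoints.snapshot("k"));
    }
}
```

Both update paths leave the toy store in the same state. The difference the issue is concerned with is how the windows are read: the point-based path asks only for window starts the operator already knows, so no iterator over the store is needed.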

This message was sent by Atlassian JIRA
