kafka-jira mailing list archives

From "Jan Filipiak (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3705) Support non-key joining in KTable
Date Fri, 10 Nov 2017 18:37:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247892#comment-16247892 ]

Jan Filipiak commented on KAFKA-3705:

[~thuey100] I am glad to see your interest in this. The pull request has discussions regarding
the client API. We are currently in the process of getting this settled in the KIP: https://cwiki.apache.org/confluence/display/KAFKA/KIP-213+Support+non-key+joining+in+KTable.
Sadly, the hardest part of the PR is still outstanding: merging the cache and the persistent
stores in a prefix scan. We only run our code with it was easier back then. Or we are just going
to flush the cache every time. Feel welcome to get involved!
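
To make the outstanding piece concrete, here is a minimal sketch of a prefix scan that merges a cache layer over a persistent store. It is not the code from the PR and does not use any Kafka Streams class: the class name, the String keys, the TreeMap stand-ins for both layers, and the use of an empty Optional as a pending delete are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.Optional;
import java.util.TreeMap;

// Hypothetical illustration only: both layers are modelled as sorted in-memory maps.
class MergedPrefixScan {

    // Returns every entry whose key starts with `prefix`.
    // Cache entries win over store entries; an empty Optional in the cache
    // is treated as a delete that has not been flushed to the store yet.
    static NavigableMap<String, String> prefixScan(
            NavigableMap<String, Optional<String>> cache,
            NavigableMap<String, String> store,
            String prefix) {
        NavigableMap<String, String> result = new TreeMap<>();

        // 1. Take the persisted view of the prefix range. Keys sharing a prefix
        //    are contiguous in sorted order, so we can stop at the first miss.
        for (Map.Entry<String, String> e : store.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            result.put(e.getKey(), e.getValue());
        }

        // 2. Overlay the cached (not yet flushed) writes and deletes.
        for (Map.Entry<String, Optional<String>> e : cache.tailMap(prefix, true).entrySet()) {
            if (!e.getKey().startsWith(prefix)) {
                break;
            }
            if (e.getValue().isPresent()) {
                result.put(e.getKey(), e.getValue().get());
            } else {
                result.remove(e.getKey());
            }
        }
        return result;
    }
}
```

The alternative mentioned above, flushing the cache every time, would amount to writing all cached entries into the store before the scan, so that only step 1 is needed.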

> Support non-key joining in KTable
> ---------------------------------
>                 Key: KAFKA-3705
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3705
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>            Reporter: Guozhang Wang
>              Labels: api
> Today in the Kafka Streams DSL, KTable joins are only based on keys. If users want to join
> a KTable A by key {{a}} with another KTable B by key {{b}} but with a "foreign key" {{a}},
> and assuming they are read from two topics which are partitioned on {{a}} and {{b}} respectively,
> they need to use the following pattern:
> {code}
> tableB' = tableB.groupBy(/* select on field "a" */).agg(...); // now tableB' is partitioned on "a"
> tableA.join(tableB', joiner);
> {code}
> Even if these two tables are read from two topics which are already partitioned on {{a}},
> users still need to do the pre-aggregation in order to put the two joined tables
> on the same key. This is a drawback for programmability, and we should fix it.
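
For readers unfamiliar with the workaround described in the issue, the following is a rough plain-Java sketch of its two steps: regroup table B by its foreign key, then do a key-to-key join with table A. It deliberately uses in-memory maps instead of the Kafka Streams API, and every name in it (ForeignKeyJoinSketch, BRow, regroupByForeignKey, join) is hypothetical. Collecting the matching B rows into a list is only one possible choice for the aggregation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration only: tables are modelled as in-memory maps.
class ForeignKeyJoinSketch {

    // A row of table B: its foreign key into table A plus some payload.
    static final class BRow {
        final String a;
        final String payload;

        BRow(String a, String payload) {
            this.a = a;
            this.payload = payload;
        }
    }

    // Step 1 (the groupBy + aggregate step): re-key table B by its foreign key "a",
    // collecting the payloads of all B rows that point at the same A key.
    static Map<String, List<String>> regroupByForeignKey(Map<String, BRow> tableB) {
        Map<String, List<String>> regrouped = new TreeMap<>();
        for (BRow row : tableB.values()) {
            regrouped.computeIfAbsent(row.a, k -> new ArrayList<>()).add(row.payload);
        }
        return regrouped;
    }

    // Step 2 (the join step): both sides are now keyed on "a", so an ordinary
    // key-based inner join works.
    static Map<String, String> join(Map<String, String> tableA,
                                    Map<String, List<String>> regroupedB) {
        Map<String, String> joined = new TreeMap<>();
        for (Map.Entry<String, String> e : tableA.entrySet()) {
            List<String> matches = regroupedB.get(e.getKey());
            if (matches != null) {
                joined.put(e.getKey(), e.getValue() + " -> " + matches);
            }
        }
        return joined;
    }
}
```

The point of the issue is that step 1 is required even when B's topic is already partitioned on {{a}}, because the join itself only matches on the table key.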

This message was sent by Atlassian JIRA