kafka-jira mailing list archives

From "Clemens Valiente (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5648) make Merger extend Aggregator
Date Thu, 27 Jul 2017 13:10:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5648?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16103187#comment-16103187 ]

Clemens Valiente commented on KAFKA-5648:
-----------------------------------------

[~mjsax]
Thanks for your feedback!
What would this change break? At the moment, any existing {{Merger<K,V>}} class
already has everything needed to implement {{Aggregator<K,V,V>}}, since its method
{{apply(K key, V aggOne, V aggTwo)}} also matches the signature of the Aggregator
interface's {{apply}} method.

It's true that I can just create {{MyMerger implements Merger<K,V>, Aggregator<K,V,V>}}
and use that (I can actually put my logic in the {{apply}} method and don't need a private
one). But with the suggested change the relationship would probably be more explicit for the
user and help them realize that a single class may be enough for both.
As I see it, the change would come at no additional cost or overhead.
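
To illustrate the point, here is a minimal sketch. It does not use the real Kafka Streams classes: the nested {{Aggregator}} and {{Merger}} interfaces are simplified stand-ins that I am assuming mirror the shape of the real ones, and {{ListMerger}} and {{ProposedMerger}} are hypothetical names made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class MergerSketch {

    // Stand-in for the Aggregator interface (assumption: simplified copy, not the Kafka class).
    interface Aggregator<K, V, VA> {
        VA apply(K key, V value, VA aggregate);
    }

    // Stand-in for the Merger interface as it is today (assumption: simplified copy).
    interface Merger<K, V> {
        V apply(K aggKey, V aggOne, V aggTwo);
    }

    // Hypothetical shape of the proposal: a Merger is an Aggregator over two values of one type.
    interface ProposedMerger<K, V> extends Aggregator<K, V, V> {
    }

    // Today's workaround: one class, one apply method, two interfaces declared explicitly.
    static class ListMerger<K, E> implements Merger<K, List<E>>, Aggregator<K, List<E>, List<E>> {
        @Override
        public List<E> apply(K key, List<E> first, List<E> second) {
            // Combine both lists into a new one; the inputs are left untouched.
            List<E> combined = new ArrayList<>(first);
            combined.addAll(second);
            return combined;
        }
    }

    public static void main(String[] args) {
        ListMerger<String, String> both = new ListMerger<>();

        // The same instance can be handed out under either interface type.
        Merger<String, List<String>> asMerger = both;
        Aggregator<String, List<String>, List<String>> asAggregator = both;

        System.out.println(asMerger.apply("session-1", Arrays.asList("a"), Arrays.asList("b")));
        System.out.println(asAggregator.apply("session-1", Arrays.asList("c"), Arrays.asList("d")));

        // With the proposed shape, a lambda written as a merger is already an aggregator.
        ProposedMerger<String, Integer> sum = (key, one, two) -> one + two;
        Aggregator<String, Integer, Integer> sumAsAggregator = sum;
        System.out.println(sumAsAggregator.apply("session-1", 1, 2));
    }
}
```

With the stand-in {{ProposedMerger}}, the double {{implements}} clause is no longer needed, which is the explicitness I am arguing for.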

> make Merger extend Aggregator
> -----------------------------
>
>                 Key: KAFKA-5648
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5648
>             Project: Kafka
>          Issue Type: New Feature
>          Components: streams
>    Affects Versions: 0.11.0.0
>            Reporter: Clemens Valiente
>            Assignee: Clemens Valiente
>            Priority: Minor
>
> Hi,
> I suggest that Merger<K,V> should extend Aggregator<K,V,V>.
> Reason:
> Both classes usually do very similar things. A merger takes two sessions and combines them; an aggregator takes an existing session and aggregates new values into it.
> In some use cases it is actually the same thing, e.g.:
> <null, log_event> -> .map() to <session_id, SingletonList<log_event>> -> .groupByKey().aggregate() to <session_id, List<log_event>>
> In this case both merger and aggregator do the same thing: take two lists and combine them into one.
> With the proposed change we could pass the Merger as both the merger and the aggregator to the .aggregate() method and keep our business logic within one merger class.
> Or in other words: the Merger is simply an Aggregator that happens to aggregate two objects of the same class.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
