kafka-dev mailing list archives

From "Andrew Musselman (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3545) Generalized Serdes for List/Map
Date Mon, 17 Oct 2016 19:01:02 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3545?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15583131#comment-15583131 ]

Andrew Musselman commented on KAFKA-3545:

[~guozhang] and [~gfodor], it looks like the two tickets you mention are resolved; can we close
this one?

> Generalized Serdes for List/Map
> -------------------------------
>                 Key: KAFKA-3545
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3545
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Greg Fodor
>            Priority: Minor
>              Labels: api, newbie
> In working with Kafka Streams I've often found that I want to perform a "group
> by" operation, where I repartition a stream based on a foreign key and then
> aggregate all the values into a single collection, so that each entry in the
> resulting stream has a value that is a serialized list of the values that
> belonged to the key. (This seems unrelated to the 'group by' operation
> discussed in KAFKA-3544.) It is basically the typical group by operation
> found in systems like Cascading.
> In order to create these intermediate list values I needed to define custom
> Avro schemas that simply wrap the elements of interest in a list. It seems
> desirable to have some basic facility for constructing simple Serdes of
> Lists/Maps/Sets of other types, potentially using Avro's serialization under
> the hood. If this existed in the core library it would also enable the
> addition of higher-level operations on streams that use these Serdes to
> perform simple operations like the "group by" example I mention.
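
For readers following the request above, here is a rough sketch of what a generic list serializer built from an element serializer might look like. It is not the implementation discussed in this ticket and is not part of Kafka: the class name ListCodec, its encode/decode methods, and the length-prefixed wire format are all assumptions made for illustration, and it uses plain JDK functions rather than Avro or Kafka's Serializer/Deserializer interfaces so that it stands alone.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical helper (not a Kafka class). Assumed wire format:
// [count][len][bytes][len][bytes]... with 4-byte big-endian ints.
// Null lists and null elements are not handled in this sketch.
final class ListCodec {
    private ListCodec() {}

    // Serializes each element with the supplied encoder and
    // prefixes it with its length so the list can be split again.
    static <T> byte[] encode(List<T> values, Function<T, byte[]> elementEncoder) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeInt(values.size());
            for (T value : values) {
                byte[] bytes = elementEncoder.apply(value);
                out.writeInt(bytes.length);
                out.write(bytes);
            }
        } catch (IOException e) {
            // ByteArrayOutputStream does not fail in practice; rethrow unchecked.
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Reads the element count, then each length-prefixed element,
    // handing the raw bytes to the supplied decoder.
    static <T> List<T> decode(byte[] data, Function<byte[], T> elementDecoder) {
        ByteBuffer in = ByteBuffer.wrap(data);
        int count = in.getInt();
        List<T> values = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            byte[] bytes = new byte[in.getInt()];
            in.get(bytes);
            values.add(elementDecoder.apply(bytes));
        }
        return values;
    }
}
```

In a Streams application the two functions would presumably be backed by the inner type's existing serializer and deserializer, with the pair wrapped into a Serde. Map and Set variants could follow the same framing idea.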

This message was sent by Atlassian JIRA
