kafka-dev mailing list archives

From "Ewen Cheslack-Postava (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-3209) Support single message transforms in Kafka Connect
Date Tue, 03 May 2016 17:28:12 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-3209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15269148#comment-15269148 ]

Ewen Cheslack-Postava commented on KAFKA-3209:

To help clarify [~Skandragon]'s comment a bit, the idea is that the records are going to be
small compared to the headers. This means that the approach we might normally suggest -- doing
the flatMap transformation with an application or stream processor, storing that data back
to Kafka, then using Connect to store the data to another system -- will have very high overhead.

Whereas most of the message transforms we've discussed so far are either simple map() or filter()
transformations, this is a case where we might want to generate multiple output messages from
a single input message. The API for supporting this is obviously straightforward -- just support
returning a list of messages from the transformation instead of a single message. However,
I think the main challenge is that message offsets either aren't unique anymore or we'd need
to extend the concept of offset to account for "sub-messages".
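
As a rough illustration of the point above, here is a minimal sketch of a transformation that returns a list of records, with each output tagged by a sub-index so that its position stays unique. This is not the Kafka Connect API: Record, sub_index, split_lines, apply_transform and position are all hypothetical names invented for this example, and pairing the offset with a sub-index is only one possible way to extend the concept of offset.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Record:
    """Hypothetical stand-in for a Connect record (not the real API)."""
    topic: str
    partition: int
    offset: int     # offset of the source message in Kafka
    sub_index: int  # assumed extension: position among records derived from that message
    value: str


def split_lines(record: Record) -> List[Record]:
    """flatMap-style transform: one output record per non-empty line of the value."""
    parts = [p for p in record.value.split("\n") if p]
    return [
        Record(record.topic, record.partition, record.offset, i, part)
        for i, part in enumerate(parts)
    ]


def apply_transform(
    records: List[Record], transform: Callable[[Record], List[Record]]
) -> List[Record]:
    """Apply a list-returning transform to each record and flatten the results."""
    out: List[Record] = []
    for r in records:
        out.extend(transform(r))
    return out


def position(record: Record) -> Tuple[int, int]:
    """Extended 'offset': the source offset alone is no longer unique after a split."""
    return (record.offset, record.sub_index)


batch = [
    Record("events", 0, 41, 0, "a\nb\nc"),
    Record("events", 0, 42, 0, "d"),
]
expanded = apply_transform(batch, split_lines)
positions = [position(r) for r in expanded]
```

In this sketch the three records derived from offset 41 all share that offset and are distinguished only by sub_index, which is the bookkeeping a sink would need if it committed progress partway through a split message.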

> Support single message transforms in Kafka Connect
> --------------------------------------------------
>                 Key: KAFKA-3209
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3209
>             Project: Kafka
>          Issue Type: Improvement
>          Components: KafkaConnect
>            Reporter: Neha Narkhede
> Users should be able to perform light transformations on messages between a connector
> and Kafka. This is needed because some transformations must be performed before the data hits
> Kafka (e.g. filtering certain types of events or PII filtering). It's also useful for very
> light, single-message modifications that are easier to perform inline with the data import/export.

This message was sent by Atlassian JIRA