kafka-dev mailing list archives

From Nisarg Shah <snis...@gmail.com>
Subject Re: Kafka Connect Transformers
Date Fri, 01 Jul 2016 18:24:09 GMT
I need to submit a KIP for https://issues.apache.org/jira/browse/KAFKA-3209. Please grant wiki
write access to the username ‘snisarg’.

Thanks,
Nisarg Shah.

> On Jun 28, 2016, at 6:27 PM, Nisarg Shah <snisarg@gmail.com> wrote:
> 
> Need permissions to edit the wiki. Username is ‘snisarg’. 
> 
> Thanks,
> Nisarg.
> 
>> On Jun 28, 2016, at 09:08, Nisarg Shah <snisarg@gmail.com> wrote:
>> 
>> Hello,
>> 
>> I need to create a wiki page so that I can write a Kafka Improvement Proposal for the
>> design below. My username is ‘snisarg’.
>> 
>> Thanks,
>> Nisarg
>> 
>>> On Jun 19, 2016, at 10:43 PM, Nisarg Shah <snisarg@gmail.com> wrote:
>>> 
>>> Hello,
>>> 
>>> I am planning to work on https://issues.apache.org/jira/browse/KAFKA-3209.
>>> I would like feedback from the devs on the design that I’m proposing. Thanks
>>> a lot for all the discussions, Ewen Cheslack-Postava.
>>> 
>>> The gist of the plan is to use ‘Transformers’ that can be chained together
>>> through configuration; data passes through them between the source and the
>>> destination in Kafka Connect.
>>> 
>>> To set up transformers, we propose listing the Transformer classes in the
>>> properties, in the order in which they should run:
>>> transformer=abc.Transformer1,xyz.Transformer2
>>> 
>>> Each transformer can receive its own properties from the same properties
>>> file, as is done for the Connectors.
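As an illustration of the configuration idea above, here is a minimal sketch of reading the comma-separated `transformer` property into an ordered list of class names. The class `TransformerConfigSketch` and its method are hypothetical names invented for this example, not part of Kafka Connect; only the property key and the comma-separated format come from the proposal.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper, not an existing Kafka Connect class.
final class TransformerConfigSketch {
    // Property key as written in the proposal.
    static final String TRANSFORMER_KEY = "transformer";

    private TransformerConfigSketch() {}

    // Returns the configured class names in the order they were listed,
    // ignoring surrounding whitespace and empty entries.
    static List<String> transformerClassNames(Properties props) {
        List<String> names = new ArrayList<>();
        String raw = props.getProperty(TRANSFORMER_KEY, "");
        for (String part : raw.split(",")) {
            String name = part.trim();
            if (!name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }
}
```

The list order matters here because it would decide the order in which the chained transformers run. How each class is then instantiated and handed its own properties is left open, since the proposal does not specify a naming scheme for per-transformer keys.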
>>> 
>>> As for the signature of the transformation function that does the actual
>>> work, how does this abstract class look?
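>>> import java.util.Map;
>>> 
>>> public abstract class Transformer<T1, T2> {
>>>     // Converts one input of type T1 into an output of type T2.
>>>     public abstract T2 transform(T1 t1);
>>> 
>>>     // Receives this transformer's properties; does nothing by default.
>>>     public void initialize(Map<String, String> props) {}
>>> }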
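To make the chaining idea concrete, the following is a rough sketch of how two transformers with the proposed signature might be composed so that the output type of the first matches the input type of the second. `TrimSketch`, `LengthSketch`, and `TransformerChainSketch` are hypothetical names made up for this example. The `Transformer` class is repeated from the proposal (without the `public` modifier, so the snippet compiles as a single file).

```java
import java.util.Map;

// Repeated from the proposal so that this sketch is self-contained.
abstract class Transformer<T1, T2> {
    public abstract T2 transform(T1 t1);

    public void initialize(Map<String, String> props) {}
}

// Hypothetical example transformer: strips surrounding whitespace.
final class TrimSketch extends Transformer<String, String> {
    @Override
    public String transform(String value) {
        return value.trim();
    }
}

// Hypothetical example transformer: maps a string to its length.
final class LengthSketch extends Transformer<String, Integer> {
    @Override
    public Integer transform(String value) {
        return value.length();
    }
}

// Hypothetical helper that composes two transformers into one.
final class TransformerChainSketch {
    private TransformerChainSketch() {}

    // The shared type parameter B lets the compiler check that the
    // output of 'first' is a valid input for 'second'.
    static <A, B, C> Transformer<A, C> chain(Transformer<A, B> first,
                                             Transformer<B, C> second) {
        return new Transformer<A, C>() {
            @Override
            public C transform(A input) {
                return second.transform(first.transform(input));
            }
        };
    }
}
```

Composing in code like this keeps the types checked at compile time. A chain built from class names in a properties file would only know the types at runtime, which is presumably where the casting concern raised under Approach 1 comes from.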
>>> 
>>> Approach 1:
>>> Functionally, the complete batch of data can be passed in.
>>> Just as the *Tasks get a complete List<*Record>, the transformer can get
>>> the same. Passing the whole list makes rearranging or merging data possible,
>>> which helps when a transformation needs to look up or down the list of
>>> messages. Allowing custom datatypes between transformers lets custom
>>> intermediate objects be passed from one transformer to the next, though
>>> casting could be an issue.
>>> 
>>> Approach 2: 
>>> Take a simpler approach and transform message by message. The transformer
>>> could store data from the previous message, but could not look further down
>>> the list of messages. Going by the comments from Michael Graff, both
>>> approaches would work, but if looking ahead is required, we would have to go
>>> with Approach 1.
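The difference between the two approaches can be sketched with the proposed signature. In this hypothetical example, plain integers stand in for records, and the class names `DeltaFromPreviousSketch` and `DeltaToNextSketch` are invented for illustration. The per-message transformer can only remember what it has already seen, while the list-based one can also look ahead in the batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Repeated from the proposal so that this sketch is self-contained.
abstract class Transformer<T1, T2> {
    public abstract T2 transform(T1 t1);

    public void initialize(Map<String, String> props) {}
}

// Approach 2 (hypothetical example): one message per call. The transformer
// keeps state from the previous message but cannot see later messages.
final class DeltaFromPreviousSketch extends Transformer<Integer, Integer> {
    private Integer previous; // null until the first message arrives

    @Override
    public Integer transform(Integer current) {
        int delta = (previous == null) ? 0 : current - previous;
        previous = current;
        return delta;
    }
}

// Approach 1 (hypothetical example): the whole batch per call, so the
// transformer can look ahead to the next message in the list.
final class DeltaToNextSketch extends Transformer<List<Integer>, List<Integer>> {
    @Override
    public List<Integer> transform(List<Integer> batch) {
        List<Integer> out = new ArrayList<>();
        for (int i = 0; i < batch.size(); i++) {
            boolean hasNext = i + 1 < batch.size();
            // The last message has no successor, so its delta is 0.
            out.add(hasNext ? batch.get(i + 1) - batch.get(i) : 0);
        }
        return out;
    }
}
```

Note that even under Approach 1 the look-ahead stops at the end of the batch passed in, so a transformation that spans batch boundaries would still need to carry state between calls.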
>>> 
>>> I will have a working change ready for Approach 1 very soon; until then,
>>> please send me your suggestions.
>>> 
>>> Thanks,
>>> Nisarg.
>>> 
>>> 
>>> 
>>> 
>> 
> 

