From: Randall Hauch
Date: Tue, 26 Dec 2017 11:47:16 -0600
Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
To: dev@kafka.apache.org

Does anyone have any thoughts about this proposal for Connect header support?

On Thu, Dec 21, 2017 at 4:14 PM, Randall Hauch wrote:

> All,
>
> I've updated KIP-145 to reflect my proposal. The proposal addresses SMTs and a different HeaderConverter default, but I'll be updating my PR (https://github.com/apache/kafka/pull/4319) soon. Feedback is very welcome!
>
> Best regards,
>
> Randall
>
> On Thu, Dec 14, 2017 at 10:20 AM, Randall Hauch wrote:
>
>> Hi, Michael. Yeah, I liked your PR a lot, and there definitely are a lot of similarities. But here are the more significant differences from my perspective (none of which are really that big):
>>
>> First, your `SubjectConverter` and my `HeaderConverter` are pretty similar -- mine is just more closely tied to headers. Also, we used slightly different approaches to dealing with the fact that the `Converter` interface does not extend `Configurable`, which Connect now uses for transforms, connectors, etc. And our implementations take very different approaches (see below).
>>
>> Second, I tried to follow Kafka client's `Header` and `Headers` interfaces (at least in concept) so that ConnectRecord has a `Headers` rather than a list of headers.
>> It's a minor distinction, but I do think it's important for future-proofing to have an interface for the collection to abstract and encapsulate logic/behavior as well as leaving room for alternative implementations. It's also a convenient place to add methods for source connectors and SMTs to easily add/modify/remove/transform headers.
>>
>> Third, our "header converter" implementations are where most of the differences lie. Again, this goes back to my assertion that we should make the serdes and cast/conversion orthogonal. If we allow sink connectors and SMTs to get header values in the type they want (e.g., `Header.valueAsFloat()`), then we can tolerate a bit more variation in how the header values are serialized and deserialized, since the serdes mechanism doesn't have to get the type exactly right for the sink connector and SMT. My `SimpleHeaderConverter` serializes all of the types to strings, but during deserialization it attempts to infer the schemas (easy for primitive values, a bit harder for structured types). IIUC, neither your approach nor mine is really able to maintain Struct schemas, but IMO we can add that over time with improved/different header converters if people really need it.
>>
>> Fourth, we use different defaults for the serdes implementation. I dislike the StringConverter because it converts everything to strings that are then difficult to convert back to the original form, especially for the structured types. This is why I created the `SimpleHeaderConverter` implementation, which doesn't need explicit configuration or explicit mapping of header names to types, and thus can be used as the default.
>>
>> Finally, while I hope that `SimpleHeaderConverter` and its schema inference will work most of the time with no special configuration, especially since the `Header` interface makes it easy to cast/convert in sink connectors and SMTs, I do like how your `PrimativeSubjectConverter` allows the user to manually control how the values are serialized. I thought of doing something similar, but I think that can be done at a later time if/when needed.
>>
>> I hope that makes sense.
>>
>> Randall
>>
>> On Tue, Dec 12, 2017 at 11:35 PM, Michael André Pearce <michael.andre.pearce@me.com> wrote:
>>
>>> Hi Randall
>>>
>>> What’s the main difference between this and my earlier alternative option PR
>>> https://github.com/apache/kafka/pull/2942/files
>>>
>>> If none then +1.
>>> From what I can tell, the only difference I can make out is that your headers support cross-converting primitive types, e.g. if the value after conversion is an integer, you can still ask for a float and it will convert the type if possible.
>>>
>>> Cheers
>>> Mike
>>>
>>> Sent from my iPhone
>>>
>>> > On 13 Dec 2017, at 01:36, Randall Hauch wrote:
>>> >
>>> > Trying to revive this after several months of inactivity....
>>> >
>>> > I've spent quite a bit of time evaluating the current KIP-145 proposal and several of the suggested PRs. The original KIP-145 proposal is relatively minimalist (which is very nice), and it adopts Kafka's approach to headers where header keys are strings and header values are byte arrays. IMO, this places too much responsibility on the connector developers to know how to serialize and deserialize, which means that it's going to be difficult to assemble into pipelines connectors and stream processors that make different, incompatible assumptions.
>>> > It also makes Connect headers very different from Connect's keys and values, which are generally structured and describable with Connect schemas. I think we need Connect headers to do more.
>>> >
>>> > The other proposals attempt to do more, but even my first proposal doesn't seem to really provide a solution that works for Connect users and connector developers. After looking at this feature from a variety of perspectives over several months, I now assert that Connect must solve two orthogonal problems:
>>> >
>>> > 1) Serialization: How different data types are (de)serialized as header values
>>> > 2) Conversion: How values of one data type are converted to values of another data type
>>> >
>>> > For the serialization problem, Ewen suggested quite a while back that we use something akin to `Converter` for header values. Unfortunately we can't directly reuse `Converter` since the method signatures don't allow us to supply the header name and the topic name, but we could define a `HeaderConverter` that is similar to and compatible with `Converter` such that a single class could implement both. This would align Connect headers with how message keys and values are handled. Each connector could define which converter it wants to use; for backward compatibility purposes we use a header converter by default that serializes values to strings. If you want something other than this default, you'd have to specify the header converter options as part of the connector configuration; this proposal changes the `StringConverter`, `ByteArrayConverter`, and `JsonConverter` to all implement `HeaderConverter`, so these are all options. This approach supposes that a connector will serialize all of its headers in the same way -- with string-like representations by default.
>>> > I think this is a safe assumption for the short term, and if we need more control to (de)serialize named headers differently for the same connector, we can always implement a different `HeaderConverter` that gives users more control.
>>> >
>>> > So that would solve the serialization problem. How about connectors and transforms that are implemented to expect a certain type of header value, such as an integer or boolean or timestamp? We could solve this problem (for the most part) by adding methods to the `Header` interface to get the value in the desired type, and to support all of the sensible conversions between Connect's primitives and logical types. So, a connector or transform could always call `header.valueAsObject()` to get the raw representation from the converter, but a connector or transform could also get the string representation by calling `header.valueAsString()`, or the INT64 representation by calling `header.valueAsLong()`, etc. We could even have converting methods for the built-in logical types (e.g., `header.valueAsTimestamp()` to return a java.util.Date value that is described by Connect's Timestamp logical type). We can convert between most primitive and logical types (e.g., anything to a STRING, INT32 to FLOAT32, etc.), but there are a few that don't make sense (e.g., ARRAY to FLOAT32, INT32 to STRUCT, BYTE_ARRAY to anything, etc.), so these can throw a `DataException`.
>>> >
>>> > I've refined this approach over the last few months, and have a PR for a complete prototype that demonstrates these concepts and techniques:
>>> > https://github.com/apache/kafka/pull/4319
>>> >
>>> > This PR does *not* update the documentation, though I can add that if we approve of this approach.
>>> > And, we probably want to define (at least on the KIP) some relatively obvious SMTs for copying header values into record key/value fields, and extracting record key/value fields into header values.
>>> >
>>> > @Michael, would you mind if I edited KIP-145 to reflect this proposal? I would be happy to keep the existing proposal at the end of the document (or remove it if you prefer, since it's already in the page history), and we can revise as we choose a direction.
>>> >
>>> > Comments? Thoughts?
>>> >
>>> > Best regards,
>>> >
>>> > Randall
>>> >
>>> > On Thu, Oct 19, 2017 at 2:10 PM, Michael André Pearce <michael.andre.pearce@me.com> wrote:
>>> >
>>> >> @rhauch
>>> >>
>>> >> Here is the previous discussion thread, just reigniting so we can discuss against the original KIP thread
>>> >>
>>> >> Cheers
>>> >>
>>> >> Mike
>>> >>
>>> >> Sent from my iPhone
>>> >>
>>> >>> On 5 May 2017, at 02:21, Michael Pearce wrote:
>>> >>>
>>> >>> Hi Ewen,
>>> >>>
>>> >>> Did you get a chance to look at the updated sample showing the idea?
>>> >>>
>>> >>> Did it help?
>>> >>>
>>> >>> Cheers
>>> >>> Mike
>>> >>>
>>> >>> Sent using OWA for iPhone
>>> >>> ________________________________________
>>> >>> From: Michael Pearce
>>> >>> Sent: Wednesday, May 3, 2017 10:11:55 AM
>>> >>> To: dev@kafka.apache.org
>>> >>> Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
>>> >>>
>>> >>> Hi Ewen,
>>> >>>
>>> >>> I think code helps, as I don’t think I explained what I meant very well.
>>> >>>
>>> >>> I have pushed what I was thinking to the branch/pr.
>>> >>> https://github.com/apache/kafka/pull/2942
>>> >>>
>>> >>> The key bits added on top here are:
>>> >>> new ConnectHeader that holds the header key (as string) and then the header value object and header value schema
>>> >>>
>>> >>> new SubjectConverter which allows exposing a subject, in this case the subject is the key. - this can be used to register the header type in repos like schema registry, or in my case below in a property file.
>>> >>>
>>> >>> We can default the subject converter to String-based or byte-based, where all header values are treated safely as String or byte[] type.
>>> >>>
>>> >>> But this way you could add in your own converter which could be more sophisticated and convert the header based on the key.
>>> >>>
>>> >>> The main part is to have access to the key, so you can look up the header value type, based on the key from somewhere, aka a properties file, or some central repo (aka schema repo), where the repo subject could be the topic + key, or just key if key type is global, and the schema could be primitive, String, byte[] or even can be more elaborate.
>>> >>>
>>> >>> Cheers
>>> >>> Mike
>>> >>>
>>> >>> On 03/05/2017, 06:00, "Ewen Cheslack-Postava" wrote:
>>> >>>
>>> >>> Michael,
>>> >>>
>>> >>> Aren't JMS headers an example where the variety is a problem? Unless I'm misunderstanding, there's not even a fixed serialization format expected for them since JMS defines the runtime types, not the wire format. For example, we have JMSCorrelationID (String), JMSExpires (Long), and JMSReplyTo (Destination). These are simply run time types, so we'd need either (a) a different serializer/deserializer for each or (b) a serializer/deserializer that can handle all of them (e.g. Avro, JSON, etc).
>>> >>>
>>> >>> What is the actual serialized format of the different fields? And if it's not specified anywhere in the KIP, why should using the well-known type for the header key (e.g. use StringSerializer, IntSerializer, etc) be better or worse than using a general serialization format (e.g. Avro, JSON)? And if the latter is the choice, how do you decide on the format?
>>> >>>
>>> >>> -Ewen
>>> >>>
>>> >>> On Tue, May 2, 2017 at 12:48 PM, Michael André Pearce <michael.andre.pearce@me.com> wrote:
>>> >>>
>>> >>>> Hi Ewen,
>>> >>>>
>>> >>>> So on the point of JMS the predefined/standardised JMS and JMSX headers have predefined types. So these can be serialised/deserialised accordingly.
>>> >>>>
>>> >>>> Custom JMS headers, agreed, could be a bit more difficult, but on the 80/20 rule I would agree they're mostly string values, and as you can hold bytes as a string anyhow, defaulting to that wouldn't cause any issue.
>>> >>>>
>>> >>>> But I think we may easily be able to do one better.
>>> >>>>
>>> >>>> Obviously you can override/configure the headers converter, but could we supply a default converter that takes a config file with a key-to-type mapping?
>>> >>>>
>>> >>>> Allowing people to maybe define/declare a header key with the expected type in some property file? To support string, byte[] and primitives? And undefined headers just either default to String or byte[]
>>> >>>>
>>> >>>> We could also predefine known headers like the JMS ones mentioned above.
>>> >>>>
>>> >>>> E.g.
>>> >>>>
>>> >>>> AwesomeHeader1=boolean
>>> >>>> AwesomeHeader2=long
>>> >>>> JMSCorrelationId=String
>>> >>>> JMSXGroupId=String
>>> >>>>
>>> >>>> What do you think?
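
The key-to-type property file suggested above could be read roughly as follows. This is only a minimal sketch under stated assumptions: `HeaderTypeMappingSketch` and its `convert` method are hypothetical names invented for illustration, not code from either PR, and it assumes header values arrive in string form and that only the `boolean`, `long`, and `String` type names are supported.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical sketch: not part of Kafka Connect or of either PR in this thread.
class HeaderTypeMappingSketch {
    private final Properties types = new Properties();

    // Loads "headerKey=typeName" lines, e.g. "AwesomeHeader2=long".
    HeaderTypeMappingSketch(String mapping) {
        try {
            types.load(new StringReader(mapping));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Converts a header's string form to the type declared for its key.
    // Keys with no declared type default to String.
    Object convert(String headerKey, String rawValue) {
        String type = types.getProperty(headerKey, "String");
        switch (type) {
            case "boolean":
                return Boolean.parseBoolean(rawValue);
            case "long":
                return Long.parseLong(rawValue);
            case "String":
                return rawValue;
            default:
                throw new IllegalArgumentException("Unsupported header type: " + type);
        }
    }

    public static void main(String[] args) {
        HeaderTypeMappingSketch mapping = new HeaderTypeMappingSketch(
            "AwesomeHeader1=boolean\nAwesomeHeader2=long\nJMSCorrelationId=String\n");
        System.out.println(mapping.convert("AwesomeHeader2", "42"));
        System.out.println(mapping.convert("SomeUndeclaredHeader", "abc"));
    }
}
```

The design choice worth noting is the fallback: a header whose key is not declared is passed through as a string, which matches the "undefined headers default to String" idea in the message above.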
>>> >>>>
>>> >>>> Cheers
>>> >>>> Mike
>>> >>>>
>>> >>>> Sent from my iPhone
>>> >>>>
>>> >>>>> On 2 May 2017, at 18:45, Ewen Cheslack-Postava wrote:
>>> >>>>>
>>> >>>>> A couple of thoughts:
>>> >>>>>
>>> >>>>> First, agreed that we definitely want to expose header functionality. Thank you Mike for starting the conversation! Even if Connect doesn't do anything special with it, there's value in being able to access/set headers.
>>> >>>>>
>>> >>>>> On motivation -- I think there are much broader use cases. When thinking about exposing headers, I'd actually use Replicator as only a minor supporting case. The reason is that it is a very uncommon case where there is zero impedance mismatch between the source and sink of the data since they are both Kafka. This means you don't need to think much about data formats/serialization. I think the JMS use case is a better example since JMS headers and Kafka headers don't quite match up. Here's a quick list of use cases I can think of off the top of my head:
>>> >>>>>
>>> >>>>> 1. Include headers from other systems that support them: JMS (or really any MQ), HTTP
>>> >>>>> 2. Other connector-specific headers. For example, from JDBC maybe the table the data comes from is a header; for a CDC connector you might include the binlog offset as a header.
>>> >>>>> 3. Interceptor/SMT-style use cases for annotating things like provenance of data:
>>> >>>>> 3a. Generically w/ user-supplied data like data center, host, app ID, etc.
>>> >>>>> 3b. Kafka Connect framework level info, such as the connector/task generating the data
>>> >>>>>
>>> >>>>> On deviation from Connect's model -- to be honest, KIP-82 also deviates quite substantially from how Kafka handles data already, so we may struggle a bit to reconcile the two. (In particular, headers specify some structure and enforce strings specifically for header keys, but then require you to do serialization of header values yourself...).
>>> >>>>>
>>> >>>>> I think the use cases I mentioned above may also need different approaches to how the data in headers are handled. As Gwen mentions, if we expose the headers to Connectors, they need to have some idea of the format and the reason for byte[] values in KIP-82 is to leave that decision up to the organization using them. But without knowing the format, connectors can't really do anything with them -- if a source connector assumes a format, they may generate data incompatible with the format used by the rest of the organization. On the other hand, I have a feeling most people will just use headers, so allowing connectors to embed arbitrarily complex data may not work out well in practice. Or maybe we leave it flexible, most people default to using StringConverter for the serializer and Connectors will end up defaulting to that just for compatibility...
>>> >>>>>
>>> >>>>> I'm not sure I have a real proposal yet, but I do think understanding the impact of using a Converter for headers would be useful, and we might want to think about how this KIP would fit in with transformations (or if that is something that can be deferred, handled separately from the existing transformations, etc).
>>> >>>>>
>>> >>>>> -Ewen
>>> >>>>>
>>> >>>>> On Mon, May 1, 2017 at 11:52 AM, Michael Pearce <Michael.Pearce@ig.com> wrote:
>>> >>>>>
>>> >>>>>> Hi Gwen,
>>> >>>>>>
>>> >>>>>> The intent here was to allow tools that perform a similar role to MirrorMaker, replicating messages from one cluster to another. E.g. like MirrorMaker, they should just be taking and transferring the headers as is.
>>> >>>>>>
>>> >>>>>> We don't actually use this inside our company, so not exposing this isn't an issue for us. I just believe there are companies like Confluent who have tools like Replicator that do.
>>> >>>>>>
>>> >>>>>> And as good citizens, I think we should complete the work and expose the headers, same as in the record, to at least allow them to replicate the messages as is. Note Steph seems to want it.
>>> >>>>>>
>>> >>>>>> Cheers
>>> >>>>>> Mike
>>> >>>>>>
>>> >>>>>> Sent using OWA for iPhone
>>> >>>>>> ________________________________________
>>> >>>>>> From: Gwen Shapira
>>> >>>>>> Sent: Monday, May 1, 2017 2:36:34 PM
>>> >>>>>> To: dev@kafka.apache.org
>>> >>>>>> Subject: Re: [DISCUSS] KIP 145 - Expose Record Headers in Kafka Connect
>>> >>>>>>
>>> >>>>>> Hi,
>>> >>>>>>
>>> >>>>>> I'm excited to see the community expanding Connect in this direction! Headers + Transforms == Fun message routing.
>>> >>>>>>
>>> >>>>>> I like how clean the proposal is, but I'm concerned that it kinda deviates from how Connect handles data elsewhere.
>>> >>>>>> Unlike Kafka, Connect doesn't look at all data as byte-arrays, we have converters that take data in specific formats (JSON, Avro) and turn it into Connect data types (defined in the data api). I think it will be more consistent for connector developers to also get headers as some kind of structured or semi-structured data (and to expand the converters to handle header conversions as well).
>>> >>>>>> This will allow for Connect's separation of concerns - Connector developers don't worry about data formats (because they get the internal connect objects) and Converters do all the data format work.
>>> >>>>>>
>>> >>>>>> Another thing, in my experience, APIs work better if they are put into use almost immediately - so difficulties in using the APIs are immediately surfaced. Are you planning any connectors that will use this feature (not necessarily in Kafka, just in general)? Or perhaps we can think of a way to expand Kafka's file connectors so they'll use headers somehow (can't think of anything, but maybe?).
>>> >>>>>>
>>> >>>>>> Gwen
>>> >>>>>>
>>> >>>>>> On Sat, Apr 29, 2017 at 12:12 AM, Michael Pearce <Michael.Pearce@ig.com> wrote:
>>> >>>>>>
>>> >>>>>>> Hi All,
>>> >>>>>>>
>>> >>>>>>> Now that KIP-82 is committed, I would like to discuss extending the work to expose it in Kafka Connect, the primary focus being that connectors that do similar tasks to MirrorMaker, either Kafka->Kafka or JMS->Kafka, would be able to replicate the headers.
>>> >>>>>>> It would be ideal but not mandatory for this to go in the 0.11 release so it is available on day one of headers being available.
>>> >>>>>>>
>>> >>>>>>> Please find the KIP here:
>>> >>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-145+-+Expose+Record+Headers+in+Kafka+Connect
>>> >>>>>>>
>>> >>>>>>> Please find an initial implementation as a PR here:
>>> >>>>>>> https://github.com/apache/kafka/pull/2942
>>> >>>>>>>
>>> >>>>>>> Kind Regards
>>> >>>>>>> Mike
>>> >>>>>>> The information contained in this email is strictly confidential and for the use of the addressee only, unless otherwise indicated. If you are not the intended recipient, please do not read, copy, use or disclose to others this message or any attachment. Please also notify the sender by replying to this email or by telephone (+44(020 7896 0011) and then delete the email and any copies of it. Opinions, conclusion (etc) that do not relate to the official business of this company shall be understood as neither given nor endorsed by it. IG is a trading name of IG Markets Limited (a company registered in England and Wales, company number 04008957) and IG Index Limited (a company registered in England and Wales, company number 01190902). Registered address at Cannon Bridge House, 25 Dowgate Hill, London EC4R 2YA. Both IG Markets Limited (register number 195355) and IG Index Limited (register number 114059) are authorised and regulated by the Financial Conduct Authority.
>>> >>>>>>
>>> >>>>>> --
>>> >>>>>> *Gwen Shapira*
>>> >>>>>> Product Manager | Confluent
>>> >>>>>> 650.450.2760 | @gwenshap
>>> >>>>>> Follow us: Twitter | blog
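
The split argued for in this thread, where serialization and type conversion are treated as orthogonal, can be illustrated with a small sketch. Everything here is an assumption for illustration only: `HeaderValueSketch` is a hypothetical class, its method names merely echo the `valueAsLong()`-style methods proposed above, and it throws `IllegalArgumentException` in place of Connect's `DataException` so that it stays self-contained. It is not the code from PR 4319 and not the Kafka Connect API.

```java
// Hypothetical sketch: method names follow the thread's proposal, but this
// class is not the Kafka Connect API and is not taken from PR 4319.
class HeaderValueSketch {
    private final Object value;

    HeaderValueSketch(Object value) {
        this.value = value;
    }

    // Serialization concern: write every value in its string form.
    static String serialize(Object value) {
        return String.valueOf(value);
    }

    // Deserialization concern: infer a simple type from the string form.
    static HeaderValueSketch deserialize(String text) {
        if (text.equals("true") || text.equals("false")) {
            return new HeaderValueSketch(Boolean.valueOf(text));
        }
        try {
            return new HeaderValueSketch(Long.valueOf(text));
        } catch (NumberFormatException notAnInteger) {
            // Fall through and try a floating-point value.
        }
        try {
            return new HeaderValueSketch(Double.valueOf(text));
        } catch (NumberFormatException notANumber) {
            return new HeaderValueSketch(text); // Keep it as a string.
        }
    }

    // Conversion concern: callers ask for the type they want.
    Object valueAsObject() {
        return value;
    }

    String valueAsString() {
        return String.valueOf(value);
    }

    long valueAsLong() {
        return asNumber("long").longValue(); // Truncates fractional values.
    }

    float valueAsFloat() {
        return asNumber("float").floatValue();
    }

    // Only numeric values convert to numbers; anything else is rejected.
    private Number asNumber(String target) {
        if (value instanceof Number) {
            return (Number) value;
        }
        throw new IllegalArgumentException("Cannot convert " + value + " to " + target);
    }

    public static void main(String[] args) {
        HeaderValueSketch header = deserialize(serialize(42));
        System.out.println(header.valueAsObject().getClass().getSimpleName());
        System.out.println(header.valueAsFloat());
    }
}
```

Because the reader can still ask for a float after the value was inferred as an integer type, the serialized form does not have to preserve the exact original type, which is the tolerance the thread describes.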