kafka-dev mailing list archives

From Joel Koshy <jjkosh...@gmail.com>
Subject Re: [DISCUSS] KIP-82 - Add Record Headers
Date Fri, 07 Oct 2016 16:34:17 GMT
Hi Jay,

Couple of comments inline:

> One of the things that has helped keep Kafka simple is not adding in new
> abstractions and concepts except when the proposal is really elegant and
> makes things simpler.
>

I don't quite see how this impacts simplicity because (per your taxonomy)
the scope is "company" and "world". So the decision to use it or not is
really up to companies/individuals.


> Consider three use cases for headers:
>
>    1. Kafka-scope: We want to add a feature to Kafka that needs a
>    particular field.
>    2. Company-scope: You want to add a header to be shared by everyone in
>    your company.
>    3. World-wide scope: You are building a third party tool and want to add
>    some kind of header.
>
> For the case of (1) you should not use headers, you should just add a field
> to the record format.


Agreed - which is what we have been doing so far.


> sense. Occasionally people have complained that adding to the record format
> is hard and it would be nice to just shove lots of things in quickly. I
> think a better solution would be to make it easy to add to the record
> format, and I think we've made progress on that. I also think we should be
>

The intent is not to shove things in quickly but to make it possible to
augment the streaming system with infrastructure features such as
audit/call tracing and more without invading the record format for all
users who may not need some of those features.


> earlier proposals. These things end up being long term commitments so it's
> really worth being thoughtful.
>

Which is another reason I think it is helpful to have such headers. If a
certain header type is clearly useful for the vast majority of users then
there is a case for it being integrated directly into the record format.
However, there could always be a large segment of users for whom certain
headers are irrelevant and they should have the ability to opt out of them.

> For case (2) just use the body of the message. You don't need a globally
> agreed on definition of headers, just standardize on a header you want to
> include in the value in your company.


This works, but as I described in an earlier email in this thread, it has
drawbacks.

>    1. A global registry of numeric keys is super super ugly. This seems
>    silly compared to the Avro (or whatever) header solution which gives
>    more compact encoding, rich types, etc.
>

Agreed - I would really like us to avoid the burden of being a registrar.

>    2. Using byte arrays for header values means they aren't really
>    interoperable for case (3). E.g. I can't make a UI that displays
>    headers, or allow you to set them in config. To work with third party
>    headers, the only case I think this really helps, you need the union
>    of all serialization schemes people have used for any tool.
>

I don't quite see why - the user would need to have the suitable
interceptors in their classpath. Headers that it does not understand are
simply ignored.

>    3. For case (2) and (3) your key numbers are going to collide like
>    crazy. I don't think a global registry of magic numbers maintained
>    either by word of mouth or checking in changes to kafka source is the
>    right thing to do.


Agreed (~ point 1 above)


>    4. We are introducing a new serialization primitive which makes fields
>    disappear conditional on the contents of other fields. This breaks the
>    whole serialization/schema system we have today.
>

I don't quite see why this is so.


>    6. This proposes making the ProducerRecord and ConsumerRecord mutable
>    and adding setters and getters (which we try to avoid).
>

This is another part of the proposal I don't really like although there are
use cases where it helps.


> For context on LinkedIn: I set up the system there, but it may have changed
> since I left. The header is maintained with the record schemas in the avro
> schema registry and is required for all records. Essentially all messages
>


> Not allowing teams to choose a data format other than avro was considered a
> feature, not a bug, since the whole point was to be able to share data,
> which doesn't work if every team chooses their own format.
>

It is pretty much the same - not much has changed since you left, but see
my earlier comments on this: http://markmail.org/message/3ln5mruxqfhbewgz
The proposal does not empower applications to use non-Avro formats for the
data plane. The feature supports a much clearer separation of the user's
data from typically infra-related headers (which could also be Avro-based).

At this point I think we should focus the discussion not on the specifics
(implementation) of the proposal but on the motivation. Email is fine but
it may be better to discuss in a hangout and circle back to the thread.

Thanks,

Joel


> On Thu, Sep 22, 2016 at 12:31 PM, Michael Pearce <Michael.Pearce@ig.com>
> wrote:
>
> > Hi All,
> >
> >
> > I would like to discuss the following KIP proposal:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-82+-+Add+Record+Headers
> >
> >
> >
> > I have some initial drafts of roughly the changes that would be needed.
> > This is nowhere near finalized and I look forward to the discussion,
> > especially as some bits I'm personally in two minds about.
> >
> > https://github.com/michaelandrepearce/kafka/tree/kafka-headers-properties
> >
> >
> >
> > Here is a link to an alternative option mentioned in the KIP but one I
> > would personally discard (disadvantages mentioned in the KIP)
> >
> > https://github.com/michaelandrepearce/kafka/tree/kafka-headers-full
> >
> >
> > Thanks
> >
> > Mike
>
