kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Satish Duggana <satish.dugg...@gmail.com>
Subject Re: [DISCUSS] KIP-482: The Kafka Protocol should Support Optional Fields
Date Mon, 19 Aug 2019 04:16:07 GMT
Sorry! Colin, I may not have been clear in my earlier query about
optional field type restriction. It is mentioned in one of your
replies "optional fields are serialized starting with their total
length". This brings the question of whether optional fields support
struct types (with or without array values). It seems struct types are
currently not serialized with total length. I may be missing something
here.

Thanks,
Satish.


On Wed, Aug 14, 2019 at 8:03 AM Satish Duggana <satish.duggana@gmail.com> wrote:
>
> Hi Colin,
> Thanks for the KIP. Optional fields and var length encoding support is a great
> improvement for the protocol.
>
> >>Optional fields can have any type, except that they cannot be arrays.
> Note that the restriction against having tagged arrays is just to simplify
> serialization.  We can relax this restriction in the future without changing
> the protocol on the wire.
>
> Can an Optional field have a struct type which internally contains an array
> field at any level?
>
> Thanks,
> Satish.
>
>
>
> On Tue, Aug 13, 2019 at 11:49 PM David Jacot <djacot@confluent.io> wrote:
> >
> > Hi Colin,
> >
> > Thank you for the KIP! Things are well explained!. It is huge improvement
> > for the Kafka protocol. I have few comments on the proposal:
> >
> > 1. The interleaved tag/length header sounds like a great optimisation as it
> > would be shorter on average. The downside, as
> > you already pointed out, is that it makes the decoding and the specs more
> > complex. Personally, I would also favour using two
> > vaints in this particular case to keep things simple.
> >
> > 2. As discussed, I wonder if it would make sense to extend to KIP to also
> > support optional fields in the Record Header. I think
> > that it could be interesting to have such capability for common fields
> > across all the requests or responses (e.g. tracing id).
> >
> > Regards,
> > David
> >
> >
> >
> > On Tue, Aug 13, 2019 at 10:00 AM Jason Gustafson <jason@confluent.io> wrote:
> >
> > > > Right, I was planning on doing exactly that for all the auto-generated
> > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably a
> > > better use of time to convert the manual ones to auto gen first (with the
> > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > manual work)
> > >
> > > Yeah, that makes sense. Maybe we can include the version bump for all RPCs
> > > in this KIP, but we can implement it lazily as the protocols are converted.
> > >
> > > -Jason
> > >
> > > On Mon, Aug 12, 2019 at 7:16 PM Colin McCabe <cmccabe@apache.org> wrote:
> > >
> > > > On Mon, Aug 12, 2019, at 11:22, Jason Gustafson wrote:
> > > > > Hi Colin,
> > > > >
> > > > > Thanks for the KIP! This is a significant improvement. One of my
> > > personal
> > > > > interests in this proposal is solving the compatibility problems
we
> > > have
> > > > > with the internal schemas used to define consumer offsets and
> > > transaction
> > > > > metadata. Currently we have to guard schema bumps with the inter-broker
> > > > > protocol format. Once the format is bumped, there is no way to
> > > downgrade.
> > > > > By fixing this, we can potentially begin using the new schemas before
> > > the
> > > > > IBP is bumped while still allowing downgrade.
> > > > >
> > > > > There are a surprising number of other situations we have encountered
> > > > this
> > > > > sort of problem. We have hacked around it in special cases by allowing
> > > > > nullable fields to the end of the schema, but this is not really
an
> > > > > extensible approach. I'm looking forward to having a better option.
> > > >
> > > > Yeah, this problem keeps coming up.
> > > >
> > > > >
> > > > > With that said, I have a couple questions on the proposal:
> > > > >
> > > > > 1. For each request API, we need one version bump to begin support
for
> > > > > "flexible versions." Until then, we won't have the option of using
> > > tagged
> > > > > fields even if the broker knows how to handle them. Does it make
sense
> > > to
> > > > > go ahead and do a universal bump of each request API now so that
we'll
> > > > have
> > > > > this option going forward?
> > > >
> > > > Right, I was planning on doing exactly that for all the auto-generated
> > > > RPCs. For the manual RPCs, it would be a lot of work. It’s probably
a
> > > > better use of time to convert the manual ones to auto gen first (with
the
> > > > possible exception of Fetch/Produce, where the ROI may be higher for the
> > > > manual work)
> > > >
> > > > > 2. The alternating length/tag header encoding lets us save a byte
in
> > > the
> > > > > common case. The downside is that it's a bit more complex to specify.
> > > It
> > > > > also has some extra cost if the length exceeds the tag substantially.
> > > > > Basically we'd have to pad the tag, right? I think I'm wondering
if we
> > > > > should just bite the bullet and use two varints instead.
> > > >
> > > > That’s a fair point. It would be shorter on average, but worse for some
> > > > exceptional cases. Also, the decoding would be more complex, which might
> > > be
> > > > a good reason to go for just having two varints. Yeah, let’s simplify.
> > > >
> > > > Regards,
> > > > Colin
> > > >
> > > > >
> > > > > Thanks,
> > > > > Jason
> > > > >
> > > > >
> > > > > On Fri, Aug 9, 2019 at 4:31 PM Colin McCabe <cmccabe@apache.org>
> > > wrote:
> > > > >
> > > > > > Hi all,
> > > > > >
> > > > > > I've made some updates to this KIP. Specifically, I wanted to
avoid
> > > > > > including escape bytes in the serialization format, since it
was too
> > > > > > complex. Also, I think this is a good opportunity to slim down
our
> > > > > > variable length fields.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Thu, Jul 11, 2019, at 20:52, Colin McCabe wrote:
> > > > > > > On Tue, Jul 9, 2019, at 15:29, Jose Armando Garcia Sancio
wrote:
> > > > > > > > Thanks Colin for the KIP. For my own edification why
are we doing
> > > > this
> > > > > > > > "Optional fields can have any type, except for an
array of
> > > > > > structures."?
> > > > > > > > Why can't we have an array of structures?
> > > > > > >
> > > > > > > Optional fields are serialized starting with their total
length.
> > > This
> > > > > > > is straightforward to calculate for primitive fields like
INT32,
> > > (or
> > > > > > > even an array of INT32), but more difficult to calculate
for an
> > > array
> > > > > > > of structures. Basically, we'd have to do a two-pass serialization
> > > > > > > where we first calculate the lengths of everything, and
then write
> > > it
> > > > > > > out.
> > > > > > >
> > > > > > > The nice thing about this KIP is that there's nothing in
the
> > > protocol
> > > > > > > stopping us from adding support for this feature in the
future. We
> > > > > > > wouldn't have to really change the protocol at all to add
support.
> > > > But
> > > > > > > we'd have to change a lot of serialization code. Given
almost all
> > > of
> > > > > > > our use-cases for optional fields are adding an extra field
here or
> > > > > > > there, it seems reasonable not to support it for right
now.
> > > > > > >
> > > > > > > best,
> > > > > > > Colin
> > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > -Jose
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >

Mime
View raw message