kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Bruno Cadonna <br...@confluent.io>
Subject Re: [VOTE] KIP-520: Augment Consumer.committed(partition) to allow multiple partitions
Date Thu, 12 Sep 2019 08:54:57 GMT
Guozhang, Thanks for the KIP.

+1 (non-binding)

Best,
Bruno

On Wed, Sep 11, 2019 at 9:17 AM Kamal Chandraprakash
<kamal.chandraprakash@gmail.com> wrote:
>
> Thanks for the KIP!
>
> LGTM, +1 (non-binding).
>
> On Wed, Sep 11, 2019 at 3:23 AM Matthias J. Sax <matthias@confluent.io>
> wrote:
>
> > I don't have a strong preference. So I am also fine to deprecate the
> > existing methods. Let's see what Jason thinks.
> >
> > Can you update the KIP to reflect the semantics of the return `Map` (ie,
> > does only contain entries for partitions with committed offsets, and
> > does not contain `null` values)?
> >
> >
> > +1 (binding)
> >
> > -Matthias
> >
> >
> >
> >
> > On 9/10/19 11:53 AM, Guozhang Wang wrote:
> > > Hi Jason / Matthias,
> > >
> > > I understand your concerns now. Just to clarify my main motivation on
> > > deprecating the old APIs is not only for aesthetics (confess I personally
> > > do have preference on succinct APIs), but to urge people to use the
> > batched
> > > API for better latency when possible --- as stated in the KIP, my
> > > observation is that in practice most callers are actually going to get
> > > committed offsets for more than one partitions, and without deprecating
> > the
> > > old APIs it would be hard for them to realize that the new API does have
> > a
> > > benefit in performance.
> > >
> > > This is different from some of the existing APIs though -- e.g., Matthias
> > > mentioned about seek / seekToBeginning / seekToEnd, where only seekToXX
> > > have plurals and seek only have singulars. We can, of course, make
> > seekToXX
> > > with plurals as well just like commitSync/Async, but since seeks are
> > > non-blocking APIs (they rely on the follow-up polling call to talk to the
> > > brokers) either calling it multiple times with one partition each v.s.
> > > calling it one time with a plural makes little difference (still, I'd
> > argue
> > > that today we do not have a same-named function overloaded with both
> > types
> > > :P) On the other hand, committed, commitSync, offsetsForTimes etc
> > blocking
> > > calls are all in the form of plurals except
> > >
> > > * committed
> > > * position
> > > * partitionsFor
> > >
> > > My rationale was that 1) for consecutive calls of #position, mostly it
> > > would only require a single round-trip to brokers since we are trying to
> > > refresh fetching positions for all partitions anyways, and 2) for
> > > #partitionsFor, theoretically we could also consider to ask for multiple
> > > topics in one call since each blocking call potentially incurs one round
> > > trip, but I did not include it in the scope of this KIP only because I
> > have
> > > not observed too many usage patterns that are commonly calling it
> > > consecutively for multiple topics. At the moment, what I truly want to
> > > "improve" on is the committed calls, as in many cases I've seen it being
> > > called consecutively for multiple topic-partitions.
> > >
> > > Therefore, I'm still more inclined to deprecate the old APIs so that we
> > can
> > > enforce people to discover the new batching APIs for efficiency in this
> > > KIP. But if you feel that this compatibility is very crucial to maintain
> > I
> > > could be convinced.
> > >
> > >
> > > Guozhang
> > >
> > > On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax <matthias@confluent.io>
> > > wrote:
> > >
> > >> Thanks for the KIP Guozhang.
> > >>
> > >>> Another reason is that other functions of KafkaConsumers do not have
> > >> those
> > >>> overloaded functions to be consistent
> > >>
> > >> I tend to agree with Jason about keeping the existing methods. Your
> > >> argument does not seem to hold. I just checked the `Consumer` API, and
> > >> it's mix of overloads:
> > >>
> > >> Methods only talking `Collections`
> > >>
> > >>
> > >>
> > subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets
> > >>
> > >> Method with overload taking `Collections` or as single value:
> > >>
> > >> seek/seekToBeginning/seekToEnd
> > >>
> > >> (those are strictly different methods, but they are semantically
> > related)
> > >>
> > >> Only talking single value:
> > >>
> > >> position/committed/partitionsFor
> > >>
> > >>
> > >> While you are strictly speaking correct, that there is no method with an
> > >> overload for `Collection` and single value, the API mix seems to suggest
> > >> that it might actually be worth to have corresponding overloads for all
> > >> methods instead of sticking to `Collections` only.
> > >>
> > >>
> > >>
> > >> About the return map: I agree that not containing any entry in the map
> > >> is better. It's not only consistent with other APIs but it also avoids
> > >> potential NPEs.
> > >>
> > >>
> > >>
> > >> -Matthias
> > >>
> > >>
> > >> On 9/10/19 10:04 AM, Jason Gustafson wrote:
> > >>>>  I feel it not worth making committed to have both plurals and
> > >> singulars.
> > >>>
> > >>> Not sure I agree. If we had started with these new APIs from the
> > >> beginning,
> > >>> that may have been better, but we already have exposed the singular
> > APIs
> > >>> and users are depending on them. Not sure it's worth breaking
> > >> compatibility
> > >>> just for aesthetics.
> > >>>
> > >>> -Jason
> > >>>
> > >>> On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang <wangguoz@gmail.com>
> > >> wrote:
> > >>>
> > >>>> Thanks Jason!
> > >>>>
> > >>>> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson <jason@confluent.io>
> > >>>> wrote:
> > >>>>
> > >>>>> Hi Guozhang,
> > >>>>>
> > >>>>> I think the motivation for the new API makes sense. I've wanted
> > >> something
> > >>>>> like this in the past. That said, do you think there is a substantial
> > >>>>> benefit from deprecating the old API? I can still see it being
> > >> convenient
> > >>>>> in some cases and it's no real cost to maintain.
> > >>>>>
> > >>>>>
> > >>>> That's a good question.
> > >>>>
> > >>>> Personally I would like to keep a succinct set of APIs out of the
box
> > >> and
> > >>>> let users who want more syntax sugars to add themselves as extended
> > >> classes
> > >>>> for example (KafkaConsumer is not a final class).
> > >>>> Another reason is that other functions of KafkaConsumers do not
have
> > >> those
> > >>>> overloaded functions to be consistent, e.g. we do not have a
> > >>>> subscribe(single-topic), pause/resume(single-topic-partition) or
> > >>>> seekToBeginning(single-topic-partition). I feel it not worth making
> > >>>> committed to have both plurals and singulars.
> > >>>>
> > >>>> WDYT?
> > >>>>
> > >>>>
> > >>>>> Also, just a minor detail. If the partition has no committed
offset,
> > >> will
> > >>>>> it be present in the map with a null value?
> > >>>>>
> > >>>>> I looked into the admin client's listConsumerGroupOffsets call
when
> > >>>> creating the KIP, and to be consistent with that API my intention
is
> > to
> > >> NOT
> > >>>> include the entry if a topic-partition does not have committed
> > offsets.
> > >>>> That said, if we feel returning an entry with null value is better
for
> > >>>> programmability I can also do that (and will update wiki page to
> > >> clarify as
> > >>>> well). LMK.
> > >>>>
> > >>>>
> > >>>>> -Jason
> > >>>>>
> > >>>>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison <
> > >> mickael.maison@gmail.com
> > >>>>>
> > >>>>> wrote:
> > >>>>>
> > >>>>>> +1 (non-binding), thanks Guozhang
> > >>>>>>
> > >>>>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen <
> > >>>> reluctanthero104@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>> Hey Guozhang,
> > >>>>>>>
> > >>>>>>> LGTM, +1 (non-binding)
> > >>>>>>>
> > >>>>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang <wangguoz@gmail.com>
> > >>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Hello folks,
> > >>>>>>>>
> > >>>>>>>> I've created a new KIP allowing consumer.committed
to take a set
> > of
> > >>>>>>>> partitions instead of just one partition to allow
batching effects
> > >>>> of
> > >>>>>> such
> > >>>>>>>> requests (the protocol already allows us to send
multiple
> > >>>> partitions
> > >>>>>> in one
> > >>>>>>>> round-trip):
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions
> > >>>>>>>>
> > >>>>>>>> Since it is a pretty straight-forward KIP, I'm
starting the VOTE
> > >>>> for
> > >>>>>> this
> > >>>>>>>> KIP directly. If there are any suggestions about
this proposal,
> > >>>>> please
> > >>>>>> feel
> > >>>>>>>> free to share them in this thread. Thank you!
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> -- Guozhang
> > >>>>>>>>
> > >>>>>>
> > >>>>>
> > >>>>
> > >>>>
> > >>>> --
> > >>>> -- Guozhang
> > >>>>
> > >>>
> > >>
> > >>
> > >
> >
> >

Mime
View raw message