From dev-return-107422-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Thu Sep 12 09:04:01 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id A615D180651 for ; Thu, 12 Sep 2019 11:04:00 +0200 (CEST) Received: (qmail 51068 invoked by uid 500); 12 Sep 2019 09:03:59 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 51051 invoked by uid 99); 12 Sep 2019 09:03:59 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd3-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 12 Sep 2019 09:03:59 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd3-us-west.apache.org (ASF Mail Server at spamd3-us-west.apache.org) with ESMTP id 8714A180D05 for ; Thu, 12 Sep 2019 09:03:58 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd3-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: -3.001 X-Spam-Level: X-Spam-Status: No, score=-3.001 tagged_above=-999 required=6.31 tests=[HTML_MESSAGE=2, RCVD_IN_DNSWL_HI=-5, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Received: from mx1-ec2-va.apache.org ([10.40.0.8]) by localhost (spamd3-us-west.apache.org [10.40.0.10]) (amavisd-new, port 10024) with ESMTP id RPhgjeXKi2aC for ; Thu, 12 Sep 2019 09:03:55 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.132.183.28; helo=mx1.redhat.com; envelope-from=tbentley@redhat.com; receiver= Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by mx1-ec2-va.apache.org (ASF Mail Server at mx1-ec2-va.apache.org) with ESMTPS id 593EFBC553 for ; Thu, 12 Sep 2019 09:03:55 +0000 (UTC) Received: from mail-wr1-f70.google.com (mail-wr1-f70.google.com [209.85.221.70]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id A49A981F0C for ; Thu, 12 Sep 2019 09:03:53 +0000 (UTC) Received: by mail-wr1-f70.google.com with SMTP id s5so11660537wrv.23 for ; Thu, 12 Sep 2019 02:03:53 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=QSitoNq5sRPfE+0Lv9MuOMDZsZzcpORODgrlX+Foh3s=; b=TaNCpgPEMpR3ayRkRMJzyMsosLTyBJq8w2R+MPEbabdc0aNA5i81QLjau/hlottt+y 3jjmtOGWYlkLOC0ncyg+EMjD2sLhyIHFs1NFgKGVBGy7gd0e+FXNdt6Q80FybvvjkBGl jENW0D68Bd4ic/jE74suNMwPcUBuuu6BbyPkuT/hgbvhLJNiDaj55JgfsHbOxWWESUv6 PbjwYFRcOEzt9/fbJEQyNVEaYV0yKrbTZUd4bGvVc9cOFjdU1Vavr8bYGPtGXkq4SmrQ lXDUSmpbyxwbQlOKALYdcmTysI5QsNwYew13ewf6oQOx6yOTvGEys9Z9g0j3AvOwI1GU uM5Q== X-Gm-Message-State: APjAAAWOJXxYXFpj8D0b61PxL8m3sdwzemaXiRsJH5HVPlsGt1whJpb9 z6Ju+Y4Mt+DKGoGYaIwaArS1CVgEdir3U8SJU5ZyQT3Uke7mN+Q+DU+ILLNnKHoKHkiXiwpl6Fj VhAyOTo+V1IqURmbWxFKwcgnDZlpmOg== X-Received: by 2002:adf:ef49:: with SMTP id c9mr15608242wrp.122.1568279032053; Thu, 12 Sep 2019 02:03:52 -0700 (PDT) X-Google-Smtp-Source: APXvYqz5Nmo+cxK6aEMg6//qP9YFTf5vERERX7o0GAW3csdEkxeb86dmPdcSws3+gwgVSvIjHnjdgbSDcJ4E209jyIM= X-Received: by 2002:adf:ef49:: with SMTP id c9mr15608216wrp.122.1568279031754; Thu, 12 Sep 2019 02:03:51 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Tom Bentley Date: Thu, 12 Sep 2019 10:03:40 +0100 Message-ID: Subject: Re: [VOTE] KIP-520: Augment Consumer.committed(partition) to allow multiple partitions To: dev@kafka.apache.org Content-Type: multipart/alternative; boundary="00000000000098c9350592576a5d" --00000000000098c9350592576a5d Content-Type: text/plain; charset="UTF-8" +1 (non-binding). Thanks! On Thu, Sep 12, 2019 at 9:58 AM M. Manna wrote: > Gushing, > > Good kip. +1 (binding) from me. > > Thanks, > MAnna > > On Thu, 12 Sep 2019 at 09:55, Bruno Cadonna wrote: > > > Guozhang, Thanks for the KIP. > > > > +1 (non-binding) > > > > Best, > > Bruno > > > > On Wed, Sep 11, 2019 at 9:17 AM Kamal Chandraprakash > > wrote: > > > > > > Thanks for the KIP! > > > > > > LGTM, +1 (non-binding). > > > > > > On Wed, Sep 11, 2019 at 3:23 AM Matthias J. Sax > > > > wrote: > > > > > > > I don't have a strong preference. So I am also fine to deprecate the > > > > existing methods. Let's see what Jason thinks. > > > > > > > > Can you update the KIP to reflect the semantics of the return `Map` > > (ie, > > > > does only contain entries for partitions with committed offsets, and > > > > does not contain `null` values)? > > > > > > > > > > > > +1 (binding) > > > > > > > > -Matthias > > > > > > > > > > > > > > > > > > > > On 9/10/19 11:53 AM, Guozhang Wang wrote: > > > > > Hi Jason / Matthias, > > > > > > > > > > I understand your concerns now. Just to clarify my main motivation > on > > > > > deprecating the old APIs is not only for aesthetics (confess I > > personally > > > > > do have preference on succinct APIs), but to urge people to use the > > > > batched > > > > > API for better latency when possible --- as stated in the KIP, my > > > > > observation is that in practice most callers are actually going to > > get > > > > > committed offsets for more than one partitions, and without > > deprecating > > > > the > > > > > old APIs it would be hard for them to realize that the new API does > > have > > > > a > > > > > benefit in performance. > > > > > > > > > > This is different from some of the existing APIs though -- e.g., > > Matthias > > > > > mentioned about seek / seekToBeginning / seekToEnd, where only > > seekToXX > > > > > have plurals and seek only have singulars. We can, of course, make > > > > seekToXX > > > > > with plurals as well just like commitSync/Async, but since seeks > are > > > > > non-blocking APIs (they rely on the follow-up polling call to talk > > to the > > > > > brokers) either calling it multiple times with one partition each > > v.s. > > > > > calling it one time with a plural makes little difference (still, > I'd > > > > argue > > > > > that today we do not have a same-named function overloaded with > both > > > > types > > > > > :P) On the other hand, committed, commitSync, offsetsForTimes etc > > > > blocking > > > > > calls are all in the form of plurals except > > > > > > > > > > * committed > > > > > * position > > > > > * partitionsFor > > > > > > > > > > My rationale was that 1) for consecutive calls of #position, mostly > > it > > > > > would only require a single round-trip to brokers since we are > > trying to > > > > > refresh fetching positions for all partitions anyways, and 2) for > > > > > #partitionsFor, theoretically we could also consider to ask for > > multiple > > > > > topics in one call since each blocking call potentially incurs one > > round > > > > > trip, but I did not include it in the scope of this KIP only > because > > I > > > > have > > > > > not observed too many usage patterns that are commonly calling it > > > > > consecutively for multiple topics. At the moment, what I truly want > > to > > > > > "improve" on is the committed calls, as in many cases I've seen it > > being > > > > > called consecutively for multiple topic-partitions. > > > > > > > > > > Therefore, I'm still more inclined to deprecate the old APIs so > that > > we > > > > can > > > > > enforce people to discover the new batching APIs for efficiency in > > this > > > > > KIP. But if you feel that this compatibility is very crucial to > > maintain > > > > I > > > > > could be convinced. > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax < > > matthias@confluent.io> > > > > > wrote: > > > > > > > > > >> Thanks for the KIP Guozhang. > > > > >> > > > > >>> Another reason is that other functions of KafkaConsumers do not > > have > > > > >> those > > > > >>> overloaded functions to be consistent > > > > >> > > > > >> I tend to agree with Jason about keeping the existing methods. > Your > > > > >> argument does not seem to hold. I just checked the `Consumer` API, > > and > > > > >> it's mix of overloads: > > > > >> > > > > >> Methods only talking `Collections` > > > > >> > > > > >> > > > > >> > > > > > > > subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets > > > > >> > > > > >> Method with overload taking `Collections` or as single value: > > > > >> > > > > >> seek/seekToBeginning/seekToEnd > > > > >> > > > > >> (those are strictly different methods, but they are semantically > > > > related) > > > > >> > > > > >> Only talking single value: > > > > >> > > > > >> position/committed/partitionsFor > > > > >> > > > > >> > > > > >> While you are strictly speaking correct, that there is no method > > with an > > > > >> overload for `Collection` and single value, the API mix seems to > > suggest > > > > >> that it might actually be worth to have corresponding overloads > for > > all > > > > >> methods instead of sticking to `Collections` only. > > > > >> > > > > >> > > > > >> > > > > >> About the return map: I agree that not containing any entry in the > > map > > > > >> is better. It's not only consistent with other APIs but it also > > avoids > > > > >> potential NPEs. > > > > >> > > > > >> > > > > >> > > > > >> -Matthias > > > > >> > > > > >> > > > > >> On 9/10/19 10:04 AM, Jason Gustafson wrote: > > > > >>>> I feel it not worth making committed to have both plurals and > > > > >> singulars. > > > > >>> > > > > >>> Not sure I agree. If we had started with these new APIs from the > > > > >> beginning, > > > > >>> that may have been better, but we already have exposed the > singular > > > > APIs > > > > >>> and users are depending on them. Not sure it's worth breaking > > > > >> compatibility > > > > >>> just for aesthetics. > > > > >>> > > > > >>> -Jason > > > > >>> > > > > >>> On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang < > wangguoz@gmail.com> > > > > >> wrote: > > > > >>> > > > > >>>> Thanks Jason! > > > > >>>> > > > > >>>> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson < > > jason@confluent.io> > > > > >>>> wrote: > > > > >>>> > > > > >>>>> Hi Guozhang, > > > > >>>>> > > > > >>>>> I think the motivation for the new API makes sense. I've wanted > > > > >> something > > > > >>>>> like this in the past. That said, do you think there is a > > substantial > > > > >>>>> benefit from deprecating the old API? I can still see it being > > > > >> convenient > > > > >>>>> in some cases and it's no real cost to maintain. > > > > >>>>> > > > > >>>>> > > > > >>>> That's a good question. > > > > >>>> > > > > >>>> Personally I would like to keep a succinct set of APIs out of > the > > box > > > > >> and > > > > >>>> let users who want more syntax sugars to add themselves as > > extended > > > > >> classes > > > > >>>> for example (KafkaConsumer is not a final class). > > > > >>>> Another reason is that other functions of KafkaConsumers do not > > have > > > > >> those > > > > >>>> overloaded functions to be consistent, e.g. we do not have a > > > > >>>> subscribe(single-topic), pause/resume(single-topic-partition) or > > > > >>>> seekToBeginning(single-topic-partition). I feel it not worth > > making > > > > >>>> committed to have both plurals and singulars. > > > > >>>> > > > > >>>> WDYT? > > > > >>>> > > > > >>>> > > > > >>>>> Also, just a minor detail. If the partition has no committed > > offset, > > > > >> will > > > > >>>>> it be present in the map with a null value? > > > > >>>>> > > > > >>>>> I looked into the admin client's listConsumerGroupOffsets call > > when > > > > >>>> creating the KIP, and to be consistent with that API my > intention > > is > > > > to > > > > >> NOT > > > > >>>> include the entry if a topic-partition does not have committed > > > > offsets. > > > > >>>> That said, if we feel returning an entry with null value is > > better for > > > > >>>> programmability I can also do that (and will update wiki page to > > > > >> clarify as > > > > >>>> well). LMK. > > > > >>>> > > > > >>>> > > > > >>>>> -Jason > > > > >>>>> > > > > >>>>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison < > > > > >> mickael.maison@gmail.com > > > > >>>>> > > > > >>>>> wrote: > > > > >>>>> > > > > >>>>>> +1 (non-binding), thanks Guozhang > > > > >>>>>> > > > > >>>>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen < > > > > >>>> reluctanthero104@gmail.com> > > > > >>>>>> wrote: > > > > >>>>>>> > > > > >>>>>>> Hey Guozhang, > > > > >>>>>>> > > > > >>>>>>> LGTM, +1 (non-binding) > > > > >>>>>>> > > > > >>>>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang < > > wangguoz@gmail.com> > > > > >>>>> wrote: > > > > >>>>>>> > > > > >>>>>>>> Hello folks, > > > > >>>>>>>> > > > > >>>>>>>> I've created a new KIP allowing consumer.committed to take a > > set > > > > of > > > > >>>>>>>> partitions instead of just one partition to allow batching > > effects > > > > >>>> of > > > > >>>>>> such > > > > >>>>>>>> requests (the protocol already allows us to send multiple > > > > >>>> partitions > > > > >>>>>> in one > > > > >>>>>>>> round-trip): > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>> > > > > >>>>> > > > > >>>> > > > > >> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions > > > > >>>>>>>> > > > > >>>>>>>> Since it is a pretty straight-forward KIP, I'm starting the > > VOTE > > > > >>>> for > > > > >>>>>> this > > > > >>>>>>>> KIP directly. If there are any suggestions about this > > proposal, > > > > >>>>> please > > > > >>>>>> feel > > > > >>>>>>>> free to share them in this thread. Thank you! > > > > >>>>>>>> > > > > >>>>>>>> > > > > >>>>>>>> -- Guozhang > > > > >>>>>>>> > > > > >>>>>> > > > > >>>>> > > > > >>>> > > > > >>>> > > > > >>>> -- > > > > >>>> -- Guozhang > > > > >>>> > > > > >>> > > > > >> > > > > >> > > > > > > > > > > > > > > > > --00000000000098c9350592576a5d--