From dev-return-107385-archive-asf-public=cust-asf.ponee.io@kafka.apache.org Wed Sep 11 07:17:34 2019 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [207.244.88.153]) by mx-eu-01.ponee.io (Postfix) with SMTP id 4971E18063F for ; Wed, 11 Sep 2019 09:17:34 +0200 (CEST) Received: (qmail 1598 invoked by uid 500); 11 Sep 2019 07:17:32 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 1586 invoked by uid 99); 11 Sep 2019 07:17:32 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd2-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Wed, 11 Sep 2019 07:17:32 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd2-us-west.apache.org (ASF Mail Server at spamd2-us-west.apache.org) with ESMTP id 013851A333A for ; Wed, 11 Sep 2019 07:17:32 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd2-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.8 X-Spam-Level: ** X-Spam-Status: No, score=2.8 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, FREEMAIL_REPLY=1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H2=-0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=disabled Authentication-Results: spamd2-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-ec2-va.apache.org ([10.40.0.8]) by localhost (spamd2-us-west.apache.org [10.40.0.9]) (amavisd-new, port 10024) with ESMTP id dtkgjdnl0CE9 for ; Wed, 11 Sep 2019 07:17:29 +0000 (UTC) Received-SPF: Pass (mailfrom) identity=mailfrom; client-ip=209.85.221.44; helo=mail-wr1-f44.google.com; envelope-from=kamal.chandraprakash@gmail.com; receiver= Received: from mail-wr1-f44.google.com (mail-wr1-f44.google.com [209.85.221.44]) by mx1-ec2-va.apache.org (ASF Mail Server at mx1-ec2-va.apache.org) with ESMTPS id E0CFBBC626 for ; Wed, 11 Sep 2019 07:17:28 +0000 (UTC) Received: by mail-wr1-f44.google.com with SMTP id u16so23269694wrr.0 for ; Wed, 11 Sep 2019 00:17:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=v0oVi+HJZnLefUXyJsBwdK9I89EuUA/b4y4+aEE+Pes=; b=ppdXtsxCXYLA+rQpA+CH13NbFFNOmyuUKKqobzKelw2PhIVpseLDkDx62lVmbjIiM4 CariJhB4yizuURt+jDhS4CpqKZjJtMGIwkYjtjHcYmKGq8HW1vbaIc4HJv56V7aSdsJw X/frgjM8jd/l8Jisory3+0FQMgFg2qSV3o+cpZKuZU+CxtTLO/n50yzXPdqR8ctS2vyz fxMyiHw+US+pSDO+OTIpz4e5RK9KTbMLNsP/PTIkzXU9NttRKWG5aONjAaawHdAeM821 v5wSfS2Tym5bL8yM0Ur1CuPDKQ3JJgZb4t5pjm0SE/Qj3647/fCMJiEWP7O1Y9COgHsI /6EA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=v0oVi+HJZnLefUXyJsBwdK9I89EuUA/b4y4+aEE+Pes=; b=Xidu3VLvuTOkmPVjxbrkWIt8KCGeLSW5b2sZ83/P2+KsdGBhJrJIsSOW1LOSgqwpR4 zgFEllAgFfAB1viNV87VTwWRYOlfuWrykOJAlaZjbpv1IsO7+g9ePTmCeIFmsMy/vPM7 zA3J9mU+j88Cs0fIvKyHK1ZvF16x4H8YXFU7ztp+GqBH5oM7GCnJoNVs3RIN87ehukOQ HPmFcxSubw/g0S3zWWRC51vZMh01BVRnz1S8GW3r6IbGGgijsY/wrpF0F7SjJ7ExqYfA utNSjP9df9LLPuO+ATC8JpW4ixFom8lF/6U1HX1M0qQaoHf14fG6gqS+sJNdQ1lsneeN NRUg== X-Gm-Message-State: APjAAAUPm659MyCJv2U4iTuT5GoCpLFKNutwWL+7S1Qf/DsNAKwn6mvo Npqr1id2rD1pXNKXJRVo1y9iYPSQemPug7EPGfJHSw== X-Google-Smtp-Source: APXvYqz/EZ5aTo4koI/RWNRi3vSYMSIKA9iTWWp7OEu+jXwbmBRmKnJQ0AESaJ8nhP48V66adZDa/n+PHomA76Snw2g= X-Received: by 2002:adf:e388:: with SMTP id e8mr920000wrm.306.1568186247719; Wed, 11 Sep 2019 00:17:27 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Kamal Chandraprakash Date: Wed, 11 Sep 2019 12:46:51 +0530 Message-ID: Subject: Re: [VOTE] KIP-520: Augment Consumer.committed(partition) to allow multiple partitions To: dev@kafka.apache.org Content-Type: multipart/alternative; boundary="0000000000003cadfb059241d03d" --0000000000003cadfb059241d03d Content-Type: text/plain; charset="UTF-8" Thanks for the KIP! LGTM, +1 (non-binding). On Wed, Sep 11, 2019 at 3:23 AM Matthias J. Sax wrote: > I don't have a strong preference. So I am also fine to deprecate the > existing methods. Let's see what Jason thinks. > > Can you update the KIP to reflect the semantics of the return `Map` (ie, > does only contain entries for partitions with committed offsets, and > does not contain `null` values)? > > > +1 (binding) > > -Matthias > > > > > On 9/10/19 11:53 AM, Guozhang Wang wrote: > > Hi Jason / Matthias, > > > > I understand your concerns now. Just to clarify my main motivation on > > deprecating the old APIs is not only for aesthetics (confess I personally > > do have preference on succinct APIs), but to urge people to use the > batched > > API for better latency when possible --- as stated in the KIP, my > > observation is that in practice most callers are actually going to get > > committed offsets for more than one partitions, and without deprecating > the > > old APIs it would be hard for them to realize that the new API does have > a > > benefit in performance. > > > > This is different from some of the existing APIs though -- e.g., Matthias > > mentioned about seek / seekToBeginning / seekToEnd, where only seekToXX > > have plurals and seek only have singulars. We can, of course, make > seekToXX > > with plurals as well just like commitSync/Async, but since seeks are > > non-blocking APIs (they rely on the follow-up polling call to talk to the > > brokers) either calling it multiple times with one partition each v.s. > > calling it one time with a plural makes little difference (still, I'd > argue > > that today we do not have a same-named function overloaded with both > types > > :P) On the other hand, committed, commitSync, offsetsForTimes etc > blocking > > calls are all in the form of plurals except > > > > * committed > > * position > > * partitionsFor > > > > My rationale was that 1) for consecutive calls of #position, mostly it > > would only require a single round-trip to brokers since we are trying to > > refresh fetching positions for all partitions anyways, and 2) for > > #partitionsFor, theoretically we could also consider to ask for multiple > > topics in one call since each blocking call potentially incurs one round > > trip, but I did not include it in the scope of this KIP only because I > have > > not observed too many usage patterns that are commonly calling it > > consecutively for multiple topics. At the moment, what I truly want to > > "improve" on is the committed calls, as in many cases I've seen it being > > called consecutively for multiple topic-partitions. > > > > Therefore, I'm still more inclined to deprecate the old APIs so that we > can > > enforce people to discover the new batching APIs for efficiency in this > > KIP. But if you feel that this compatibility is very crucial to maintain > I > > could be convinced. > > > > > > Guozhang > > > > On Tue, Sep 10, 2019 at 10:18 AM Matthias J. Sax > > wrote: > > > >> Thanks for the KIP Guozhang. > >> > >>> Another reason is that other functions of KafkaConsumers do not have > >> those > >>> overloaded functions to be consistent > >> > >> I tend to agree with Jason about keeping the existing methods. Your > >> argument does not seem to hold. I just checked the `Consumer` API, and > >> it's mix of overloads: > >> > >> Methods only talking `Collections` > >> > >> > >> > subscribe/assign/commitSync/commitAsyn/pause/resume/offsetsForTimes/beginningOffsets/endOffsets > >> > >> Method with overload taking `Collections` or as single value: > >> > >> seek/seekToBeginning/seekToEnd > >> > >> (those are strictly different methods, but they are semantically > related) > >> > >> Only talking single value: > >> > >> position/committed/partitionsFor > >> > >> > >> While you are strictly speaking correct, that there is no method with an > >> overload for `Collection` and single value, the API mix seems to suggest > >> that it might actually be worth to have corresponding overloads for all > >> methods instead of sticking to `Collections` only. > >> > >> > >> > >> About the return map: I agree that not containing any entry in the map > >> is better. It's not only consistent with other APIs but it also avoids > >> potential NPEs. > >> > >> > >> > >> -Matthias > >> > >> > >> On 9/10/19 10:04 AM, Jason Gustafson wrote: > >>>> I feel it not worth making committed to have both plurals and > >> singulars. > >>> > >>> Not sure I agree. If we had started with these new APIs from the > >> beginning, > >>> that may have been better, but we already have exposed the singular > APIs > >>> and users are depending on them. Not sure it's worth breaking > >> compatibility > >>> just for aesthetics. > >>> > >>> -Jason > >>> > >>> On Tue, Sep 10, 2019 at 9:41 AM Guozhang Wang > >> wrote: > >>> > >>>> Thanks Jason! > >>>> > >>>> On Tue, Sep 10, 2019 at 9:07 AM Jason Gustafson > >>>> wrote: > >>>> > >>>>> Hi Guozhang, > >>>>> > >>>>> I think the motivation for the new API makes sense. I've wanted > >> something > >>>>> like this in the past. That said, do you think there is a substantial > >>>>> benefit from deprecating the old API? I can still see it being > >> convenient > >>>>> in some cases and it's no real cost to maintain. > >>>>> > >>>>> > >>>> That's a good question. > >>>> > >>>> Personally I would like to keep a succinct set of APIs out of the box > >> and > >>>> let users who want more syntax sugars to add themselves as extended > >> classes > >>>> for example (KafkaConsumer is not a final class). > >>>> Another reason is that other functions of KafkaConsumers do not have > >> those > >>>> overloaded functions to be consistent, e.g. we do not have a > >>>> subscribe(single-topic), pause/resume(single-topic-partition) or > >>>> seekToBeginning(single-topic-partition). I feel it not worth making > >>>> committed to have both plurals and singulars. > >>>> > >>>> WDYT? > >>>> > >>>> > >>>>> Also, just a minor detail. If the partition has no committed offset, > >> will > >>>>> it be present in the map with a null value? > >>>>> > >>>>> I looked into the admin client's listConsumerGroupOffsets call when > >>>> creating the KIP, and to be consistent with that API my intention is > to > >> NOT > >>>> include the entry if a topic-partition does not have committed > offsets. > >>>> That said, if we feel returning an entry with null value is better for > >>>> programmability I can also do that (and will update wiki page to > >> clarify as > >>>> well). LMK. > >>>> > >>>> > >>>>> -Jason > >>>>> > >>>>> On Tue, Sep 10, 2019 at 6:09 AM Mickael Maison < > >> mickael.maison@gmail.com > >>>>> > >>>>> wrote: > >>>>> > >>>>>> +1 (non-binding), thanks Guozhang > >>>>>> > >>>>>> On Tue, Sep 10, 2019 at 1:14 AM Boyang Chen < > >>>> reluctanthero104@gmail.com> > >>>>>> wrote: > >>>>>>> > >>>>>>> Hey Guozhang, > >>>>>>> > >>>>>>> LGTM, +1 (non-binding) > >>>>>>> > >>>>>>> On Mon, Sep 9, 2019 at 5:07 PM Guozhang Wang > >>>>> wrote: > >>>>>>> > >>>>>>>> Hello folks, > >>>>>>>> > >>>>>>>> I've created a new KIP allowing consumer.committed to take a set > of > >>>>>>>> partitions instead of just one partition to allow batching effects > >>>> of > >>>>>> such > >>>>>>>> requests (the protocol already allows us to send multiple > >>>> partitions > >>>>>> in one > >>>>>>>> round-trip): > >>>>>>>> > >>>>>>>> > >>>>>>>> > >>>>>> > >>>>> > >>>> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-520%3A+Add+overloaded+Consumer%23committed+for+batching+partitions > >>>>>>>> > >>>>>>>> Since it is a pretty straight-forward KIP, I'm starting the VOTE > >>>> for > >>>>>> this > >>>>>>>> KIP directly. If there are any suggestions about this proposal, > >>>>> please > >>>>>> feel > >>>>>>>> free to share them in this thread. Thank you! > >>>>>>>> > >>>>>>>> > >>>>>>>> -- Guozhang > >>>>>>>> > >>>>>> > >>>>> > >>>> > >>>> > >>>> -- > >>>> -- Guozhang > >>>> > >>> > >> > >> > > > > --0000000000003cadfb059241d03d--