From: Lucas Wang
Date: Mon, 30 Jul 2018 12:39:26 -0700
Subject: Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests
To: dev@kafka.apache.org

Thanks for your review, Dong.

Ack that these configs will have a bigger impact for users.

On the other hand, I would argue that the request queue becoming full
may or may not be a rare scenario. How often the request queue gets full
depends on the request incoming rate, the request processing rate, and
the size of the request queue. When that happens, the dedicated-endpoints
design can handle it better than any of the previously discussed options.

Another reason I made the change is that, like Becket, I feel it gives a
cleaner separation of the control plane from the data plane.

Finally, I want to clarify that this change is NOT motivated by the
out-of-order processing discussion. That problem is orthogonal to this
KIP, and it can happen in any of the design options we have discussed so
far. So I'd like to address out-of-order processing separately in another
thread, and avoid mentioning it in this KIP.

Thanks,
Lucas

On Fri, Jul 27, 2018 at 7:51 PM, Dong Lin wrote:

> Hey Lucas,
>
> Thanks for the update.
>
> The current KIP proposes new broker configs "listeners.for.controller"
> and "advertised.listeners.for.controller". This is going to be a big
> change, since listeners are among the most important configs that every
> user needs to change. According to the rejected alternatives section, it
> seems that the reason to add these two configs is to improve performance
> when the data request queue is full, rather than for correctness. That
> should be a very rare scenario, and I am not sure we should add configs
> for all users just to improve performance in such a rare scenario.
>
> Also, if the new design is based on the issues discovered in the recent
> discussion, e.g. out-of-order processing if we don't use a dedicated
> thread for controller requests, it may be useful to explain the problem
> in the motivation section.
>
> Thanks,
> Dong
>
> On Fri, Jul 27, 2018 at 1:28 PM, Lucas Wang wrote:
>
> > A kind reminder for review of this KIP.
> >
> > Thank you very much!
> > Lucas
> >
> > On Wed, Jul 25, 2018 at 10:23 PM, Lucas Wang wrote:
> >
> > > Hi All,
> > >
> > > I've updated the KIP by adding the dedicated endpoints for controller
> > > connections, and pinning threads for controller requests.
> > > I've also updated the title of this KIP. Please take a look and let
> > > me know your feedback.
> > >
> > > Thanks a lot for your time!
> > > Lucas
> > >
> > > On Tue, Jul 24, 2018 at 10:19 AM, Mayuresh Gharat <
> > > gharatmayuresh15@gmail.com> wrote:
> > >
> > >> Hi Lucas,
> > >> I agree: if we want to go forward with a separate controller plane
> > >> and data plane and completely isolate them, having a separate port
> > >> for the controller with a separate Acceptor and a Processor sounds
> > >> ideal to me.
> > >>
> > >> Thanks,
> > >>
> > >> Mayuresh
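As a purely illustrative aside, the two configs named above might look
roughly like the following in a broker's server.properties. Only the
property names come from this thread; the listener name, ports, and the
listener://host:port value syntax are assumptions modeled on the existing
listeners / advertised.listeners settings, not anything the discussion
specifies.

    # Existing data-plane listeners (unchanged)
    listeners=PLAINTEXT://broker1.example.com:9092
    advertised.listeners=PLAINTEXT://broker1.example.com:9092

    # Hypothetical controller-only endpoint per the KIP discussion above;
    # the value syntax shown here is an assumption, not defined in this thread
    listeners.for.controller=CONTROLLER://broker1.example.com:9091
    advertised.listeners.for.controller=CONTROLLER://broker1.example.com:9091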
> > >>
> > >> On Mon, Jul 23, 2018 at 11:04 PM Becket Qin wrote:
> > >>
> > >> > Hi Lucas,
> > >> >
> > >> > Yes, I agree that a dedicated end-to-end control flow would be ideal.
> > >> >
> > >> > Thanks,
> > >> >
> > >> > Jiangjie (Becket) Qin
> > >> >
> > >> > On Tue, Jul 24, 2018 at 1:05 PM, Lucas Wang wrote:
> > >> >
> > >> > > Thanks for the comment, Becket.
> > >> > > So far, we've been trying to avoid making any request handler
> > >> > > thread special. But if we were to follow that path in order to
> > >> > > make the two planes more isolated, what do you think about also
> > >> > > having a dedicated processor thread, and a dedicated port for the
> > >> > > controller?
> > >> > >
> > >> > > Today one processor thread can handle multiple connections, let's
> > >> > > say 100 connections represented by connection0, ... connection99,
> > >> > > among which connection0-98 are from clients, while connection99 is
> > >> > > from the controller. Further, let's say that after one selector
> > >> > > poll there are incoming requests on all connections.
> > >> > >
> > >> > > When the request queue is full (either the data request queue
> > >> > > being full in the two-queue design, or the single queue being full
> > >> > > in the deque design), the processor thread will be blocked first
> > >> > > when trying to enqueue the data request from connection0, then
> > >> > > possibly blocked for the data request from connection1, etc., even
> > >> > > though the controller request is ready to be enqueued.
> > >> > >
> > >> > > To solve this problem, it seems we would need a separate port
> > >> > > dedicated to the controller, a dedicated processor thread, a
> > >> > > dedicated controller request queue, and pinning of one request
> > >> > > handler thread for controller requests.
> > >> > >
> > >> > > Thanks,
> > >> > > Lucas
> > >> > >
> > >> > > On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin wrote:
> > >> > >
> > >> > > > Personally I am not fond of the deque approach, simply because
> > >> > > > it is against the basic idea of isolating the controller plane
> > >> > > > and data plane. With a single deque, theoretically speaking, the
> > >> > > > controller requests can starve the client requests. I would
> > >> > > > prefer the approach with a separate controller request queue and
> > >> > > > a dedicated controller request handler thread.
> > >> > > >
> > >> > > > Thanks,
> > >> > > >
> > >> > > > Jiangjie (Becket) Qin
> > >> > > >
> > >> > > > On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang <
> > >> > > > lucasatucla@gmail.com> wrote:
> > >> > > >
> > >> > > > > Sure, I can summarize the usage of correlation id. But before
> > >> > > > > I do that, it seems the same out-of-order processing can also
> > >> > > > > happen to Produce requests sent by producers, following the
> > >> > > > > same example you described earlier.
> > >> > > > > If that's the case, I think this probably deserves a separate
> > >> > > > > doc and design independent of this KIP.
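To make the head-of-line blocking scenario Lucas describes above (one
processor thread, 100 connections, a full bounded request queue) concrete,
here is a toy sketch. It is not Kafka's SocketServer code; the class name,
queue capacity, and the 10 ms per-request handling time are all invented
for illustration.

    import java.util.concurrent.ArrayBlockingQueue;
    import java.util.concurrent.BlockingQueue;

    // Toy model: one processor thread must enqueue the requests from
    // connection0..connection98 before it can enqueue the controller
    // request from connection99, and it blocks on the bounded request
    // queue whenever the queue is full.
    public class ProcessorBlockingSketch {

        public static void main(String[] args) throws InterruptedException {
            BlockingQueue<String> requestQueue = new ArrayBlockingQueue<>(10);

            // Request handler thread draining the queue slowly, e.g. because
            // of a slow disk or a produce backlog (10 ms per request, assumed).
            Thread handler = new Thread(() -> {
                try {
                    while (!requestQueue.take().equals("controller-request")) {
                        Thread.sleep(10);
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            handler.start();

            long start = System.currentTimeMillis();
            // One selector poll returned requests on all 100 connections.
            for (int i = 0; i < 99; i++) {
                requestQueue.put("data-request-from-connection" + i); // blocks when full
            }
            requestQueue.put("controller-request"); // enqueued only after all data requests
            System.out.printf("controller request enqueued after %d ms%n",
                    System.currentTimeMillis() - start);
            handler.join();
        }
    }

Running this prints an enqueue delay of roughly a second for the controller
request, which is the effect the dedicated port, processor, and queue are
meant to avoid.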
> > >> > > > > > > >> > > > > Lucas > > >> > > > > > > >> > > > > > > >> > > > > > > >> > > > > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin < > lindong28@gmail.com > > > > > >> > > wrote: > > >> > > > > > > >> > > > > > Hey Lucas, > > >> > > > > > > > >> > > > > > Could you update the KIP if you are confident with the > > approach > > >> > which > > >> > > > > uses > > >> > > > > > correlation id? The idea around correlation id is kind of > > >> scattered > > >> > > > > across > > >> > > > > > multiple emails. It will be useful if other reviews can read > > the > > >> > KIP > > >> > > to > > >> > > > > > understand the latest proposal. > > >> > > > > > > > >> > > > > > Thanks, > > >> > > > > > Dong > > >> > > > > > > > >> > > > > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat < > > >> > > > > > gharatmayuresh15@gmail.com> wrote: > > >> > > > > > > > >> > > > > > > I like the idea of the dequeue implementation by Lucas. > This > > >> will > > >> > > > help > > >> > > > > us > > >> > > > > > > avoid additional queue for controller and additional > configs > > >> in > > >> > > > Kafka. > > >> > > > > > > > > >> > > > > > > Thanks, > > >> > > > > > > > > >> > > > > > > Mayuresh > > >> > > > > > > > > >> > > > > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin < > > >> becket.qin@gmail.com > > >> > > > > >> > > > > wrote: > > >> > > > > > > > > >> > > > > > > > Hi Jun, > > >> > > > > > > > > > >> > > > > > > > The usage of correlation ID might still be useful to > > address > > >> > the > > >> > > > > cases > > >> > > > > > > > that the controller epoch and leader epoch check are not > > >> > > sufficient > > >> > > > > to > > >> > > > > > > > guarantee correct behavior. For example, if the > controller > > >> > sends > > >> > > a > > >> > > > > > > > LeaderAndIsrRequest followed by a StopReplicaRequest, > and > > >> the > > >> > > > broker > > >> > > > > > > > processes it in the reverse order, the replica may still > > be > > >> > > wrongly > > >> > > > > > > > recreated, right? > > >> > > > > > > > > > >> > > > > > > > Thanks, > > >> > > > > > > > > > >> > > > > > > > Jiangjie (Becket) Qin > > >> > > > > > > > > > >> > > > > > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao < > jun@confluent.io > > > > > >> > > wrote: > > >> > > > > > > > > > > >> > > > > > > > > Hmm, since we already use controller epoch and leader > > >> epoch > > >> > for > > >> > > > > > > properly > > >> > > > > > > > > caching the latest partition state, do we really need > > >> > > correlation > > >> > > > > id > > >> > > > > > > for > > >> > > > > > > > > ordering the controller requests? > > >> > > > > > > > > > > >> > > > > > > > > Thanks, > > >> > > > > > > > > > > >> > > > > > > > > Jun > > >> > > > > > > > > > > >> > > > > > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin < > > >> > > > becket.qin@gmail.com> > > >> > > > > > > > wrote: > > >> > > > > > > > > > > >> > > > > > > > >> Lucas and Mayuresh, > > >> > > > > > > > >> > > >> > > > > > > > >> Good idea. The correlation id should work. > > >> > > > > > > > >> > > >> > > > > > > > >> In the ControllerChannelManager, a request will be > > resent > > >> > > until > > >> > > > a > > >> > > > > > > > response > > >> > > > > > > > >> is received. So if the controller to broker > connection > > >> > > > disconnects > > >> > > > > > > after > > >> > > > > > > > >> controller sends R1_a, but before the response of > R1_a > > is > > >> > > > > received, > > >> > > > > > a > > >> > > > > > > > >> disconnection may cause the controller to resend > R1_b. > > >> i.e. 
> > >> > > > until > > >> > > > > R1 > > >> > > > > > > is > > >> > > > > > > > >> acked, R2 won't be sent by the controller. > > >> > > > > > > > >> This gives two guarantees: > > >> > > > > > > > >> 1. Correlation id wise: R1_a < R1_b < R2. > > >> > > > > > > > >> 2. On the broker side, when R2 is seen, R1 must have > > been > > >> > > > > processed > > >> > > > > > at > > >> > > > > > > > >> least once. > > >> > > > > > > > >> > > >> > > > > > > > >> So on the broker side, with a single thread > controller > > >> > request > > >> > > > > > > handler, > > >> > > > > > > > the > > >> > > > > > > > >> logic should be: > > >> > > > > > > > >> 1. Process what ever request seen in the controller > > >> request > > >> > > > queue > > >> > > > > > > > >> 2. For the given epoch, drop request if its > correlation > > >> id > > >> > is > > >> > > > > > smaller > > >> > > > > > > > than > > >> > > > > > > > >> that of the last processed request. > > >> > > > > > > > >> > > >> > > > > > > > >> Thanks, > > >> > > > > > > > >> > > >> > > > > > > > >> Jiangjie (Becket) Qin > > >> > > > > > > > >> > > >> > > > > > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao < > > >> jun@confluent.io> > > >> > > > > wrote: > > >> > > > > > > > >> > > >> > > > > > > > >>> I agree that there is no strong ordering when there > > are > > >> > more > > >> > > > than > > >> > > > > > one > > >> > > > > > > > >>> socket connections. Currently, we rely on > > >> controllerEpoch > > >> > and > > >> > > > > > > > leaderEpoch > > >> > > > > > > > >>> to ensure that the receiving broker picks up the > > latest > > >> > state > > >> > > > for > > >> > > > > > > each > > >> > > > > > > > >>> partition. > > >> > > > > > > > >>> > > >> > > > > > > > >>> One potential issue with the dequeue approach is > that > > if > > >> > the > > >> > > > > queue > > >> > > > > > is > > >> > > > > > > > >> full, > > >> > > > > > > > >>> there is no guarantee that the controller requests > > will > > >> be > > >> > > > > enqueued > > >> > > > > > > > >>> quickly. > > >> > > > > > > > >>> > > >> > > > > > > > >>> Thanks, > > >> > > > > > > > >>> > > >> > > > > > > > >>> Jun > > >> > > > > > > > >>> > > >> > > > > > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat < > > >> > > > > > > > >>> gharatmayuresh15@gmail.com > > >> > > > > > > > >>>> wrote: > > >> > > > > > > > >>> > > >> > > > > > > > >>>> Yea, the correlationId is only set to 0 in the > > >> > NetworkClient > > >> > > > > > > > >> constructor. > > >> > > > > > > > >>>> Since we reuse the same NetworkClient between > > >> Controller > > >> > and > > >> > > > the > > >> > > > > > > > >> broker, > > >> > > > > > > > >>> a > > >> > > > > > > > >>>> disconnection should not cause it to reset to 0, in > > >> which > > >> > > case > > >> > > > > it > > >> > > > > > > can > > >> > > > > > > > >> be > > >> > > > > > > > >>>> used to reject obsolete requests. > > >> > > > > > > > >>>> > > >> > > > > > > > >>>> Thanks, > > >> > > > > > > > >>>> > > >> > > > > > > > >>>> Mayuresh > > >> > > > > > > > >>>> > > >> > > > > > > > >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang < > > >> > > > > lucasatucla@gmail.com > > >> > > > > > > > > >> > > > > > > > >>> wrote: > > >> > > > > > > > >>>> > > >> > > > > > > > >>>>> @Dong, > > >> > > > > > > > >>>>> Great example and explanation, thanks! 
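For concreteness, here is a minimal sketch of the broker-side check Becket
outlines a few messages up: process everything from the controller request
queue, but for a given controller epoch drop any request whose correlation
id is lower than the last one processed. This is only an illustration of
that rule, not Kafka code; the class and method names are invented, and it
assumes the single controller-request-handler thread Becket mentions, so no
synchronization is shown.

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative only -- not Kafka code. Tracks, per controller epoch, the
    // highest correlation id processed so far and rejects anything older.
    public class StaleControllerRequestFilter {

        private final Map<Integer, Integer> lastCorrelationIdByEpoch = new HashMap<>();

        /**
         * Returns true if the request should be processed, false if it is an
         * obsolete resend (e.g. R1_a arriving after R1_b has been processed).
         * Assumes a single controller request handler thread, so no locking.
         */
        public boolean shouldProcess(int controllerEpoch, int correlationId) {
            Integer last = lastCorrelationIdByEpoch.get(controllerEpoch);
            if (last != null && correlationId < last) {
                return false;
            }
            lastCorrelationIdByEpoch.put(controllerEpoch, correlationId);
            return true;
        }
    }

Because the correlation id within one NetworkClient is monotonically
increasing and not reset on reconnect (as Mayuresh notes above), a resent
R1_b carries a higher id than R1_a, so it is the stale R1_a that gets
dropped if it shows up late.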
> > >> > > > > > > > >>>>> > > >> > > > > > > > >>>>> @All > > >> > > > > > > > >>>>> Regarding the example given by Dong, it seems even > > if > > >> we > > >> > > use > > >> > > > a > > >> > > > > > > queue, > > >> > > > > > > > >>>> and a > > >> > > > > > > > >>>>> dedicated controller request handling thread, > > >> > > > > > > > >>>>> the same result can still happen because R1_a will > > be > > >> > sent > > >> > > on > > >> > > > > one > > >> > > > > > > > >>>>> connection, and R1_b & R2 will be sent on a > > different > > >> > > > > connection, > > >> > > > > > > > >>>>> and there is no ordering between different > > >> connections on > > >> > > the > > >> > > > > > > broker > > >> > > > > > > > >>>> side. > > >> > > > > > > > >>>>> I was discussing with Mayuresh offline, and it > seems > > >> > > > > correlation > > >> > > > > > id > > >> > > > > > > > >>>> within > > >> > > > > > > > >>>>> the same NetworkClient object is monotonically > > >> increasing > > >> > > and > > >> > > > > > never > > >> > > > > > > > >>>> reset, > > >> > > > > > > > >>>>> hence a broker can leverage that to properly > reject > > >> > > obsolete > > >> > > > > > > > >> requests. > > >> > > > > > > > >>>>> Thoughts? > > >> > > > > > > > >>>>> > > >> > > > > > > > >>>>> Thanks, > > >> > > > > > > > >>>>> Lucas > > >> > > > > > > > >>>>> > > >> > > > > > > > >>>>> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat > < > > >> > > > > > > > >>>>> gharatmayuresh15@gmail.com> wrote: > > >> > > > > > > > >>>>> > > >> > > > > > > > >>>>>> Actually nvm, correlationId is reset in case of > > >> > connection > > >> > > > > > loss, I > > >> > > > > > > > >>>> think. > > >> > > > > > > > >>>>>> > > >> > > > > > > > >>>>>> Thanks, > > >> > > > > > > > >>>>>> > > >> > > > > > > > >>>>>> Mayuresh > > >> > > > > > > > >>>>>> > > >> > > > > > > > >>>>>> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat > < > > >> > > > > > > > >>>>>> gharatmayuresh15@gmail.com> > > >> > > > > > > > >>>>>> wrote: > > >> > > > > > > > >>>>>> > > >> > > > > > > > >>>>>>> I agree with Dong that out-of-order processing > can > > >> > happen > > >> > > > > with > > >> > > > > > > > >>>> having 2 > > >> > > > > > > > >>>>>>> separate queues as well and it can even happen > > >> today. > > >> > > > > > > > >>>>>>> Can we use the correlationId in the request from > > the > > >> > > > > controller > > >> > > > > > > > >> to > > >> > > > > > > > >>>> the > > >> > > > > > > > >>>>>>> broker to handle ordering ? > > >> > > > > > > > >>>>>>> > > >> > > > > > > > >>>>>>> Thanks, > > >> > > > > > > > >>>>>>> > > >> > > > > > > > >>>>>>> Mayuresh > > >> > > > > > > > >>>>>>> > > >> > > > > > > > >>>>>>> > > >> > > > > > > > >>>>>>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin < > > >> > > > > > becket.qin@gmail.com > > >> > > > > > > > >>> > > >> > > > > > > > >>>>> wrote: > > >> > > > > > > > >>>>>>> > > >> > > > > > > > >>>>>>>> Good point, Joel. I agree that a dedicated > > >> controller > > >> > > > > request > > >> > > > > > > > >>>> handling > > >> > > > > > > > >>>>>>>> thread would be a better isolation. It also > > solves > > >> the > > >> > > > > > > > >> reordering > > >> > > > > > > > >>>>> issue. > > >> > > > > > > > >>>>>>>> > > >> > > > > > > > >>>>>>>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy < > > >> > > > > > > > >> jjkoshy.w@gmail.com> > > >> > > > > > > > >>>>>> wrote: > > >> > > > > > > > >>>>>>>> > > >> > > > > > > > >>>>>>>>> Good example. 
I think this scenario can occur > in > > >> the > > >> > > > > current > > >> > > > > > > > >>> code > > >> > > > > > > > >>>> as > > >> > > > > > > > >>>>>>>> well > > >> > > > > > > > >>>>>>>>> but with even lower probability given that > there > > >> are > > >> > > > other > > >> > > > > > > > >>>>>>>> non-controller > > >> > > > > > > > >>>>>>>>> requests interleaved. It is still sketchy > though > > >> and > > >> > I > > >> > > > > think > > >> > > > > > a > > >> > > > > > > > >>>> safer > > >> > > > > > > > >>>>>>>>> approach would be separate queues and pinning > > >> > > controller > > >> > > > > > > > >> request > > >> > > > > > > > >>>>>>>> handling > > >> > > > > > > > >>>>>>>>> to one handler thread. > > >> > > > > > > > >>>>>>>>> > > >> > > > > > > > >>>>>>>>> On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin < > > >> > > > > > > > >> lindong28@gmail.com > > >> > > > > > > > >>>> > > >> > > > > > > > >>>>>> wrote: > > >> > > > > > > > >>>>>>>>> > > >> > > > > > > > >>>>>>>>>> Hey Becket, > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> I think you are right that there may be > > >> out-of-order > > >> > > > > > > > >>> processing. > > >> > > > > > > > >>>>>>>> However, > > >> > > > > > > > >>>>>>>>>> it seems that out-of-order processing may > also > > >> > happen > > >> > > > even > > >> > > > > > > > >> if > > >> > > > > > > > >>> we > > >> > > > > > > > >>>>>> use a > > >> > > > > > > > >>>>>>>>>> separate queue. > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> Here is the example: > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> - Controller sends R1 and got disconnected > > before > > >> > > > > receiving > > >> > > > > > > > >>>>>> response. > > >> > > > > > > > >>>>>>>>> Then > > >> > > > > > > > >>>>>>>>>> it reconnects and sends R2. Both requests now > > >> stay > > >> > in > > >> > > > the > > >> > > > > > > > >>>>> controller > > >> > > > > > > > >>>>>>>>>> request queue in the order they are sent. > > >> > > > > > > > >>>>>>>>>> - thread1 takes R1_a from the request queue > and > > >> then > > >> > > > > thread2 > > >> > > > > > > > >>>> takes > > >> > > > > > > > >>>>>> R2 > > >> > > > > > > > >>>>>>>>> from > > >> > > > > > > > >>>>>>>>>> the request queue almost at the same time. > > >> > > > > > > > >>>>>>>>>> - So R1_a and R2 are processed in parallel. > > >> There is > > >> > > > > chance > > >> > > > > > > > >>> that > > >> > > > > > > > >>>>>> R2's > > >> > > > > > > > >>>>>>>>>> processing is completed before R1. > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> If out-of-order processing can happen for > both > > >> > > > approaches > > >> > > > > > > > >> with > > >> > > > > > > > >>>>> very > > >> > > > > > > > >>>>>>>> low > > >> > > > > > > > >>>>>>>>>> probability, it may not be worthwhile to add > > the > > >> > extra > > >> > > > > > > > >> queue. > > >> > > > > > > > >>>> What > > >> > > > > > > > >>>>>> do > > >> > > > > > > > >>>>>>>> you > > >> > > > > > > > >>>>>>>>>> think? 
> > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>> Dong > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>> On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin < > > >> > > > > > > > >>>> becket.qin@gmail.com > > >> > > > > > > > >>>>>> > > >> > > > > > > > >>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> Hi Mayuresh/Joel, > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> Using the request channel as a dequeue was > > >> bright > > >> > up > > >> > > > some > > >> > > > > > > > >>> time > > >> > > > > > > > >>>>> ago > > >> > > > > > > > >>>>>>>> when > > >> > > > > > > > >>>>>>>>>> we > > >> > > > > > > > >>>>>>>>>>> initially thinking of prioritizing the > > request. > > >> The > > >> > > > > > > > >> concern > > >> > > > > > > > >>>> was > > >> > > > > > > > >>>>>> that > > >> > > > > > > > >>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>> controller requests are supposed to be > > >> processed in > > >> > > > > order. > > >> > > > > > > > >>> If > > >> > > > > > > > >>>> we > > >> > > > > > > > >>>>>> can > > >> > > > > > > > >>>>>>>>>> ensure > > >> > > > > > > > >>>>>>>>>>> that there is one controller request in the > > >> request > > >> > > > > > > > >> channel, > > >> > > > > > > > >>>> the > > >> > > > > > > > >>>>>>>> order > > >> > > > > > > > >>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>> not a concern. But in cases that there are > > more > > >> > than > > >> > > > one > > >> > > > > > > > >>>>>> controller > > >> > > > > > > > >>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>> inserted into the queue, the controller > > request > > >> > order > > >> > > > may > > >> > > > > > > > >>>> change > > >> > > > > > > > >>>>>> and > > >> > > > > > > > >>>>>>>>>> cause > > >> > > > > > > > >>>>>>>>>>> problem. For example, think about the > > following > > >> > > > sequence: > > >> > > > > > > > >>>>>>>>>>> 1. Controller successfully sent a request R1 > > to > > >> > > broker > > >> > > > > > > > >>>>>>>>>>> 2. Broker receives R1 and put the request to > > the > > >> > head > > >> > > > of > > >> > > > > > > > >> the > > >> > > > > > > > >>>>>> request > > >> > > > > > > > >>>>>>>>>> queue. > > >> > > > > > > > >>>>>>>>>>> 3. Controller to broker connection failed > and > > >> the > > >> > > > > > > > >> controller > > >> > > > > > > > >>>>>>>>> reconnected > > >> > > > > > > > >>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>> the broker. > > >> > > > > > > > >>>>>>>>>>> 4. Controller sends a request R2 to the > broker > > >> > > > > > > > >>>>>>>>>>> 5. Broker receives R2 and add it to the head > > of > > >> the > > >> > > > > > > > >> request > > >> > > > > > > > >>>>> queue. > > >> > > > > > > > >>>>>>>>>>> Now on the broker side, R2 will be processed > > >> before > > >> > > R1 > > >> > > > is > > >> > > > > > > > >>>>>> processed, > > >> > > > > > > > >>>>>>>>>> which > > >> > > > > > > > >>>>>>>>>>> may cause problem. > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> Jiangjie (Becket) Qin > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>> On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy > < > > >> > > > > > > > >>>>> jjkoshy.w@gmail.com> > > >> > > > > > > > >>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>> @Mayuresh - I like your idea. 
It appears to > > be > > >> a > > >> > > > simpler > > >> > > > > > > > >>>> less > > >> > > > > > > > >>>>>>>>> invasive > > >> > > > > > > > >>>>>>>>>>>> alternative and it should work. > > >> Jun/Becket/others, > > >> > > do > > >> > > > > > > > >> you > > >> > > > > > > > >>>> see > > >> > > > > > > > >>>>>> any > > >> > > > > > > > >>>>>>>>>>> pitfalls > > >> > > > > > > > >>>>>>>>>>>> with this approach? > > >> > > > > > > > >>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>> On Wed, Jul 18, 2018 at 12:03 PM, Lucas > Wang > > < > > >> > > > > > > > >>>>>>>> lucasatucla@gmail.com> > > >> > > > > > > > >>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> @Mayuresh, > > >> > > > > > > > >>>>>>>>>>>>> That's a very interesting idea that I > > haven't > > >> > > thought > > >> > > > > > > > >>>>> before. > > >> > > > > > > > >>>>>>>>>>>>> It seems to solve our problem at hand > pretty > > >> > well, > > >> > > > and > > >> > > > > > > > >>>> also > > >> > > > > > > > >>>>>>>>>>>>> avoids the need to have a new size metric > > and > > >> > > > capacity > > >> > > > > > > > >>>>> config > > >> > > > > > > > >>>>>>>>>>>>> for the controller request queue. In fact, > > if > > >> we > > >> > > were > > >> > > > > > > > >> to > > >> > > > > > > > >>>>> adopt > > >> > > > > > > > >>>>>>>>>>>>> this design, there is no public interface > > >> change, > > >> > > and > > >> > > > > > > > >> we > > >> > > > > > > > >>>>>>>>>>>>> probably don't need a KIP. > > >> > > > > > > > >>>>>>>>>>>>> Also implementation wise, it seems > > >> > > > > > > > >>>>>>>>>>>>> the java class LinkedBlockingQueue can > > readily > > >> > > > satisfy > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>>>>>> requirement > > >> > > > > > > > >>>>>>>>>>>>> by supporting a capacity, and also > allowing > > >> > > inserting > > >> > > > > > > > >> at > > >> > > > > > > > >>>>> both > > >> > > > > > > > >>>>>>>> ends. > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> My only concern is that this design is > tied > > to > > >> > the > > >> > > > > > > > >>>>> coincidence > > >> > > > > > > > >>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>> we have two request priorities and there > are > > >> two > > >> > > ends > > >> > > > > > > > >>> to a > > >> > > > > > > > >>>>>>>> deque. > > >> > > > > > > > >>>>>>>>>>>>> Hence by using the proposed design, it > seems > > >> the > > >> > > > > > > > >> network > > >> > > > > > > > >>>>> layer > > >> > > > > > > > >>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>> more tightly coupled with upper layer > logic, > > >> e.g. > > >> > > if > > >> > > > > > > > >> we > > >> > > > > > > > >>>> were > > >> > > > > > > > >>>>>> to > > >> > > > > > > > >>>>>>>> add > > >> > > > > > > > >>>>>>>>>>>>> an extra priority level in the future for > > some > > >> > > > reason, > > >> > > > > > > > >>> we > > >> > > > > > > > >>>>>> would > > >> > > > > > > > >>>>>>>>>>> probably > > >> > > > > > > > >>>>>>>>>>>>> need to go back to the design of separate > > >> queues, > > >> > > one > > >> > > > > > > > >>> for > > >> > > > > > > > >>>>> each > > >> > > > > > > > >>>>>>>>>> priority > > >> > > > > > > > >>>>>>>>>>>>> level. > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> In summary, I'm ok with both designs and > > lean > > >> > > toward > > >> > > > > > > > >>> your > > >> > > > > > > > >>>>>>>> suggested > > >> > > > > > > > >>>>>>>>>>>>> approach. > > >> > > > > > > > >>>>>>>>>>>>> Let's hear what others think. 
> > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> @Becket, > > >> > > > > > > > >>>>>>>>>>>>> In light of Mayuresh's suggested new > design, > > >> I'm > > >> > > > > > > > >>> answering > > >> > > > > > > > >>>>>> your > > >> > > > > > > > >>>>>>>>>>> question > > >> > > > > > > > >>>>>>>>>>>>> only in the context > > >> > > > > > > > >>>>>>>>>>>>> of the current KIP design: I think your > > >> > suggestion > > >> > > > > > > > >> makes > > >> > > > > > > > >>>>>> sense, > > >> > > > > > > > >>>>>>>> and > > >> > > > > > > > >>>>>>>>>> I'm > > >> > > > > > > > >>>>>>>>>>>> ok > > >> > > > > > > > >>>>>>>>>>>>> with removing the capacity config and > > >> > > > > > > > >>>>>>>>>>>>> just relying on the default value of 20 > > being > > >> > > > > > > > >> sufficient > > >> > > > > > > > >>>>>> enough. > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>> Lucas > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh > > >> Gharat > > >> > < > > >> > > > > > > > >>>>>>>>>>>>> gharatmayuresh15@gmail.com > > >> > > > > > > > >>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> Hi Lucas, > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> Seems like the main intent here is to > > >> prioritize > > >> > > the > > >> > > > > > > > >>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>>>>> over any other requests. > > >> > > > > > > > >>>>>>>>>>>>>> In that case, we can change the request > > queue > > >> > to a > > >> > > > > > > > >>>>> dequeue, > > >> > > > > > > > >>>>>>>> where > > >> > > > > > > > >>>>>>>>>> you > > >> > > > > > > > >>>>>>>>>>>>>> always insert the normal requests > (produce, > > >> > > > > > > > >>>> consume,..etc) > > >> > > > > > > > >>>>>> to > > >> > > > > > > > >>>>>>>> the > > >> > > > > > > > >>>>>>>>>> end > > >> > > > > > > > >>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>> the dequeue, but if its a controller > > request, > > >> > you > > >> > > > > > > > >>> insert > > >> > > > > > > > >>>>> it > > >> > > > > > > > >>>>>> to > > >> > > > > > > > >>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>> head > > >> > > > > > > > >>>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>> the queue. This ensures that the > controller > > >> > > request > > >> > > > > > > > >>> will > > >> > > > > > > > >>>>> be > > >> > > > > > > > >>>>>>>> given > > >> > > > > > > > >>>>>>>>>>>> higher > > >> > > > > > > > >>>>>>>>>>>>>> priority over other requests. 
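To illustrate the deque idea just described, here is a small sketch (not
the actual Kafka RequestChannel; the names are invented). It uses
java.util.concurrent.LinkedBlockingDeque, which is the JDK class that
offers both a capacity bound and insertion at either end (the earlier
mention of LinkedBlockingQueue presumably meant this class): controller
requests go to the head, everything else to the tail.

    import java.util.concurrent.BlockingDeque;
    import java.util.concurrent.LinkedBlockingDeque;

    // Sketch of the single-deque prioritization idea; illustrative only.
    public class PrioritizingRequestChannel {

        private final BlockingDeque<Object> queue;

        public PrioritizingRequestChannel(int capacity) {
            this.queue = new LinkedBlockingDeque<>(capacity);
        }

        public void sendRequest(Object request, boolean fromController)
                throws InterruptedException {
            if (fromController) {
                queue.putFirst(request);  // controller request jumps to the head
            } else {
                queue.putLast(request);   // produce/fetch/etc. go to the tail
            }
        }

        public Object receiveRequest() throws InterruptedException {
            return queue.takeFirst();     // handler threads always take from the head
        }
    }

Note that, as Jun points out elsewhere in the thread, putFirst still blocks
while the deque is full, so even with this layout a controller request is
not guaranteed to be enqueued quickly under backpressure.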
> > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> Also since we only read one request from > > the > > >> > > socket > > >> > > > > > > > >>> and > > >> > > > > > > > >>>>> mute > > >> > > > > > > > >>>>>>>> it > > >> > > > > > > > >>>>>>>>> and > > >> > > > > > > > >>>>>>>>>>>> only > > >> > > > > > > > >>>>>>>>>>>>>> unmute it after handling the request, > this > > >> would > > >> > > > > > > > >>> ensure > > >> > > > > > > > >>>>> that > > >> > > > > > > > >>>>>>>> we > > >> > > > > > > > >>>>>>>>>> don't > > >> > > > > > > > >>>>>>>>>>>>>> handle controller requests out of order. > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> With this approach we can avoid the > second > > >> queue > > >> > > and > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>>>>> additional > > >> > > > > > > > >>>>>>>>>>>>> config > > >> > > > > > > > >>>>>>>>>>>>>> for the size of the queue. > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> What do you think ? > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> Mayuresh > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 3:05 AM Becket > Qin > > < > > >> > > > > > > > >>>>>>>> becket.qin@gmail.com > > >> > > > > > > > >>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> Hey Joel, > > >> > > > > > > > >>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> Thank for the detail explanation. I > agree > > >> the > > >> > > > > > > > >>> current > > >> > > > > > > > >>>>>> design > > >> > > > > > > > >>>>>>>>>> makes > > >> > > > > > > > >>>>>>>>>>>>> sense. > > >> > > > > > > > >>>>>>>>>>>>>>> My confusion is about whether the new > > config > > >> > for > > >> > > > > > > > >> the > > >> > > > > > > > >>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>> queue > > >> > > > > > > > >>>>>>>>>>>>>>> capacity is necessary. I cannot think > of a > > >> case > > >> > > in > > >> > > > > > > > >>>> which > > >> > > > > > > > >>>>>>>> users > > >> > > > > > > > >>>>>>>>>>> would > > >> > > > > > > > >>>>>>>>>>>>>> change > > >> > > > > > > > >>>>>>>>>>>>>>> it. > > >> > > > > > > > >>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin > > >> > > > > > > > >>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 6:00 PM, Becket > > Qin > > >> < > > >> > > > > > > > >>>>>>>>>> becket.qin@gmail.com> > > >> > > > > > > > >>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>> Hi Lucas, > > >> > > > > > > > >>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>> I guess my question can be rephrased to > > >> "do we > > >> > > > > > > > >>>> expect > > >> > > > > > > > >>>>>>>> user to > > >> > > > > > > > >>>>>>>>>>> ever > > >> > > > > > > > >>>>>>>>>>>>>> change > > >> > > > > > > > >>>>>>>>>>>>>>>> the controller request queue capacity"? 
> > If > > >> we > > >> > > > > > > > >>> agree > > >> > > > > > > > >>>>> that > > >> > > > > > > > >>>>>>>> 20 > > >> > > > > > > > >>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>> already > > >> > > > > > > > >>>>>>>>>>>>>> a > > >> > > > > > > > >>>>>>>>>>>>>>>> very generous default number and we do > > not > > >> > > > > > > > >> expect > > >> > > > > > > > >>>> user > > >> > > > > > > > >>>>>> to > > >> > > > > > > > >>>>>>>>>> change > > >> > > > > > > > >>>>>>>>>>>> it, > > >> > > > > > > > >>>>>>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>>>> it > > >> > > > > > > > >>>>>>>>>>>>>>>> still necessary to expose this as a > > config? > > >> > > > > > > > >>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin > > >> > > > > > > > >>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 2:29 AM, Lucas > > >> Wang < > > >> > > > > > > > >>>>>>>>>>> lucasatucla@gmail.com > > >> > > > > > > > >>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>> @Becket > > >> > > > > > > > >>>>>>>>>>>>>>>>> 1. Thanks for the comment. You are > right > > >> that > > >> > > > > > > > >>>>> normally > > >> > > > > > > > >>>>>>>> there > > >> > > > > > > > >>>>>>>>>>>> should > > >> > > > > > > > >>>>>>>>>>>>> be > > >> > > > > > > > >>>>>>>>>>>>>>>>> just > > >> > > > > > > > >>>>>>>>>>>>>>>>> one controller request because of > > muting, > > >> > > > > > > > >>>>>>>>>>>>>>>>> and I had NOT intended to say there > > would > > >> be > > >> > > > > > > > >> many > > >> > > > > > > > >>>>>>>> enqueued > > >> > > > > > > > >>>>>>>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>>>>>>>> requests. > > >> > > > > > > > >>>>>>>>>>>>>>>>> I went through the KIP again, and I'm > > not > > >> > sure > > >> > > > > > > > >>>> which > > >> > > > > > > > >>>>>> part > > >> > > > > > > > >>>>>>>>>>> conveys > > >> > > > > > > > >>>>>>>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>>>>>> info. > > >> > > > > > > > >>>>>>>>>>>>>>>>> I'd be happy to revise if you point it > > out > > >> > the > > >> > > > > > > > >>>>> section. > > >> > > > > > > > >>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>> 2. Though it should not happen in > normal > > >> > > > > > > > >>>> conditions, > > >> > > > > > > > >>>>>> the > > >> > > > > > > > >>>>>>>>>> current > > >> > > > > > > > >>>>>>>>>>>>>> design > > >> > > > > > > > >>>>>>>>>>>>>>>>> does not preclude multiple controllers > > >> > running > > >> > > > > > > > >>>>>>>>>>>>>>>>> at the same time, hence if we don't > have > > >> the > > >> > > > > > > > >>>>> controller > > >> > > > > > > > >>>>>>>>> queue > > >> > > > > > > > >>>>>>>>>>>>> capacity > > >> > > > > > > > >>>>>>>>>>>>>>>>> config and simply make its capacity to > > be > > >> 1, > > >> > > > > > > > >>>>>>>>>>>>>>>>> network threads handling requests from > > >> > > > > > > > >> different > > >> > > > > > > > >>>>>>>> controllers > > >> > > > > > > > >>>>>>>>>>> will > > >> > > > > > > > >>>>>>>>>>>> be > > >> > > > > > > > >>>>>>>>>>>>>>>>> blocked during those troublesome > times, > > >> > > > > > > > >>>>>>>>>>>>>>>>> which is probably not what we want. 
On > > the > > >> > > > > > > > >> other > > >> > > > > > > > >>>>> hand, > > >> > > > > > > > >>>>>>>>> adding > > >> > > > > > > > >>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>> extra > > >> > > > > > > > >>>>>>>>>>>>>>>>> config with a default value, say 20, > > >> guards > > >> > us > > >> > > > > > > > >>> from > > >> > > > > > > > >>>>>>>> issues > > >> > > > > > > > >>>>>>>>> in > > >> > > > > > > > >>>>>>>>>>>> those > > >> > > > > > > > >>>>>>>>>>>>>>>>> troublesome times, and IMO there isn't > > >> much > > >> > > > > > > > >>>> downside > > >> > > > > > > > >>>>> of > > >> > > > > > > > >>>>>>>>> adding > > >> > > > > > > > >>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>> extra > > >> > > > > > > > >>>>>>>>>>>>>>>>> config. > > >> > > > > > > > >>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>> @Mayuresh > > >> > > > > > > > >>>>>>>>>>>>>>>>> Good catch, this sentence is an > obsolete > > >> > > > > > > > >>> statement > > >> > > > > > > > >>>>>> based > > >> > > > > > > > >>>>>>>> on > > >> > > > > > > > >>>>>>>>> a > > >> > > > > > > > >>>>>>>>>>>>> previous > > >> > > > > > > > >>>>>>>>>>>>>>>>> design. I've revised the wording in > the > > >> KIP. > > >> > > > > > > > >>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>>>> Lucas > > >> > > > > > > > >>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 10:33 AM, > > Mayuresh > > >> > > > > > > > >>> Gharat < > > >> > > > > > > > >>>>>>>>>>>>>>>>> gharatmayuresh15@gmail.com> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> Hi Lucas, > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP. > > >> > > > > > > > >>>>>>>>>>>>>>>>>> I am trying to understand why you > think > > >> "The > > >> > > > > > > > >>>> memory > > >> > > > > > > > >>>>>>>>>>> consumption > > >> > > > > > > > >>>>>>>>>>>>> can > > >> > > > > > > > >>>>>>>>>>>>>>> rise > > >> > > > > > > > >>>>>>>>>>>>>>>>>> given the total number of queued > > requests > > >> > can > > >> > > > > > > > >>> go > > >> > > > > > > > >>>> up > > >> > > > > > > > >>>>>> to > > >> > > > > > > > >>>>>>>> 2x" > > >> > > > > > > > >>>>>>>>>> in > > >> > > > > > > > >>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>> impact > > >> > > > > > > > >>>>>>>>>>>>>>>>>> section. Normally the requests from > > >> > > > > > > > >> controller > > >> > > > > > > > >>>> to a > > >> > > > > > > > >>>>>>>> Broker > > >> > > > > > > > >>>>>>>>>> are > > >> > > > > > > > >>>>>>>>>>>> not > > >> > > > > > > > >>>>>>>>>>>>>>> high > > >> > > > > > > > >>>>>>>>>>>>>>>>>> volume, right ? > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> Mayuresh > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 5:06 AM > Becket > > >> Qin < > > >> > > > > > > > >>>>>>>>>>>> becket.qin@gmail.com> > > >> > > > > > > > >>>>>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks for the KIP, Lucas. 
> Separating > > >> the > > >> > > > > > > > >>>> control > > >> > > > > > > > >>>>>>>> plane > > >> > > > > > > > >>>>>>>>>> from > > >> > > > > > > > >>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>> data > > >> > > > > > > > >>>>>>>>>>>>>>>>>> plane > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> makes a lot of sense. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> In the KIP you mentioned that the > > >> > > > > > > > >> controller > > >> > > > > > > > >>>>>> request > > >> > > > > > > > >>>>>>>>> queue > > >> > > > > > > > >>>>>>>>>>> may > > >> > > > > > > > >>>>>>>>>>>>>> have > > >> > > > > > > > >>>>>>>>>>>>>>>>> many > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> requests in it. Will this be a > common > > >> case? > > >> > > > > > > > >>> The > > >> > > > > > > > >>>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>>>>> requests > > >> > > > > > > > >>>>>>>>>>>>>>>>> still > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> goes through the SocketServer. The > > >> > > > > > > > >>> SocketServer > > >> > > > > > > > >>>>>> will > > >> > > > > > > > >>>>>>>>> mute > > >> > > > > > > > >>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>> channel > > >> > > > > > > > >>>>>>>>>>>>>>>>>> once > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> a request is read and put into the > > >> request > > >> > > > > > > > >>>>> channel. > > >> > > > > > > > >>>>>>>> So > > >> > > > > > > > >>>>>>>>>>>> assuming > > >> > > > > > > > >>>>>>>>>>>>>>> there > > >> > > > > > > > >>>>>>>>>>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> only one connection between > controller > > >> and > > >> > > > > > > > >>> each > > >> > > > > > > > >>>>>>>> broker, > > >> > > > > > > > >>>>>>>>> on > > >> > > > > > > > >>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>> broker > > >> > > > > > > > >>>>>>>>>>>>>>>>>> side, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> there should be only one controller > > >> request > > >> > > > > > > > >>> in > > >> > > > > > > > >>>>> the > > >> > > > > > > > >>>>>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>>>>>>>>> queue > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> at any given time. If that is the > > case, > > >> do > > >> > > > > > > > >> we > > >> > > > > > > > >>>>> need > > >> > > > > > > > >>>>>> a > > >> > > > > > > > >>>>>>>>>>> separate > > >> > > > > > > > >>>>>>>>>>>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> request queue capacity config? The > > >> default > > >> > > > > > > > >>>> value > > >> > > > > > > > >>>>> 20 > > >> > > > > > > > >>>>>>>>> means > > >> > > > > > > > >>>>>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>> we > > >> > > > > > > > >>>>>>>>>>>>>>>>> expect > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> there are 20 controller switches to > > >> happen > > >> > > > > > > > >>> in a > > >> > > > > > > > >>>>>> short > > >> > > > > > > > >>>>>>>>>> period > > >> > > > > > > > >>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>> time. 
> > >> > > > > > > > >>>>>>>>>>>>>>>>> I > > >> > > > > > > > >>>>>>>>>>>>>>>>>> am > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> not sure whether someone should > > increase > > >> > > > > > > > >> the > > >> > > > > > > > >>>>>>>> controller > > >> > > > > > > > >>>>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>>>>>> queue > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> capacity to handle such case, as it > > >> seems > > >> > > > > > > > >>>>>> indicating > > >> > > > > > > > >>>>>>>>>>> something > > >> > > > > > > > >>>>>>>>>>>>>> very > > >> > > > > > > > >>>>>>>>>>>>>>>>> wrong > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> has happened. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> On Fri, Jul 13, 2018 at 1:10 PM, > Dong > > >> Lin < > > >> > > > > > > > >>>>>>>>>>>> lindong28@gmail.com> > > >> > > > > > > > >>>>>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> Thanks for the update Lucas. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> I think the motivation section is > > >> > > > > > > > >>> intuitive. > > >> > > > > > > > >>>> It > > >> > > > > > > > >>>>>>>> will > > >> > > > > > > > >>>>>>>>> be > > >> > > > > > > > >>>>>>>>>>> good > > >> > > > > > > > >>>>>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>>>>>> learn > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> more > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> about the comments from other > > >> reviewers. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> On Thu, Jul 12, 2018 at 9:48 PM, > > Lucas > > >> > > > > > > > >>> Wang < > > >> > > > > > > > >>>>>>>>>>>>>>> lucasatucla@gmail.com> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Hi Dong, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> I've updated the motivation > section > > of > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>> KIP > > >> > > > > > > > >>>>>> by > > >> > > > > > > > >>>>>>>>>>>> explaining > > >> > > > > > > > >>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>> cases > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> would have user impacts. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Please take a look at let me know > > your > > >> > > > > > > > >>>>>> comments. 
> > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Thanks, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Lucas > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 9, 2018 at 5:53 PM, > > Lucas > > >> > > > > > > > >>> Wang > > >> > > > > > > > >>>> < > > >> > > > > > > > >>>>>>>>>>>>>>> lucasatucla@gmail.com > > >> > > > > > > > >>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> wrote: > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Dong, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> The simulation of disk being slow > > is > > >> > > > > > > > >>>> merely > > >> > > > > > > > >>>>>>>> for me > > >> > > > > > > > >>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>> easily > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> construct > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> a > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> testing scenario > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> with a backlog of produce > requests. > > >> > > > > > > > >> In > > >> > > > > > > > >>>>>>>> production, > > >> > > > > > > > >>>>>>>>>>> other > > >> > > > > > > > >>>>>>>>>>>>>> than > > >> > > > > > > > >>>>>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> disk > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> being slow, a backlog of > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests may also be > caused > > >> > > > > > > > >> by > > >> > > > > > > > >>>> high > > >> > > > > > > > >>>>>>>>> produce > > >> > > > > > > > >>>>>>>>>>> QPS. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> In that case, we may not want to > > kill > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>>>> broker > > >> > > > > > > > >>>>>>>>> and > > >> > > > > > > > >>>>>>>>>>>>> that's > > >> > > > > > > > >>>>>>>>>>>>>>> when > > >> > > > > > > > >>>>>>>>>>>>>>>>>> this > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> KIP > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> can be useful, both for JBOD > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and non-JBOD setup. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Going back to your previous > > question > > >> > > > > > > > >>>> about > > >> > > > > > > > >>>>>> each > > >> > > > > > > > >>>>>>>>>>>>>> ProduceRequest > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> covering > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> 20 > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions that are randomly > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> distributed, let's say a > > LeaderAndIsr > > >> > > > > > > > >>>>> request > > >> > > > > > > > >>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>> enqueued > > >> > > > > > > > >>>>>>>>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>>>>>>> tries > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> switch the current broker, say > > >> > > > > > > > >> broker0, > > >> > > > > > > > >>>>> from > > >> > > > > > > > >>>>>>>>> leader > > >> > > > > > > > >>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>>>> follower > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> *for one of the partitions*, say > > >> > > > > > > > >>>> *test-0*. 
> > >> > > > > > > > >>>>>> For > > >> > > > > > > > >>>>>>>> the > > >> > > > > > > > >>>>>>>>>>> sake > > >> > > > > > > > >>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>>>>> argument, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's also assume the other > > brokers, > > >> > > > > > > > >>> say > > >> > > > > > > > >>>>>>>> broker1, > > >> > > > > > > > >>>>>>>>>> have > > >> > > > > > > > >>>>>>>>>>>>>>> *stopped* > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> fetching > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> from > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the current broker, i.e. broker0. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. If the enqueued produce > requests > > >> > > > > > > > >>> have > > >> > > > > > > > >>>>>> acks = > > >> > > > > > > > >>>>>>>>> -1 > > >> > > > > > > > >>>>>>>>>>>> (ALL) > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1.1 without this KIP, the > > >> > > > > > > > >>>> ProduceRequests > > >> > > > > > > > >>>>>>>> ahead > > >> > > > > > > > >>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>>>> LeaderAndISR > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> will > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> be > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> put into the purgatory, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and since they'll never be > > >> > > > > > > > >>>>> replicated > > >> > > > > > > > >>>>>>>> to > > >> > > > > > > > >>>>>>>>>> other > > >> > > > > > > > >>>>>>>>>>>>>> brokers > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> (because > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the assumption made above), they > > will > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> be completed either when > the > > >> > > > > > > > >>>>>>>> LeaderAndISR > > >> > > > > > > > >>>>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> processed > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> or > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> when the timeout happens. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1.2 With this KIP, broker0 will > > >> > > > > > > > >>>>> immediately > > >> > > > > > > > >>>>>>>>>>> transition > > >> > > > > > > > >>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> partition > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> test-0 to become a follower, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> after the current broker > > sees > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>>>>>> replication > > >> > > > > > > > >>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> remaining > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> 19 > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> partitions, it can send a > response > > >> > > > > > > > >>>>> indicating > > >> > > > > > > > >>>>>>>> that > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> it's no longer the leader > > for > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>>>>> "test-0". 
> > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> To see the latency difference > > >> > > > > > > > >> between > > >> > > > > > > > >>>> 1.1 > > >> > > > > > > > >>>>>> and > > >> > > > > > > > >>>>>>>>> 1.2, > > >> > > > > > > > >>>>>>>>>>>> let's > > >> > > > > > > > >>>>>>>>>>>>>> say > > >> > > > > > > > >>>>>>>>>>>>>>>>>> there > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> are > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 24K produce requests ahead of the > > >> > > > > > > > >>>>>> LeaderAndISR, > > >> > > > > > > > >>>>>>>>> and > > >> > > > > > > > >>>>>>>>>>>> there > > >> > > > > > > > >>>>>>>>>>>>>> are > > >> > > > > > > > >>>>>>>>>>>>>>> 8 > > >> > > > > > > > >>>>>>>>>>>>>>>>> io > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> threads, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> so each io thread will process > > >> > > > > > > > >>>>>> approximately > > >> > > > > > > > >>>>>>>>> 3000 > > >> > > > > > > > >>>>>>>>>>>>> produce > > >> > > > > > > > >>>>>>>>>>>>>>>>>> requests. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> Now > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> let's investigate the io thread > > that > > >> > > > > > > > >>>>> finally > > >> > > > > > > > >>>>>>>>>> processed > > >> > > > > > > > >>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> For the 3000 produce requests, > if > > >> > > > > > > > >> we > > >> > > > > > > > >>>>> model > > >> > > > > > > > >>>>>>>> the > > >> > > > > > > > >>>>>>>>>> time > > >> > > > > > > > >>>>>>>>>>>> when > > >> > > > > > > > >>>>>>>>>>>>>>> their > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> remaining > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 19 partitions catch up as t0, t1, > > >> > > > > > > > >>>> ...t2999, > > >> > > > > > > > >>>>>> and > > >> > > > > > > > >>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>> LeaderAndISR > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> request > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> is > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> processed at time t3000. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Without this KIP, the 1st > produce > > >> > > > > > > > >>>> request > > >> > > > > > > > >>>>>>>> would > > >> > > > > > > > >>>>>>>>>> have > > >> > > > > > > > >>>>>>>>>>>>>> waited > > >> > > > > > > > >>>>>>>>>>>>>>> an > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> extra > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> t3000 - t0 time in the purgatory, > > the > > >> > > > > > > > >>> 2nd > > >> > > > > > > > >>>>> an > > >> > > > > > > > >>>>>>>> extra > > >> > > > > > > > >>>>>>>>>>> time > > >> > > > > > > > >>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>>>> t3000 - > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> t1, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> etc. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Roughly speaking, the latency > > >> > > > > > > > >>>> difference > > >> > > > > > > > >>>>> is > > >> > > > > > > > >>>>>>>>> bigger > > >> > > > > > > > >>>>>>>>>>> for > > >> > > > > > > > >>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>> earlier > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests than for the > later > > >> > > > > > > > >>> ones. 
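A tiny back-of-the-envelope sketch of the arithmetic above, just to make
the shape of the benefit visible. The request count and io-thread count
come from Lucas's example; the per-request processing time is an invented
placeholder, and none of this is Kafka code.

    // Illustrative arithmetic only; the 5 ms per-request figure is an assumption.
    public class PurgatoryWaitSketch {
        public static void main(String[] args) {
            int queuedProduceRequests = 24_000;
            int ioThreads = 8;
            int perThread = queuedProduceRequests / ioThreads; // ~3000 per io thread
            double msPerRequest = 5.0;                          // assumed processing time
            double tLeaderAndIsr = perThread * msPerRequest;    // "t3000" in the example

            // Without the KIP, the i-th produce request waits an extra
            // (t3000 - t_i) in the purgatory; earlier requests pay the most.
            for (int i : new int[] {0, 1500, 2999}) {
                double extraWaitMs = tLeaderAndIsr - i * msPerRequest;
                System.out.printf("produce request %4d: extra wait ~%5.0f ms%n", i, extraWaitMs);
            }
        }
    }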
> > >> > > > > > > > >>>>> For > > >> > > > > > > > >>>>>>>> the > > >> > > > > > > > >>>>>>>>>> same > > >> > > > > > > > >>>>>>>>>>>>>> reason, > > >> > > > > > > > >>>>>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> more > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests queued > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> before the LeaderAndISR, the > > bigger > > >> > > > > > > > >>>>> benefit > > >> > > > > > > > >>>>>>>> we > > >> > > > > > > > >>>>>>>>> get > > >> > > > > > > > >>>>>>>>>>>>> (capped > > >> > > > > > > > >>>>>>>>>>>>>>> by > > >> > > > > > > > >>>>>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce timeout). > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. If the enqueued produce > requests > > >> > > > > > > > >>> have > > >> > > > > > > > >>>>>>>> acks=0 or > > >> > > > > > > > >>>>>>>>>>>> acks=1 > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> There will be no latency > > >> > > > > > > > >> differences > > >> > > > > > > > >>> in > > >> > > > > > > > >>>>>> this > > >> > > > > > > > >>>>>>>>> case, > > >> > > > > > > > >>>>>>>>>>> but > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2.1 without this KIP, the > records > > >> > > > > > > > >> of > > >> > > > > > > > >>>>>>>> partition > > >> > > > > > > > >>>>>>>>>>> test-0 > > >> > > > > > > > >>>>>>>>>>>> in > > >> > > > > > > > >>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> ProduceRequests ahead of the > > >> > > > > > > > >>> LeaderAndISR > > >> > > > > > > > >>>>>> will > > >> > > > > > > > >>>>>>>> be > > >> > > > > > > > >>>>>>>>>>>> appended > > >> > > > > > > > >>>>>>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> local > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>> log, > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> and eventually be > truncated > > >> > > > > > > > >>> after > > >> > > > > > > > >>>>>>>>> processing > > >> > > > > > > > >>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>> LeaderAndISR. > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> This is what's referred to as > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> "some unofficial > definition > > >> > > > > > > > >> of > > >> > > > > > > > >>>> data > > >> > > > > > > > >>>>>>>> loss > > >> > > > > > > > >>>>>>>>> in > > >> > > > > > > > >>>>>>>>>>>> terms > > >> > > > > > > > >>>>>>>>>>>>> of > > >> > > > > > > > >>>>>>>>>>>>>>>>>> messages > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> beyond the high watermark". 
> > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2.2 with this KIP, we can > mitigate > > >> > > > > > > > >>> the > > >> > > > > > > > >>>>>> effect > > >> > > > > > > > >>>>>>>>>> since > > >> > > > > > > > >>>>>>>>>>> if > > >> > > > > > > > >>>>>>>>>>>>> the > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>> LeaderAndISR > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> is immediately processed, the > > >> > > > > > > > >> response > > >> > > > > > > > >>> to > > >> > > > > > > > >>>>>>>>> producers > > >> > > > > > > > >>>>>>>>>>> will > > >> > > > > > > > >>>>>>>>>>>>>> have > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> the NotLeaderForPartition > > >> > > > > > > > >>> error, > > >> > > > > > > > >>>>>>>> causing > > >> > > > > > > > >>>>>>>>>>>> producers > > >> > > > > > > > >>>>>>>>>>>>>> to > > >> > > > > > > > >>>>>>>>>>>>>>>>> retry > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> > > >> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> This explanation above is the > > benefit > > >> > > > > > > > >>> for > > >> > > > > > > > >>>>>>>> reducing > > >> > > > >> > > >> > > >> -- > > >> -Regards, > > >> Mayuresh R. Gharat > > >> (862) 250-7125 > > >> > > > > > > > > > --00000000000092654105723c9e19--