kafka-dev mailing list archives

From Lucas Wang <lucasatu...@gmail.com>
Subject Re: [DISCUSS] KIP-291: Have separate queues for control requests and data requests
Date Thu, 26 Jul 2018 05:23:24 GMT
Hi All,

I've updated the KIP by adding the dedicated endpoints for controller
connections,
and pinning threads for controller requests.
Also I've updated the title of this KIP. Please take a look and let me know
your feedback.

Thanks a lot for your time!
Lucas

On Tue, Jul 24, 2018 at 10:19 AM, Mayuresh Gharat <
gharatmayuresh15@gmail.com> wrote:

> Hi Lucas,
> I agree, if we want to go forward with a separate controller plane and data
> plane and completely isolate them, having a separate port for controller
> with a separate Acceptor and a Processor sounds ideal to me.
>
> Thanks,
>
> Mayuresh
>
>
> On Mon, Jul 23, 2018 at 11:04 PM Becket Qin <becket.qin@gmail.com> wrote:
>
> > Hi Lucas,
> >
> > Yes, I agree that a dedicated end to end control flow would be ideal.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Jul 24, 2018 at 1:05 PM, Lucas Wang <lucasatucla@gmail.com>
> wrote:
> >
> > > Thanks for the comment, Becket.
> > > So far, we've been trying to avoid making any request handler thread
> > > special.
> > > But if we were to follow that path in order to make the two planes more
> > > isolated,
> > > what do you think about also having a dedicated processor thread,
> > > and dedicated port for the controller?
> > >
> > > > Today one processor thread can handle multiple connections, let's say
> > > > 100 connections represented by connection0, ... connection99, among
> > > > which connection0-98 are from clients, while connection99 is from the
> > > > controller. Further let's say after one selector polling, there are
> > > > incoming requests on all connections.
> > > >
> > > > When the request queue is full (either the data request queue being
> > > > full in the two-queue design, or the one single queue being full in the
> > > > deque design), the processor thread will be blocked first when trying
> > > > to enqueue the data request from connection0, then possibly blocked for
> > > > the data request from connection1, etc., even though the controller
> > > > request is ready to be enqueued.
> > > >
> > > > To solve this problem, it seems we would need to have a separate port
> > > > dedicated to the controller, a dedicated processor thread, a dedicated
> > > > controller request queue, and pinning of one request handler thread for
> > > > controller requests.
> > >
> > > Thanks,
> > > Lucas
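
As a rough illustration of the isolation argued for above, here is a minimal sketch with a dedicated controller request queue next to a data request queue. It is not the KIP's implementation: the class and method names are hypothetical, the controller capacity of 20 only echoes the default mentioned elsewhere in this thread, and the data queue is made tiny so that it fills up quickly.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical model of a broker with isolated control and data planes.
final class TwoPlaneChannel {
    // Dedicated queue for controller requests (capacity is an assumption).
    final BlockingQueue<String> controllerQueue = new ArrayBlockingQueue<>(20);
    // Queue for client (data) requests; deliberately tiny to show isolation.
    final BlockingQueue<String> dataQueue = new ArrayBlockingQueue<>(2);

    // Non-blocking enqueue: returns false when the target queue is full.
    boolean offerController(String request) {
        return controllerQueue.offer(request);
    }

    boolean offerData(String request) {
        return dataQueue.offer(request);
    }

    // What the single pinned controller handler would do: drain in FIFO order.
    List<String> drainController() {
        List<String> processed = new ArrayList<>();
        controllerQueue.drainTo(processed);
        return processed;
    }

    public static void main(String[] args) {
        TwoPlaneChannel channel = new TwoPlaneChannel();
        channel.offerData("Produce-1");
        channel.offerData("Fetch-1");
        // The data queue is full now, so another data request is rejected...
        System.out.println(channel.offerData("Produce-2"));
        // ...while the controller request is still accepted on its own queue.
        System.out.println(channel.offerController("LeaderAndIsr-1"));
        System.out.println(channel.drainController());
    }
}
```

A full data queue never delays the controller request in this model, because the two kinds of request never share a queue.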
> > >
> > >
> > > On Mon, Jul 23, 2018 at 6:00 PM, Becket Qin <becket.qin@gmail.com>
> > wrote:
> > >
> > > > > Personally I am not fond of the deque approach simply because it is
> > > > > against the basic idea of isolating the controller plane and data
> > > > > plane. With a single deque, theoretically speaking the controller
> > > > > requests can starve the clients' requests. I would prefer the approach
> > > > > with a separate controller request queue and a dedicated controller
> > > > > request handler thread.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Jul 24, 2018 at 8:16 AM, Lucas Wang <lucasatucla@gmail.com>
> > > wrote:
> > > >
> > > > > > Sure, I can summarize the usage of correlation id. But before I do
> > > > > > that, it seems the same out-of-order processing can also happen to
> > > > > > Produce requests sent by producers, following the same example you
> > > > > > described earlier.
> > > > > > If that's the case, I think this probably deserves a separate doc and
> > > > > > design independent of this KIP.
> > > > >
> > > > > Lucas
> > > > >
> > > > >
> > > > >
> > > > > On Mon, Jul 23, 2018 at 12:39 PM, Dong Lin <lindong28@gmail.com>
> > > wrote:
> > > > >
> > > > > > Hey Lucas,
> > > > > >
> > > > > > > Could you update the KIP if you are confident with the approach which
> > > > > > > uses correlation id? The idea around correlation id is kind of
> > > > > > > scattered across multiple emails. It will be useful if other reviewers
> > > > > > > can read the KIP to understand the latest proposal.
> > > > > >
> > > > > > Thanks,
> > > > > > Dong
> > > > > >
> > > > > > On Mon, Jul 23, 2018 at 12:32 PM, Mayuresh Gharat <
> > > > > > gharatmayuresh15@gmail.com> wrote:
> > > > > >
> > > > > > > > I like the idea of the deque implementation by Lucas. This will
> > > > > > > > help us avoid an additional queue for the controller and additional
> > > > > > > > configs in Kafka.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Mayuresh
> > > > > > >
> > > > > > > On Sun, Jul 22, 2018 at 2:58 AM Becket Qin <
> becket.qin@gmail.com
> > >
> > > > > wrote:
> > > > > > >
> > > > > > > > Hi Jun,
> > > > > > > >
> > > > > > > > > The usage of correlation ID might still be useful to address the
> > > > > > > > > cases where the controller epoch and leader epoch checks are not
> > > > > > > > > sufficient to guarantee correct behavior. For example, if the
> > > > > > > > > controller sends a LeaderAndIsrRequest followed by a
> > > > > > > > > StopReplicaRequest, and the broker processes them in the reverse
> > > > > > > > > order, the replica may still be wrongly recreated, right?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > > On Jul 22, 2018, at 11:47 AM, Jun Rao <jun@confluent.io>
> > > wrote:
> > > > > > > > >
> > > > > > > > > > Hmm, since we already use controller epoch and leader epoch for
> > > > > > > > > > properly caching the latest partition state, do we really need
> > > > > > > > > > correlation id for ordering the controller requests?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Jul 20, 2018 at 2:18 PM, Becket Qin <
> > > > becket.qin@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > >> Lucas and Mayuresh,
> > > > > > > > >>
> > > > > > > > >> Good idea. The correlation id should work.
> > > > > > > > >>
> > > > > > > > > >> In the ControllerChannelManager, a request will be resent until a
> > > > > > > > > >> response is received. So if the controller to broker connection
> > > > > > > > > >> disconnects after the controller sends R1_a, but before the response
> > > > > > > > > >> of R1_a is received, the disconnection may cause the controller to
> > > > > > > > > >> resend it as R1_b, i.e. until R1 is acked, R2 won't be sent by the
> > > > > > > > > >> controller.
> > > > > > > > > >> This gives two guarantees:
> > > > > > > > > >> 1. Correlation id wise: R1_a < R1_b < R2.
> > > > > > > > > >> 2. On the broker side, when R2 is seen, R1 must have been processed
> > > > > > > > > >> at least once.
> > > > > > > > > >>
> > > > > > > > > >> So on the broker side, with a single thread controller request
> > > > > > > > > >> handler, the logic should be:
> > > > > > > > > >> 1. Process whatever request is seen in the controller request queue.
> > > > > > > > > >> 2. For the given epoch, drop a request if its correlation id is
> > > > > > > > > >> smaller than that of the last processed request.
> > > > > > > > >>
> > > > > > > > >> Thanks,
> > > > > > > > >>
> > > > > > > > >> Jiangjie (Becket) Qin
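
The broker-side rule described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name, method name, and per-epoch map are hypothetical and not taken from the Kafka code base, and the correlation ids in the demo (5, 6, 7 for R1_a, R1_b, R2) are made up.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical broker-side filter for controller requests.
final class ControllerRequestFilter {
    // Correlation id of the last processed request, per controller epoch.
    private final Map<Integer, Integer> lastProcessedByEpoch = new HashMap<>();

    // Returns true if the request should be processed, false if it is obsolete.
    boolean shouldProcess(int controllerEpoch, int correlationId) {
        Integer last = lastProcessedByEpoch.get(controllerEpoch);
        if (last != null && correlationId < last) {
            return false; // smaller than the last processed id: drop it
        }
        lastProcessedByEpoch.put(controllerEpoch, correlationId);
        return true;
    }

    public static void main(String[] args) {
        ControllerRequestFilter filter = new ControllerRequestFilter();
        System.out.println(filter.shouldProcess(1, 6)); // R1_b
        System.out.println(filter.shouldProcess(1, 7)); // R2
        System.out.println(filter.shouldProcess(1, 5)); // late R1_a is dropped
    }
}
```

The sketch relies on the handler being single threaded, as stated above; with several handler threads the map update would need synchronization.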
> > > > > > > > >>
> > > > > > > > >> On Fri, Jul 20, 2018 at 8:07 AM, Jun Rao <
> jun@confluent.io>
> > > > > wrote:
> > > > > > > > >>
> > > > > > > > > >>> I agree that there is no strong ordering when there is more than
> > > > > > > > > >>> one socket connection. Currently, we rely on controllerEpoch and
> > > > > > > > > >>> leaderEpoch to ensure that the receiving broker picks up the latest
> > > > > > > > > >>> state for each partition.
> > > > > > > > > >>>
> > > > > > > > > >>> One potential issue with the deque approach is that if the queue is
> > > > > > > > > >>> full, there is no guarantee that the controller requests will be
> > > > > > > > > >>> enqueued quickly.
> > > > > > > > >>>
> > > > > > > > >>> Thanks,
> > > > > > > > >>>
> > > > > > > > >>> Jun
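
The full-queue concern can be reproduced with a bounded `java.util.concurrent.LinkedBlockingDeque`. This is a minimal sketch, assuming a capacity of 3 and made-up request names; the helper class is hypothetical.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical illustration; the capacity and request names are made up.
final class FullDequeDemo {
    // Fills a bounded deque with data requests, then tries a controller request.
    static boolean controllerAcceptedWhenFull(int capacity) {
        LinkedBlockingDeque<String> requestQueue = new LinkedBlockingDeque<>(capacity);
        for (int i = 0; i < capacity; i++) {
            requestQueue.offerLast("Produce-" + i); // data requests go to the tail
        }
        // offerFirst does not block: it returns false when no space is left.
        return requestQueue.offerFirst("LeaderAndIsr-1");
    }

    public static void main(String[] args) {
        System.out.println(controllerAcceptedWhenFull(3)); // prints "false"
    }
}
```

Inserting at the head gives priority only once the request is in the deque; when the deque is at capacity, the controller request waits like any other.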
> > > > > > > > >>>
> > > > > > > > >>> On Fri, Jul 20, 2018 at 5:25 AM, Mayuresh Gharat <
> > > > > > > > >>> gharatmayuresh15@gmail.com
> > > > > > > > >>>> wrote:
> > > > > > > > >>>
> > > > > > > > > >>>> Yea, the correlationId is only set to 0 in the NetworkClient
> > > > > > > > > >>>> constructor. Since we reuse the same NetworkClient between the
> > > > > > > > > >>>> controller and the broker, a disconnection should not cause it to
> > > > > > > > > >>>> reset to 0, in which case it can be used to reject obsolete
> > > > > > > > > >>>> requests.
> > > > > > > > >>>>
> > > > > > > > >>>> Thanks,
> > > > > > > > >>>>
> > > > > > > > >>>> Mayuresh
> > > > > > > > >>>>
> > > > > > > > >>>> On Thu, Jul 19, 2018 at 1:52 PM Lucas Wang <
> > > > > lucasatucla@gmail.com
> > > > > > >
> > > > > > > > >>> wrote:
> > > > > > > > >>>>
> > > > > > > > >>>>> @Dong,
> > > > > > > > >>>>> Great example and explanation, thanks!
> > > > > > > > >>>>>
> > > > > > > > >>>>> @All
> > > > > > > > > >>>>> Regarding the example given by Dong, it seems even if we use a queue
> > > > > > > > > >>>>> and a dedicated controller request handling thread, the same result
> > > > > > > > > >>>>> can still happen because R1_a will be sent on one connection, and
> > > > > > > > > >>>>> R1_b & R2 will be sent on a different connection, and there is no
> > > > > > > > > >>>>> ordering between different connections on the broker side.
> > > > > > > > > >>>>> I was discussing with Mayuresh offline, and it seems the correlation
> > > > > > > > > >>>>> id within the same NetworkClient object is monotonically increasing
> > > > > > > > > >>>>> and never reset, hence a broker can leverage that to properly reject
> > > > > > > > > >>>>> obsolete requests.
> > > > > > > > > >>>>> Thoughts?
> > > > > > > > >>>>>
> > > > > > > > >>>>> Thanks,
> > > > > > > > >>>>> Lucas
> > > > > > > > >>>>>
> > > > > > > > >>>>> On Thu, Jul 19, 2018 at 12:11 PM, Mayuresh Gharat <
> > > > > > > > >>>>> gharatmayuresh15@gmail.com> wrote:
> > > > > > > > >>>>>
> > > > > > > > > >>>>>> Actually nvm, correlationId is reset in case of connection loss, I
> > > > > > > > > >>>>>> think.
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Thanks,
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> Mayuresh
> > > > > > > > >>>>>>
> > > > > > > > >>>>>> On Thu, Jul 19, 2018 at 11:11 AM Mayuresh Gharat <
> > > > > > > > >>>>>> gharatmayuresh15@gmail.com>
> > > > > > > > >>>>>> wrote:
> > > > > > > > >>>>>>
> > > > > > > > > >>>>>>> I agree with Dong that out-of-order processing can happen with 2
> > > > > > > > > >>>>>>> separate queues as well, and it can even happen today.
> > > > > > > > > >>>>>>> Can we use the correlationId in the request from the controller to
> > > > > > > > > >>>>>>> the broker to handle ordering?
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Thanks,
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> Mayuresh
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>>
> > > > > > > > >>>>>>> On Thu, Jul 19, 2018 at 6:41 AM Becket Qin <
> > > > > > becket.qin@gmail.com
> > > > > > > > >>>
> > > > > > > > >>>>> wrote:
> > > > > > > > >>>>>>>
> > > > > > > > > >>>>>>>> Good point, Joel. I agree that a dedicated controller request
> > > > > > > > > >>>>>>>> handling thread would be a better isolation. It also solves the
> > > > > > > > > >>>>>>>> reordering issue.
> > > > > > > > >>>>>>>>
> > > > > > > > >>>>>>>> On Thu, Jul 19, 2018 at 2:23 PM, Joel Koshy <
> > > > > > > > >> jjkoshy.w@gmail.com>
> > > > > > > > >>>>>> wrote:
> > > > > > > > >>>>>>>>
> > > > > > > > > >>>>>>>>> Good example. I think this scenario can occur in the current code as
> > > > > > > > > >>>>>>>>> well but with even lower probability given that there are other
> > > > > > > > > >>>>>>>>> non-controller requests interleaved. It is still sketchy though and
> > > > > > > > > >>>>>>>>> I think a safer approach would be separate queues and pinning
> > > > > > > > > >>>>>>>>> controller request handling to one handler thread.
> > > > > > > > >>>>>>>>>
> > > > > > > > >>>>>>>>> On Wed, Jul 18, 2018 at 11:12 PM, Dong Lin <
> > > > > > > > >> lindong28@gmail.com
> > > > > > > > >>>>
> > > > > > > > >>>>>> wrote:
> > > > > > > > >>>>>>>>>
> > > > > > > > >>>>>>>>>> Hey Becket,
> > > > > > > > >>>>>>>>>>
> > > > > > > > > >>>>>>>>>> I think you are right that there may be out-of-order processing.
> > > > > > > > > >>>>>>>>>> However, it seems that out-of-order processing may also happen even
> > > > > > > > > >>>>>>>>>> if we use a separate queue.
> > > > > > > > > >>>>>>>>>>
> > > > > > > > > >>>>>>>>>> Here is the example:
> > > > > > > > > >>>>>>>>>>
> > > > > > > > > >>>>>>>>>> - Controller sends R1 and gets disconnected before receiving the
> > > > > > > > > >>>>>>>>>> response. Then it reconnects and sends R2. Both requests now stay in
> > > > > > > > > >>>>>>>>>> the controller request queue in the order they are sent.
> > > > > > > > > >>>>>>>>>> - thread1 takes R1_a from the request queue and then thread2 takes
> > > > > > > > > >>>>>>>>>> R2 from the request queue almost at the same time.
> > > > > > > > > >>>>>>>>>> - So R1_a and R2 are processed in parallel. There is a chance that
> > > > > > > > > >>>>>>>>>> R2's processing is completed before R1.
> > > > > > > > > >>>>>>>>>>
> > > > > > > > > >>>>>>>>>> If out-of-order processing can happen for both approaches with very
> > > > > > > > > >>>>>>>>>> low probability, it may not be worthwhile to add the extra queue.
> > > > > > > > > >>>>>>>>>> What do you think?
> > > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>> Dong
> > > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>> On Wed, Jul 18, 2018 at 6:17 PM, Becket Qin <
> > > > > > > > >>>> becket.qin@gmail.com
> > > > > > > > >>>>>>
> > > > > > > > >>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>>> Hi Mayuresh/Joel,
> > > > > > > > >>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>> Using the request channel as a deque was brought up some time ago
> > > > > > > > > >>>>>>>>>>> when we were initially thinking of prioritizing the request. The
> > > > > > > > > >>>>>>>>>>> concern was that the controller requests are supposed to be
> > > > > > > > > >>>>>>>>>>> processed in order. If we can ensure that there is one controller
> > > > > > > > > >>>>>>>>>>> request in the request channel, the order is not a concern. But in
> > > > > > > > > >>>>>>>>>>> cases where more than one controller request is inserted into the
> > > > > > > > > >>>>>>>>>>> queue, the controller request order may change and cause problems.
> > > > > > > > > >>>>>>>>>>> For example, think about the following sequence:
> > > > > > > > > >>>>>>>>>>> 1. Controller successfully sends a request R1 to the broker.
> > > > > > > > > >>>>>>>>>>> 2. Broker receives R1 and puts the request at the head of the
> > > > > > > > > >>>>>>>>>>> request queue.
> > > > > > > > > >>>>>>>>>>> 3. Controller to broker connection fails and the controller
> > > > > > > > > >>>>>>>>>>> reconnects to the broker.
> > > > > > > > > >>>>>>>>>>> 4. Controller sends a request R2 to the broker.
> > > > > > > > > >>>>>>>>>>> 5. Broker receives R2 and adds it to the head of the request queue.
> > > > > > > > > >>>>>>>>>>> Now on the broker side, R2 will be processed before R1 is
> > > > > > > > > >>>>>>>>>>> processed, which may cause problems.
> > > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>> Jiangjie (Becket) Qin
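
The five-step sequence above can be replayed on a `java.util.concurrent.LinkedBlockingDeque`. This is a minimal sketch: the helper class and the pending `Produce-1` request are hypothetical, added only to show where controller requests land relative to data requests.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical replay of the sequence: controller requests go to the head.
final class HeadInsertReordering {
    // Enqueues the controller requests at the head in arrival order,
    // then returns the order in which a handler would take them.
    static List<String> processingOrder(String... controllerRequests) {
        LinkedBlockingDeque<String> requestQueue = new LinkedBlockingDeque<>();
        requestQueue.addLast("Produce-1"); // a data request already waiting
        for (String request : controllerRequests) {
            requestQueue.addFirst(request);
        }
        List<String> order = new ArrayList<>();
        String next;
        while ((next = requestQueue.pollFirst()) != null) {
            order.add(next);
        }
        return order;
    }

    public static void main(String[] args) {
        // R1 arrives first, R2 second, yet R2 comes out first.
        System.out.println(processingOrder("R1", "R2")); // [R2, R1, Produce-1]
    }
}
```

With a single controller request in flight the head insert is harmless; the reversal only appears once two controller requests sit in the deque at the same time.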
> > > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>
> > > > > > > > >>>>>>>>>>> On Thu, Jul 19, 2018 at 3:23 AM, Joel Koshy <
> > > > > > > > >>>>> jjkoshy.w@gmail.com>
> > > > > > > > >>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>> @Mayuresh - I like your idea. It appears to be a simpler, less
> > > > > > > > > >>>>>>>>>>>> invasive alternative and it should work. Jun/Becket/others, do you
> > > > > > > > > >>>>>>>>>>>> see any pitfalls with this approach?
> > > > > > > > >>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>> On Wed, Jul 18, 2018 at 12:03 PM, Lucas Wang <
> > > > > > > > >>>>>>>> lucasatucla@gmail.com>
> > > > > > > > >>>>>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>> @Mayuresh,
> > > > > > > > > >>>>>>>>>>>>> That's a very interesting idea that I hadn't thought of before.
> > > > > > > > > >>>>>>>>>>>>> It seems to solve our problem at hand pretty well, and also
> > > > > > > > > >>>>>>>>>>>>> avoids the need to have a new size metric and capacity config
> > > > > > > > > >>>>>>>>>>>>> for the controller request queue. In fact, if we were to adopt
> > > > > > > > > >>>>>>>>>>>>> this design, there is no public interface change, and we
> > > > > > > > > >>>>>>>>>>>>> probably don't need a KIP.
> > > > > > > > > >>>>>>>>>>>>> Also implementation wise, it seems the Java class
> > > > > > > > > >>>>>>>>>>>>> LinkedBlockingDeque can readily satisfy the requirement by
> > > > > > > > > >>>>>>>>>>>>> supporting a capacity, and also allowing inserting at both ends.
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>> My only concern is that this design is tied to the coincidence that
> > > > > > > > > >>>>>>>>>>>>> we have two request priorities and there are two ends to a deque.
> > > > > > > > > >>>>>>>>>>>>> Hence by using the proposed design, it seems the network layer is
> > > > > > > > > >>>>>>>>>>>>> more tightly coupled with upper layer logic, e.g. if we were to add
> > > > > > > > > >>>>>>>>>>>>> an extra priority level in the future for some reason, we would
> > > > > > > > > >>>>>>>>>>>>> probably need to go back to the design of separate queues, one for
> > > > > > > > > >>>>>>>>>>>>> each priority level.
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>> In summary, I'm ok with both designs and lean toward your suggested
> > > > > > > > > >>>>>>>>>>>>> approach.
> > > > > > > > > >>>>>>>>>>>>> Let's hear what others think.
> > > > > > > > > >>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>> @Becket,
> > > > > > > > > >>>>>>>>>>>>> In light of Mayuresh's suggested new design, I'm answering your
> > > > > > > > > >>>>>>>>>>>>> question only in the context of the current KIP design: I think your
> > > > > > > > > >>>>>>>>>>>>> suggestion makes sense, and I'm ok with removing the capacity config
> > > > > > > > > >>>>>>>>>>>>> and just relying on the default value of 20 being sufficient.
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>> Lucas
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>> On Wed, Jul 18, 2018 at 9:57 AM, Mayuresh
> Gharat
> > <
> > > > > > > > >>>>>>>>>>>>> gharatmayuresh15@gmail.com
> > > > > > > > >>>>>>>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> Hi Lucas,
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>> Seems like the main intent here is to prioritize the controller
> > > > > > > > > >>>>>>>>>>>>>> request over any other requests.
> > > > > > > > > >>>>>>>>>>>>>> In that case, we can change the request queue to a deque, where you
> > > > > > > > > >>>>>>>>>>>>>> always insert the normal requests (produce, consume, etc.) at the
> > > > > > > > > >>>>>>>>>>>>>> end of the deque, but if it's a controller request, you insert it at
> > > > > > > > > >>>>>>>>>>>>>> the head of the queue. This ensures that the controller request will
> > > > > > > > > >>>>>>>>>>>>>> be given higher priority over other requests.
> > > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>> Also since we only read one request from the socket and mute it and
> > > > > > > > > >>>>>>>>>>>>>> only unmute it after handling the request, this would ensure that we
> > > > > > > > > >>>>>>>>>>>>>> don't handle controller requests out of order.
> > > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>> With this approach we can avoid the second queue and the additional
> > > > > > > > > >>>>>>>>>>>>>> config for the size of the queue.
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> What do you think ?
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> Mayuresh
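
The single-deque proposal above might look roughly like this. It is a sketch, not Kafka's RequestChannel: the class, the `send`/`receive` names, and the `fromController` flag are all hypothetical, and capacity handling and socket muting are left out.

```java
import java.util.concurrent.LinkedBlockingDeque;

// Hypothetical request channel backed by a single deque.
final class DequeRequestChannel {
    private final LinkedBlockingDeque<String> requests = new LinkedBlockingDeque<>();

    // Controller requests jump to the head; all other requests queue at the tail.
    void send(String request, boolean fromController) {
        if (fromController) {
            requests.addFirst(request);
        } else {
            requests.addLast(request);
        }
    }

    // Request handler threads always take from the head; null when empty.
    String receive() {
        return requests.pollFirst();
    }

    public static void main(String[] args) {
        DequeRequestChannel channel = new DequeRequestChannel();
        channel.send("Produce-1", false);
        channel.send("Fetch-1", false);
        channel.send("LeaderAndIsr-1", true);
        System.out.println(channel.receive()); // LeaderAndIsr-1
        System.out.println(channel.receive()); // Produce-1
        System.out.println(channel.receive()); // Fetch-1
    }
}
```

No second queue or extra capacity config is needed, which is the appeal of the design; the price is that priority is encoded in which end of the deque a request enters.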
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 3:05 AM Becket Qin <
> > > > > > > > >>>>>>>> becket.qin@gmail.com
> > > > > > > > >>>>>>>>>>
> > > > > > > > >>>>>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> Hey Joel,
> > > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>> Thanks for the detailed explanation. I agree the current design
> > > > > > > > > >>>>>>>>>>>>>>> makes sense. My confusion is about whether the new config for the
> > > > > > > > > >>>>>>>>>>>>>>> controller queue capacity is necessary. I cannot think of a case in
> > > > > > > > > >>>>>>>>>>>>>>> which users would change it.
> > > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 6:00 PM, Becket Qin <
> > > > > > > > >>>>>>>>>> becket.qin@gmail.com>
> > > > > > > > >>>>>>>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>> Hi Lucas,
> > > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>> I guess my question can be rephrased to "do we expect users to ever
> > > > > > > > > >>>>>>>>>>>>>>>> change the controller request queue capacity"? If we agree that 20
> > > > > > > > > >>>>>>>>>>>>>>>> is already a very generous default number and we do not expect users
> > > > > > > > > >>>>>>>>>>>>>>>> to change it, is it still necessary to expose this as a config?
> > > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>> On Wed, Jul 18, 2018 at 2:29 AM, Lucas Wang
> <
> > > > > > > > >>>>>>>>>>> lucasatucla@gmail.com
> > > > > > > > >>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>> @Becket
> > > > > > > > > >>>>>>>>>>>>>>>>> 1. Thanks for the comment. You are right that normally there should
> > > > > > > > > >>>>>>>>>>>>>>>>> be just one controller request because of muting, and I had NOT
> > > > > > > > > >>>>>>>>>>>>>>>>> intended to say there would be many enqueued controller requests.
> > > > > > > > > >>>>>>>>>>>>>>>>> I went through the KIP again, and I'm not sure which part conveys
> > > > > > > > > >>>>>>>>>>>>>>>>> that info.
> > > > > > > > > >>>>>>>>>>>>>>>>> I'd be happy to revise if you point out the section.
> > > > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>> 2. Though it should not happen in normal conditions, the current
> > > > > > > > > >>>>>>>>>>>>>>>>> design does not preclude multiple controllers running at the same
> > > > > > > > > >>>>>>>>>>>>>>>>> time, hence if we don't have the controller queue capacity config
> > > > > > > > > >>>>>>>>>>>>>>>>> and simply make its capacity 1, network threads handling requests
> > > > > > > > > >>>>>>>>>>>>>>>>> from different controllers will be blocked during those troublesome
> > > > > > > > > >>>>>>>>>>>>>>>>> times, which is probably not what we want. On the other hand, adding
> > > > > > > > > >>>>>>>>>>>>>>>>> the extra config with a default value, say 20, guards us from issues
> > > > > > > > > >>>>>>>>>>>>>>>>> in those troublesome times, and IMO there isn't much downside to
> > > > > > > > > >>>>>>>>>>>>>>>>> adding the extra config.
> > > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>> @Mayuresh
> > > > > > > > > >>>>>>>>>>>>>>>>> Good catch, this sentence is an obsolete statement based on a
> > > > > > > > > >>>>>>>>>>>>>>>>> previous design. I've revised the wording in the KIP.
> > > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>>> Lucas
> > > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 10:33 AM, Mayuresh
> > > > > > > > >>> Gharat <
> > > > > > > > >>>>>>>>>>>>>>>>> gharatmayuresh15@gmail.com> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>> Hi Lucas,
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks for the KIP.
> > > > > > > > > >>>>>>>>>>>>>>>>>> I am trying to understand why you think "The memory consumption can
> > > > > > > > > >>>>>>>>>>>>>>>>>> rise given the total number of queued requests can go up to 2x" in
> > > > > > > > > >>>>>>>>>>>>>>>>>> the impact section. Normally the requests from controller to a
> > > > > > > > > >>>>>>>>>>>>>>>>>> Broker are not high volume, right?
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>> Mayuresh
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>> On Tue, Jul 17, 2018 at 5:06 AM Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks for the KIP, Lucas. Separating the control plane from the
> > > > > > > > > >>>>>>>>>>>>>>>>>>> data plane makes a lot of sense.
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>> In the KIP you mentioned that the controller request queue may
> > > > > > > > > >>>>>>>>>>>>>>>>>>> have many requests in it. Will this be a common case? The
> > > > > > > > > >>>>>>>>>>>>>>>>>>> controller requests still go through the SocketServer. The
> > > > > > > > > >>>>>>>>>>>>>>>>>>> SocketServer will mute the channel once a request is read and
> > > > > > > > > >>>>>>>>>>>>>>>>>>> put into the request channel. So assuming there is only one
> > > > > > > > > >>>>>>>>>>>>>>>>>>> connection between controller and each broker, on the broker
> > > > > > > > > >>>>>>>>>>>>>>>>>>> side, there should be only one controller request in the
> > > > > > > > > >>>>>>>>>>>>>>>>>>> controller request queue at any given time. If that is the case,
> > > > > > > > > >>>>>>>>>>>>>>>>>>> do we need a separate controller request queue capacity config?
> > > > > > > > > >>>>>>>>>>>>>>>>>>> The default value 20 means that we expect there are 20
> > > > > > > > > >>>>>>>>>>>>>>>>>>> controller switches to happen in a short period of time. I am
> > > > > > > > > >>>>>>>>>>>>>>>>>>> not sure whether someone should increase the controller request
> > > > > > > > > >>>>>>>>>>>>>>>>>>> queue capacity to handle such a case, as it seems to indicate
> > > > > > > > > >>>>>>>>>>>>>>>>>>> something very wrong has happened.
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>> On Fri, Jul 13, 2018 at 1:10 PM, Dong Lin <lindong28@gmail.com> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>>> Thanks for the update Lucas.
> > > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>> I think the motivation section is intuitive. It will be good to
> > > > > > > > > >>>>>>>>>>>>>>>>>>>> learn more about the comments from other reviewers.
> > > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>> On Thu, Jul 12, 2018 at 9:48 PM, Lucas Wang <lucasatucla@gmail.com> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>> I've updated the motivation section of the KIP by explaining the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>> cases that would have user impacts.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>> Please take a look and let me know your comments.
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Thanks,
> > > > > > > > >>>>>>>>>>>>>>>>>>>>> Lucas
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>> On Mon, Jul 9, 2018 at 5:53 PM, Lucas Wang <lucasatucla@gmail.com> wrote:
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Hi Dong,
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> The simulation of disk being slow is merely for me to easily
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> construct a testing scenario with a backlog of produce requests.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> In production, other than the disk being slow, a backlog of
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> produce requests may also be caused by high produce QPS.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> In that case, we may not want to kill the broker and that's when
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> this KIP can be useful, both for JBOD and non-JBOD setup.
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> Going back to your previous question about each ProduceRequest
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> covering 20 partitions that are randomly distributed, let's say
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> a LeaderAndIsr request is enqueued that tries to switch the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> current broker, say broker0, from leader to follower
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> *for one of the partitions*, say *test-0*. For the sake of
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> argument, let's also assume the other brokers, say broker1, have
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> *stopped* fetching from the current broker, i.e. broker0.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 1. If the enqueued produce requests have acks = -1 (ALL)
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  1.1 without this KIP, the ProduceRequests ahead of LeaderAndISR
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        will be put into the purgatory, and since they'll never be
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        replicated to other brokers (because of the assumption made
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        above), they will be completed either when the LeaderAndISR
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        request is processed or when the timeout happens.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  1.2 With this KIP, broker0 will immediately transition the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        partition test-0 to become a follower. After the current
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        broker sees the replication of the remaining 19 partitions,
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        it can send a response indicating that it's no longer the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        leader for "test-0".
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  To see the latency difference between 1.1 and 1.2, let's say
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  there are 24K produce requests ahead of the LeaderAndISR, and
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  there are 8 io threads, so each io thread will process
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  approximately 3000 produce requests. Now let's investigate the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  io thread that finally processed the LeaderAndISR.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  For the 3000 produce requests, let's model the times when their
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  remaining 19 partitions catch up as t0, t1, ..., t2999, and say
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  the LeaderAndISR request is processed at time t3000.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  Without this KIP, the 1st produce request would have waited an
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  extra t3000 - t0 time in the purgatory, the 2nd an extra time of
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  t3000 - t1, etc.
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  Roughly speaking, the latency difference is bigger for the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  earlier produce requests than for the later ones. For the same
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  reason, the more ProduceRequests queued before the LeaderAndISR,
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  the bigger benefit we get (capped by the produce timeout).
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> 2. If the enqueued produce requests have acks=0 or acks=1
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  There will be no latency differences in this case, but
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  2.1 without this KIP, the records of partition test-0 in the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        ProduceRequests ahead of the LeaderAndISR will be appended
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        to the local log, and eventually be truncated after
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        processing the LeaderAndISR. This is what's referred to as
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        "some unofficial definition of data loss in terms of
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        messages beyond the high watermark".
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>  2.2 with this KIP, we can mitigate the effect since if the
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        LeaderAndISR is immediately processed, the response to
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        producers will have the NotLeaderForPartition error,
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>>        causing producers to retry
> > > > > > > > >>>>>>>>>>>>>>>>>>>>>>
> > > > > > > > > >>>>>>>>>>>>>>>>>>>>>> This explanation above is the benefit for reducing
> >
>
>
> --
> -Regards,
> Mayuresh R. Gharat
> (862) 250-7125
>
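
For readers following the quoted acks = -1 example (24K queued produce
requests, 8 io threads, catch-up times t0 ... t2999 and LeaderAndISR processed
at t3000), the arithmetic can be sketched as below. This is only an
illustrative model, not Kafka code: the function names are hypothetical, and
treating the produce timeout as a simple cap on the extra wait is a
simplifying assumption.

```python
def requests_per_io_thread(queued_requests, io_threads):
    """Approximate share of the queued produce requests handled by one io thread."""
    return queued_requests // io_threads


def extra_purgatory_wait(catch_up_times, leader_and_isr_time, produce_timeout):
    """Extra time each acks=-1 request waits in purgatory without the KIP.

    catch_up_times[i] plays the role of t_i in the thread: the time at which
    the other 19 partitions of request i are replicated.
    leader_and_isr_time plays the role of t3000.
    Assumption: the produce timeout simply caps the extra wait.
    """
    return [
        max(0.0, min(leader_and_isr_time - t_i, produce_timeout))
        for t_i in catch_up_times
    ]


# 24K requests over 8 io threads gives the 3000 requests per thread in the thread.
per_thread = requests_per_io_thread(24_000, 8)

# Tiny stand-in for t0, t1, t2 with the LeaderAndISR processed at time 3.0.
waits = extra_purgatory_wait([0.0, 1.0, 2.0], 3.0, produce_timeout=30.0)
capped = extra_purgatory_wait([0.0, 1.0, 2.0], 3.0, produce_timeout=1.5)
```

Earlier requests get the larger extra wait, and a short timeout flattens the
difference, which matches the "capped by the produce timeout" remark.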
