kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guozhang Wang <wangg...@gmail.com>
Subject Re: [DISCUSSION] KIP-266: Add TimeoutException to KafkaConsumer#position()
Date Mon, 02 Apr 2018 00:35:53 GMT
Hi Richard,

Regarding the streams side changes, we plan to incorporate with the new
APIs once the KIP is done, which is only internal code changes and hence do
not need to include in the KIP.

Could you update the KIP because it has been quite obsoleted from the
discussed topics, and I'm a bit loosing track on what is your final
proposal right now. For example, I'm not completely following your "compromise
of sorts": are you suggesting that we still add overloading functions and
add a config that will be applied to all overload functions without the
timeout, while for other overloaded functions with the timeout value the
config will be ignored?


Guozhang

On Fri, Mar 30, 2018 at 8:36 PM, Richard Yu <yohan.richard.yu@gmail.com>
wrote:

> On a side note, I have noticed that the several other methods in classes
> such as StoreChangeLogReader in Streams calls position() which causes tests
> to hang. It might be out of the scope of the KIP, but should I also change
> the methods which use position() as a callback to at the very least prevent
> the tests from hanging? This issue might be out of the KIP, but I prefer it
> if we could at least make my PR pass the Jenkins Q&A.
>
> Thanks
>
> On Fri, Mar 30, 2018 at 8:24 PM, Richard Yu <yohan.richard.yu@gmail.com>
> wrote:
>
> > Thanks for the review Becket.
> >
> > About the methods beginningOffsets(), endOffsets(), ...:
> > I took a look through the code of KafkaConsumer, but after looking
> through
> > the offsetsByTimes() method
> > and its callbacks in Fetcher, I think these methods already block for a
> > set period of time. I know that there
> > is a chance that the offsets methods in KafkaConsumer might be like poll
> > (that is one section of the method
> > honors the timeout while another -- updateFetchPositions -- does not).
> > However, I don't think that this is the
> > case with offsetsByTimes since the callbacks that I checked does not seem
> > to hang.
> >
> > The clarity of the exception message is a problem. I thought your
> > suggestion there was reasonable. I included
> > it in the KIP.
> >
> > And on another note, I have noticed that several people has voiced the
> > opinion that adding a config might
> > be advisable in relation to adding an extra parameter. I think that we
> can
> > have a compromise of sorts: some
> > methods in KafkaConsumer are relatively similar -- for example,
> position()
> > and committed() both call
> > updateFetchPositions(). I think that we could use the same config for
> > these method as a default timeout if
> > the user does not provide one. On the other hand, if they wish to specify
> > a longer or shorter blocking time,
> > they have the option of changing the timeout. (I included the config as
> an
> > alternative in the KIP) WDYT?
> >
> > Thanks,
> > Richard
> >
> >
> > On Fri, Mar 30, 2018 at 1:26 AM, Becket Qin <becket.qin@gmail.com>
> wrote:
> >
> >> Glad to see the KIP, Richard. This has been a really long pending issue.
> >>
> >> The original arguments from Jay for using config, such as max.block.ms,
> >> instead of using timeout parameters was that people will always hard
> code
> >> the timeout, and the hard coded timeout is rarely correct because it has
> >> to
> >> consider different scenarios. For example, users may receive timeout
> >> exception when the group coordinator moves. Having a configuration with
> >> some reasonable default value will make users' life easier.
> >>
> >> That said, in practice, it seems more useful to have timeout parameters.
> >> We
> >> have seen some library, using the consumers internally, needs to provide
> >> an
> >> external flexible timeout interface. Also, user can easily hard code a
> >> value to get the same as a config based solution.
> >>
> >> The KIP looks good overall. A few comments:
> >>
> >> 1. There are a few other blocking methods that are not included, e.g.
> >> offsetsForTimes(), beginningOffsets(), endOffsets(). Is there any
> reason?
> >>
> >> 2. I am wondering can we take the KIP as a chance to clean up our
> timeout
> >> exception(s)? More specifically, instead of reusing TimeoutException,
> can
> >> we introduce a new ClientTimeoutException with different causes, e.g.
> >> UnknownTopicOrPartition, RequestTimeout, LeaderNotAvailable, etc.
> >> As of now, the TimeoutException is used in the following three cases:
> >>
> >>    1. TimeoutException is a subclass of ApiException which indicates the
> >>    exception was returned by the broker. The TimeoutException was
> >> initially
> >>    returned by the leaders when replication was not done within the
> >> specified
> >>    timeout in the ProduceRequest. It has an error code of 7, which is
> >> returned
> >>    by the broker.
> >>    2. When we migrate to Java clients, in Errors definition, we extended
> >> it
> >>    to indicate request timeout, i.e. a request was sent but the response
> >> was
> >>    not received before timeout. In this case, the clients did not have a
> >>    return code from the broker.
> >>    3. Later at some point, we started to use the TimeoutException for
> >>    clients method call timeout. It is neither related to any broker
> >> returned
> >>    error code, nor to request timeout on the wire.
> >>
> >> Due to the various interpretations, users can easily be confused. As an
> >> example, when a timeout is thrown with "Failed to refresh metadata in X
> >> ms", it is hard to tell what exactly happened. Since we are changing the
> >> API here, it would be good to avoid introducing more ambiguity and see
> >> whether this can be improved. It would be at least one step forward to
> >> remove the usage of case 3.
> >>
> >> Thanks,
> >>
> >> Jiangjie (Becket) Qin
> >>
> >>
> >>
> >>
> >> On Mon, Mar 26, 2018 at 5:50 PM, Guozhang Wang <wangguoz@gmail.com>
> >> wrote:
> >>
> >> > @Richard: TimeoutException inherits from RetriableException which
> >> inherits
> >> > from ApiException. So users should explicitly try to capture
> >> > RetriableException in their code and handle the exception.
> >> >
> >> > @Isamel, Ewen: I'm trying to push progress forward on this one, are we
> >> now
> >> > on the same page for using function parameters than configs?
> >> >
> >> >
> >> > Guozhang
> >> >
> >> >
> >> > On Fri, Mar 23, 2018 at 4:42 PM, Ismael Juma <ismael@juma.me.uk>
> wrote:
> >> >
> >> > > Hi Ewen,
> >> > >
> >> > > Yeah, I mentioned KAFKA-2391 where some of this was discussed. Jay
> was
> >> > > against having timeouts in the methods at the time. However, as
> Jason
> >> > said
> >> > > offline, we did end up with a timeout parameter in `poll`.
> >> > >
> >> > > Ismael
> >> > >
> >> > > On Fri, Mar 23, 2018 at 4:26 PM, Ewen Cheslack-Postava <
> >> > ewen@confluent.io>
> >> > > wrote:
> >> > >
> >> > > > Regarding the flexibility question, has someone tried to dig up
> the
> >> > > > discussion of the new consumer APIs when they were being written?
> I
> >> > > vaguely
> >> > > > recall these exact questions about using APIs vs configs and
> >> > flexibility
> >> > > vs
> >> > > > bloating the API surface area having already been discussed. (Not
> >> that
> >> > we
> >> > > > shouldn't revisit, just that it might also be a faster way to get
> >> to a
> >> > > full
> >> > > > understanding of the options, concerns, and tradeoffs).
> >> > > >
> >> > > > -Ewen
> >> > > >
> >> > > > On Thu, Mar 22, 2018 at 7:19 AM, Richard Yu <
> >> > yohan.richard.yu@gmail.com>
> >> > > > wrote:
> >> > > >
> >> > > > > I do have one question though: in the current KIP, throwing
> >> > > > > TimeoutException to mark
> >> > > > > that time limit is exceeded is applied to all new methods
> >> introduced
> >> > in
> >> > > > > this proposal.
> >> > > > > However, how would users respond when a TimeoutException (since
> >> it is
> >> > > > > considered
> >> > > > > a RuntimeException)?
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Richard
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Mon, Mar 19, 2018 at 6:10 PM, Richard Yu <
> >> > > yohan.richard.yu@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Hi Ismael,
> >> > > > > >
> >> > > > > > You have a great point. Since most of the methods in this KIP
> >> have
> >> > > > > similar
> >> > > > > > callbacks (position() and committed() both use
> >> > > fetchCommittedOffsets(),
> >> > > > > > and
> >> > > > > > commitSync() is similar to position(), except just updating
> >> > offsets),
> >> > > > the
> >> > > > > > amount of time
> >> > > > > > they block should be also about equal.
> >> > > > > >
> >> > > > > > However, I think that we need to take into account a couple of
> >> > > things.
> >> > > > > For
> >> > > > > > starters,
> >> > > > > > if the new methods were all reliant on one config, there is
> >> > > likelihood
> >> > > > > > that the
> >> > > > > > shortcomings for this approach would be similar to what we
> >> faced if
> >> > > we
> >> > > > > let
> >> > > > > > request.timeout.ms control all method timeouts. In
> comparison,
> >> > > adding
> >> > > > > > overloads
> >> > > > > > does not have this problem.
> >> > > > > >
> >> > > > > > If you have further thoughts, please let me know.
> >> > > > > >
> >> > > > > > Richard
> >> > > > > >
> >> > > > > >
> >> > > > > > On Mon, Mar 19, 2018 at 5:12 PM, Ismael Juma <
> ismael@juma.me.uk
> >> >
> >> > > > wrote:
> >> > > > > >
> >> > > > > >> Hi,
> >> > > > > >>
> >> > > > > >> An option that is not currently covered in the KIP is to
> have a
> >> > > > separate
> >> > > > > >> config max.block.ms, which is similar to the producer config
> >> with
> >> > > the
> >> > > > > >> same
> >> > > > > >> name. This came up during the KAFKA-2391 discussion. I think
> >> it's
> >> > > > clear
> >> > > > > >> that we can't rely on request.timeout.ms, so the decision is
> >> > > between
> >> > > > > >> adding
> >> > > > > >> overloads or adding a new config. People seemed to be leaning
> >> > > towards
> >> > > > > the
> >> > > > > >> latter in KAFKA-2391, but Jason makes a good point that the
> >> > > overloads
> >> > > > > are
> >> > > > > >> more flexible. A couple of questions from me:
> >> > > > > >>
> >> > > > > >> 1. Do we need the additional flexibility?
> >> > > > > >> 2. If we do, do we need it for every blocking method?
> >> > > > > >>
> >> > > > > >> Ismael
> >> > > > > >>
> >> > > > > >> On Mon, Mar 19, 2018 at 5:03 PM, Richard Yu <
> >> > > > yohan.richard.yu@gmail.com
> >> > > > > >
> >> > > > > >> wrote:
> >> > > > > >>
> >> > > > > >> > Hi Guozhang,
> >> > > > > >> >
> >> > > > > >> > I made some clarifications to KIP-266, namely:
> >> > > > > >> > 1. Stated more specifically that commitSync will accept
> user
> >> > > input.
> >> > > > > >> > 2. fetchCommittedOffsets(): Made its role in blocking more
> >> clear
> >> > > to
> >> > > > > the
> >> > > > > >> > reader.
> >> > > > > >> > 3. Sketched what would happen when time limit is exceeded.
> >> > > > > >> >
> >> > > > > >> > These changes should make the KIP easier to understand.
> >> > > > > >> >
> >> > > > > >> > Cheers,
> >> > > > > >> > Richard
> >> > > > > >> >
> >> > > > > >> > On Mon, Mar 19, 2018 at 9:33 AM, Guozhang Wang <
> >> > > wangguoz@gmail.com>
> >> > > > > >> wrote:
> >> > > > > >> >
> >> > > > > >> > > Hi Richard,
> >> > > > > >> > >
> >> > > > > >> > > I made a pass over the KIP again, some more
> clarifications
> >> /
> >> > > > > comments:
> >> > > > > >> > >
> >> > > > > >> > > 1. seek() call itself is not blocking, only the following
> >> > poll()
> >> > > > > call
> >> > > > > >> may
> >> > > > > >> > > be blocking as the actually metadata rq will happen.
> >> > > > > >> > >
> >> > > > > >> > > 2. I saw you did not include Consumer.partitionFor(),
> >> > > > > >> > > Consumer.OffsetAndTimestamp() and Consumer.listTopics()
> in
> >> > your
> >> > > > KIP.
> >> > > > > >> > After
> >> > > > > >> > > a second thought, I think this may be a better idea to
> not
> >> > > tackle
> >> > > > > >> them in
> >> > > > > >> > > the same KIP, and probably we should consider whether we
> >> would
> >> > > > > change
> >> > > > > >> the
> >> > > > > >> > > behavior or not in another discussion. So I agree to not
> >> > include
> >> > > > > them.
> >> > > > > >> > >
> >> > > > > >> > > 3. In your wiki you mentioned "Another change shall be
> >> made to
> >> > > > > >> > > KafkaConsumer#poll(), due to its call to
> >> > updateFetchPositions()
> >> > > > > which
> >> > > > > >> > > blocks indefinitely." This part may a bit obscure to most
> >> > > readers
> >> > > > > >> who's
> >> > > > > >> > not
> >> > > > > >> > > familiar with the KafkaConsumer internals, could you
> please
> >> > add
> >> > > > more
> >> > > > > >> > > elaborations. More specifically, I think the root causes
> of
> >> > the
> >> > > > > public
> >> > > > > >> > APIs
> >> > > > > >> > > mentioned are a bit different while the KIP's explanation
> >> > sounds
> >> > > > > like
> >> > > > > >> > they
> >> > > > > >> > > are due to the same reason:
> >> > > > > >> > >
> >> > > > > >> > > 3.1 fetchCommittedOffsets(): this internal call will
> block
> >> > > forever
> >> > > > > if
> >> > > > > >> the
> >> > > > > >> > > committed offsets cannot be fetched successfully and
> affect
> >> > > > > position()
> >> > > > > >> > and
> >> > > > > >> > > committed(). We need to break out of its internal while
> >> loop.
> >> > > > > >> > > 3.2 position() itself will while loop when offsets cannot
> >> be
> >> > > > > >> retrieved in
> >> > > > > >> > > the underlying async call. We need to break out this
> while
> >> > loop.
> >> > > > > >> > > 3.3 commitSync() passed Long.MAX_VALUE as the timeout
> >> value,
> >> > we
> >> > > > > should
> >> > > > > >> > take
> >> > > > > >> > > the user specified timeouts when applicable.
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > > Guozhang
> >> > > > > >> > >
> >> > > > > >> > > On Sat, Mar 17, 2018 at 4:44 PM, Richard Yu <
> >> > > > > >> yohan.richard.yu@gmail.com>
> >> > > > > >> > > wrote:
> >> > > > > >> > >
> >> > > > > >> > > > Actually, what I said above is inaccurate. In
> >> > > > > >> > > > testSeekAndCommitWithBrokerFailures,
> >> > TestUtils.waitUntilTrue
> >> > > > > >> blocks,
> >> > > > > >> > not
> >> > > > > >> > > > seek.
> >> > > > > >> > > > My assumption is that seek did not update correctly. I
> >> will
> >> > be
> >> > > > > >> digging
> >> > > > > >> > > > further into this.
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > >
> >> > > > > >> > > > On Sat, Mar 17, 2018 at 4:16 PM, Richard Yu <
> >> > > > > >> > yohan.richard.yu@gmail.com>
> >> > > > > >> > > > wrote:
> >> > > > > >> > > >
> >> > > > > >> > > > > One more thing: when looking through tests, I have
> >> > realized
> >> > > > that
> >> > > > > >> > seek()
> >> > > > > >> > > > > methods can potentially block indefinitely. As you
> well
> >> > > know,
> >> > > > > >> seek()
> >> > > > > >> > is
> >> > > > > >> > > > > called when pollOnce() or position() is active. Thus,
> >> if
> >> > > > > >> position()
> >> > > > > >> > > > blocks
> >> > > > > >> > > > > indefinitely, then so would seek(). Should bounding
> >> seek()
> >> > > > also
> >> > > > > be
> >> > > > > >> > > > included
> >> > > > > >> > > > > in this KIP?
> >> > > > > >> > > > >
> >> > > > > >> > > > > Thanks, Richard
> >> > > > > >> > > > >
> >> > > > > >> > > > > On Sat, Mar 17, 2018 at 1:16 PM, Richard Yu <
> >> > > > > >> > > yohan.richard.yu@gmail.com>
> >> > > > > >> > > > > wrote:
> >> > > > > >> > > > >
> >> > > > > >> > > > >> Thanks for the advice, Jason
> >> > > > > >> > > > >>
> >> > > > > >> > > > >> I have modified KIP-266 to include the java doc for
> >> > > > committed()
> >> > > > > >> and
> >> > > > > >> > > > other
> >> > > > > >> > > > >> blocking methods, and I also
> >> > > > > >> > > > >> mentioned poll() which will also be bounded. Let me
> >> know
> >> > if
> >> > > > > >> there is
> >> > > > > >> > > > >> anything else. :)
> >> > > > > >> > > > >>
> >> > > > > >> > > > >> Sincerely, Richard
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >> On Sat, Mar 17, 2018 at 12:00 PM, Jason Gustafson <
> >> > > > > >> > jason@confluent.io
> >> > > > > >> > > >
> >> > > > > >> > > > >> wrote:
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>> Hi Richard,
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> Thanks for the updates. I'm really glad you picked
> >> this
> >> > > up.
> >> > > > A
> >> > > > > >> > couple
> >> > > > > >> > > > >>> minor
> >> > > > > >> > > > >>> comments:
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> 1. Can you list the full set of new APIs explicitly
> >> in
> >> > the
> >> > > > > KIP?
> >> > > > > >> > > > >>> Currently I
> >> > > > > >> > > > >>> only see the javadoc for `position()`.
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> 2. We should consider adding `TimeUnit` to the new
> >> > methods
> >> > > > to
> >> > > > > >> avoid
> >> > > > > >> > > > unit
> >> > > > > >> > > > >>> confusion. I know it's inconsistent with the poll()
> >> API,
> >> > > > but I
> >> > > > > >> > think
> >> > > > > >> > > it
> >> > > > > >> > > > >>> was
> >> > > > > >> > > > >>> probably a mistake not to include it there, so
> better
> >> > not
> >> > > to
> >> > > > > >> double
> >> > > > > >> > > > down
> >> > > > > >> > > > >>> on
> >> > > > > >> > > > >>> that mistake. And note that we do already have
> >> > > `close(long,
> >> > > > > >> > > TimeUnit)`.
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> Other than that, I think the current KIP seems
> >> > reasonable.
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> Thanks,
> >> > > > > >> > > > >>> Jason
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> On Wed, Mar 14, 2018 at 5:00 PM, Richard Yu <
> >> > > > > >> > > > yohan.richard.yu@gmail.com>
> >> > > > > >> > > > >>> wrote:
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>> > Note to all: I have included bounding
> commitSync()
> >> and
> >> > > > > >> > committed()
> >> > > > > >> > > in
> >> > > > > >> > > > >>> this
> >> > > > > >> > > > >>> > KIP.
> >> > > > > >> > > > >>> >
> >> > > > > >> > > > >>> > On Sun, Mar 11, 2018 at 5:05 PM, Richard Yu <
> >> > > > > >> > > > >>> yohan.richard.yu@gmail.com>
> >> > > > > >> > > > >>> > wrote:
> >> > > > > >> > > > >>> >
> >> > > > > >> > > > >>> > > Hi all,
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> > > I updated the KIP where overloading position()
> is
> >> > now
> >> > > > the
> >> > > > > >> > favored
> >> > > > > >> > > > >>> > approach.
> >> > > > > >> > > > >>> > > Bounding position() using requestTimeoutMs has
> >> been
> >> > > > listed
> >> > > > > >> as
> >> > > > > >> > > > >>> rejected.
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> > > Any thoughts?
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> > > On Tue, Mar 6, 2018 at 6:00 PM, Guozhang Wang <
> >> > > > > >> > > wangguoz@gmail.com>
> >> > > > > >> > > > >>> > wrote:
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> > >> I agree that adding the overloads is most
> >> flexible.
> >> > > But
> >> > > > > >> going
> >> > > > > >> > > for
> >> > > > > >> > > > >>> that
> >> > > > > >> > > > >>> > >> direction we'd do that for all the blocking
> call
> >> > that
> >> > > > > I've
> >> > > > > >> > > listed
> >> > > > > >> > > > >>> above,
> >> > > > > >> > > > >>> > >> with this timeout value covering the
> end-to-end
> >> > > waiting
> >> > > > > >> time.
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >> Guozhang
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >> On Tue, Mar 6, 2018 at 10:02 AM, Ted Yu <
> >> > > > > >> yuzhihong@gmail.com>
> >> > > > > >> > > > >>> wrote:
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >> > bq. The most flexible option is to add
> >> overloads
> >> > to
> >> > > > the
> >> > > > > >> > > consumer
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > This option is flexible.
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > Looking at the tail of SPARK-18057, Spark
> dev
> >> > > voiced
> >> > > > > the
> >> > > > > >> > same
> >> > > > > >> > > > >>> choice.
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > +1 for adding overload with timeout
> parameter.
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > Cheers
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > On Mon, Mar 5, 2018 at 2:42 PM, Jason
> >> Gustafson <
> >> > > > > >> > > > >>> jason@confluent.io>
> >> > > > > >> > > > >>> > >> > wrote:
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >> > > @Guozhang I probably have suggested all
> >> options
> >> > > at
> >> > > > > some
> >> > > > > >> > > point
> >> > > > > >> > > > or
> >> > > > > >> > > > >>> > >> another,
> >> > > > > >> > > > >>> > >> > > including most recently, the current KIP!
> I
> >> was
> >> > > > > >> thinking
> >> > > > > >> > > that
> >> > > > > >> > > > >>> > >> practically
> >> > > > > >> > > > >>> > >> > > speaking, the request timeout defines how
> >> long
> >> > > the
> >> > > > > >> user is
> >> > > > > >> > > > >>> willing
> >> > > > > >> > > > >>> > to
> >> > > > > >> > > > >>> > >> > wait
> >> > > > > >> > > > >>> > >> > > for a response. The consumer doesn't
> really
> >> > have
> >> > > a
> >> > > > > >> complex
> >> > > > > >> > > > send
> >> > > > > >> > > > >>> > >> process
> >> > > > > >> > > > >>> > >> > > like the producer for any of these APIs,
> so
> >> I
> >> > > > wasn't
> >> > > > > >> sure
> >> > > > > >> > > how
> >> > > > > >> > > > >>> much
> >> > > > > >> > > > >>> > >> > benefit
> >> > > > > >> > > > >>> > >> > > there would be from having more granular
> >> > control
> >> > > > over
> >> > > > > >> > > timeouts
> >> > > > > >> > > > >>> (in
> >> > > > > >> > > > >>> > the
> >> > > > > >> > > > >>> > >> > end,
> >> > > > > >> > > > >>> > >> > > KIP-91 just adds a single timeout to
> control
> >> > the
> >> > > > > whole
> >> > > > > >> > > send).
> >> > > > > >> > > > >>> That
> >> > > > > >> > > > >>> > >> said,
> >> > > > > >> > > > >>> > >> > it
> >> > > > > >> > > > >>> > >> > > might indeed be better to avoid
> overloading
> >> the
> >> > > > > config
> >> > > > > >> as
> >> > > > > >> > > you
> >> > > > > >> > > > >>> > suggest
> >> > > > > >> > > > >>> > >> > since
> >> > > > > >> > > > >>> > >> > > at least it avoids inconsistency with the
> >> > > > producer's
> >> > > > > >> > usage.
> >> > > > > >> > > > >>> > >> > >
> >> > > > > >> > > > >>> > >> > > The most flexible option is to add
> >> overloads to
> >> > > the
> >> > > > > >> > consumer
> >> > > > > >> > > > so
> >> > > > > >> > > > >>> that
> >> > > > > >> > > > >>> > >> > users
> >> > > > > >> > > > >>> > >> > > can pass the timeout directly. I'm not
> sure
> >> if
> >> > > that
> >> > > > > is
> >> > > > > >> > more
> >> > > > > >> > > or
> >> > > > > >> > > > >>> less
> >> > > > > >> > > > >>> > >> > > annoying than a new config, but I've found
> >> > config
> >> > > > > >> > timeouts a
> >> > > > > >> > > > >>> little
> >> > > > > >> > > > >>> > >> > > constraining in practice. For example, I
> >> could
> >> > > > > imagine
> >> > > > > >> > users
> >> > > > > >> > > > >>> wanting
> >> > > > > >> > > > >>> > >> to
> >> > > > > >> > > > >>> > >> > > wait longer for an offset commit operation
> >> > than a
> >> > > > > >> position
> >> > > > > >> > > > >>> lookup;
> >> > > > > >> > > > >>> > if
> >> > > > > >> > > > >>> > >> the
> >> > > > > >> > > > >>> > >> > > latter isn't timely, users can just pause
> >> the
> >> > > > > partition
> >> > > > > >> > and
> >> > > > > >> > > > >>> continue
> >> > > > > >> > > > >>> > >> > > fetching on others. If you cannot commit
> >> > offsets,
> >> > > > > >> however,
> >> > > > > >> > > it
> >> > > > > >> > > > >>> might
> >> > > > > >> > > > >>> > be
> >> > > > > >> > > > >>> > >> > > safer for an application to wait
> >> availability
> >> > of
> >> > > > the
> >> > > > > >> > > > coordinator
> >> > > > > >> > > > >>> > than
> >> > > > > >> > > > >>> > >> > > continuing.
> >> > > > > >> > > > >>> > >> > >
> >> > > > > >> > > > >>> > >> > > -Jason
> >> > > > > >> > > > >>> > >> > >
> >> > > > > >> > > > >>> > >> > > On Sun, Mar 4, 2018 at 10:14 PM, Guozhang
> >> Wang
> >> > <
> >> > > > > >> > > > >>> wangguoz@gmail.com>
> >> > > > > >> > > > >>> > >> > wrote:
> >> > > > > >> > > > >>> > >> > >
> >> > > > > >> > > > >>> > >> > > > Hello Richard,
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > Thanks for the proposed KIP. I have a
> >> couple
> >> > of
> >> > > > > >> general
> >> > > > > >> > > > >>> comments:
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > 1. I'm not sure if piggy-backing the
> >> timeout
> >> > > > > >> exception
> >> > > > > >> > on
> >> > > > > >> > > > the
> >> > > > > >> > > > >>> > >> > > > existing requestTimeoutMs configured in
> "
> >> > > > > >> > > request.timeout.ms
> >> > > > > >> > > > "
> >> > > > > >> > > > >>> is a
> >> > > > > >> > > > >>> > >> good
> >> > > > > >> > > > >>> > >> > > > idea
> >> > > > > >> > > > >>> > >> > > > since a) it is a general config that
> >> applies
> >> > > for
> >> > > > > all
> >> > > > > >> > types
> >> > > > > >> > > > of
> >> > > > > >> > > > >>> > >> requests,
> >> > > > > >> > > > >>> > >> > > and
> >> > > > > >> > > > >>> > >> > > > 2) using it to cover all the phases of
> an
> >> API
> >> > > > call,
> >> > > > > >> > > > including
> >> > > > > >> > > > >>> > >> network
> >> > > > > >> > > > >>> > >> > > round
> >> > > > > >> > > > >>> > >> > > > trip and potential metadata refresh is
> >> shown
> >> > to
> >> > > > not
> >> > > > > >> be a
> >> > > > > >> > > > good
> >> > > > > >> > > > >>> > idea,
> >> > > > > >> > > > >>> > >> as
> >> > > > > >> > > > >>> > >> > > > illustrated in KIP-91:
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > https://cwiki.apache.org/confl
> >> > > > > >> uence/display/KAFKA/KIP-
> >> > > > > >> > > > >>> > >> > > > 91+Provide+Intuitive+User+
> >> > > > Timeouts+in+The+Producer
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > In fact, I think in KAFKA-4879 which is
> >> aimed
> >> > > for
> >> > > > > the
> >> > > > > >> > same
> >> > > > > >> > > > >>> issue
> >> > > > > >> > > > >>> > as
> >> > > > > >> > > > >>> > >> > > > KAFKA-6608,
> >> > > > > >> > > > >>> > >> > > > Jason has suggested we use a new config
> >> for
> >> > the
> >> > > > > API.
> >> > > > > >> > Maybe
> >> > > > > >> > > > >>> this
> >> > > > > >> > > > >>> > >> would
> >> > > > > >> > > > >>> > >> > be
> >> > > > > >> > > > >>> > >> > > a
> >> > > > > >> > > > >>> > >> > > > more intuitive manner than reusing the
> >> > > > > >> > request.timeout.ms
> >> > > > > >> > > > >>> config.
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > 2. Besides the Consumer.position() call,
> >> > there
> >> > > > are
> >> > > > > a
> >> > > > > >> > > couple
> >> > > > > >> > > > of
> >> > > > > >> > > > >>> > more
> >> > > > > >> > > > >>> > >> > > > blocking calls today that could result
> in
> >> > > > infinite
> >> > > > > >> > > blocking:
> >> > > > > >> > > > >>> > >> > > > Consumer.commitSync() and
> >> > Consumer.committed(),
> >> > > > > >> should
> >> > > > > >> > > they
> >> > > > > >> > > > be
> >> > > > > >> > > > >>> > >> > considered
> >> > > > > >> > > > >>> > >> > > > in this KIP as well?
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > 3. There are a few other APIs that are
> >> today
> >> > > > > relying
> >> > > > > >> on
> >> > > > > >> > > > >>> > >> > > request.timeout.ms
> >> > > > > >> > > > >>> > >> > > > already for breaking the infinite
> >> blocking,
> >> > > > namely
> >> > > > > >> > > > >>> > >> > > Consumer.partitionFor(),
> >> > > > > >> > > > >>> > >> > > > Consumer.OffsetAndTimestamp() and
> >> > > > > >> Consumer.listTopics(),
> >> > > > > >> > > if
> >> > > > > >> > > > >>> we are
> >> > > > > >> > > > >>> > >> > making
> >> > > > > >> > > > >>> > >> > > > the other blocking calls to be relying a
> >> new
> >> > > > config
> >> > > > > >> as
> >> > > > > >> > > > >>> suggested
> >> > > > > >> > > > >>> > in
> >> > > > > >> > > > >>> > >> 1)
> >> > > > > >> > > > >>> > >> > > > above, should we also change the
> >> semantics of
> >> > > > these
> >> > > > > >> API
> >> > > > > >> > > > >>> functions
> >> > > > > >> > > > >>> > >> for
> >> > > > > >> > > > >>> > >> > > > consistency?
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > Guozhang
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > On Sun, Mar 4, 2018 at 11:13 AM, Richard
> >> Yu <
> >> > > > > >> > > > >>> > >> > yohan.richard.yu@gmail.com>
> >> > > > > >> > > > >>> > >> > > > wrote:
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > > Hi all,
> >> > > > > >> > > > >>> > >> > > > >
> >> > > > > >> > > > >>> > >> > > > > I would like to discuss a potential
> >> change
> >> > > > which
> >> > > > > >> would
> >> > > > > >> > > be
> >> > > > > >> > > > >>> made
> >> > > > > >> > > > >>> > to
> >> > > > > >> > > > >>> > >> > > > > KafkaConsumer:
> >> > > > > >> > > > >>> > >> > > > > https://cwiki.apache.org/
> >> > > > > confluence/pages/viewpage
> >> > > > > >> .
> >> > > > > >> > > > >>> > >> > > > action?pageId=75974886
> >> > > > > >> > > > >>> > >> > > > >
> >> > > > > >> > > > >>> > >> > > > > Thanks,
> >> > > > > >> > > > >>> > >> > > > > Richard Yu
> >> > > > > >> > > > >>> > >> > > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > > > --
> >> > > > > >> > > > >>> > >> > > > -- Guozhang
> >> > > > > >> > > > >>> > >> > > >
> >> > > > > >> > > > >>> > >> > >
> >> > > > > >> > > > >>> > >> >
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >> --
> >> > > > > >> > > > >>> > >> -- Guozhang
> >> > > > > >> > > > >>> > >>
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> > >
> >> > > > > >> > > > >>> >
> >> > > > > >> > > > >>>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >>
> >> > > > > >> > > > >
> >> > > > > >> > > >
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > >
> >> > > > > >> > > --
> >> > > > > >> > > -- Guozhang
> >> > > > > >> > >
> >> > > > > >> >
> >> > > > > >>
> >> > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >> >
> >> >
> >> > --
> >> > -- Guozhang
> >> >
> >>
> >
> >
>



-- 
-- Guozhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message