kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Matt Farmer <m...@frmr.me>
Subject Re: [DISCUSS] KIP-210: Provide for custom error handling when Kafka Streams fails to produce
Date Fri, 01 Dec 2017 19:43:38 GMT
Bump! It's been three days here and I haven't seen any further feedback.
Eager to get this resolved, approved, and merged. =)

On Tue, Nov 28, 2017 at 9:53 AM Matt Farmer <matt@frmr.me> wrote:

> Hi there, sorry for the delay in responding. Last week had a holiday and
> several busy work days in it so I'm just now getting around to responding.
>
> > We would only exclude
> > exception Streams can handle itself (like ProducerFencedException) --
> > thus, if the handler has code to react to this, it would not be bad, as
> > this code is just never called.
> [...]
> > Thus, I am still in favor of not calling the ProductionExceptionHandler
> > for fatal exception.
>
> Acknowledged, so is ProducerFencedException the only kind of exception I
> need to change my behavior on? Or are there other types I need to check? Is
> there a comprehensive list somewhere?
>
> > About the "always continue" case. Sounds good to me to remove it (not
> > sure why we need it in test package?)
>
> I include it in the test package because I have tests that assert that if
> the record collector impl encounters an Exception and receives a CONTINUE
> that it actually does CONTINUE.
>
> > What is there reasoning for invoking the handler only for the first
> > exception?
>
> I didn't want to invoke the handler in places where the CONTINUE or FAIL
> result would just be ignored. Presumably, after a FAIL has been returned,
> subsequent exceptions are likely to be repeats or noise from my
> understanding of the code paths. If this is incorrect we can revisit.
>
> Once I get the answers to these questions I can make another pass on the
> pull request!
>
> Matt
>
> On Mon, Nov 20, 2017 at 4:07 PM Matthias J. Sax <matthias@confluent.io>
> wrote:
>
>> Thanks for following up!
>>
>> One thought about an older reply from you:
>>
>> >>>> I strongly disagree here. The purpose of this handler isn't *just* to
>> >>>> make a decision for streams. There may also be desirable side
>> effects that
>> >>>> users wish to cause when production exceptions occur. There may be
>> >>>> side-effects that they wish to cause when AuthenticationExceptions
>> occur,
>> >>>> as well. We should still give them the hooks to preform those side
>> effects.
>>
>> And your follow up:
>>
>> >>    - I think I would rather invoke it for all exceptions that could
>> occur
>> >>    from attempting to produce - even those exceptions were returning
>> CONTINUE
>> >>    may not be a good idea (e.g. Authorization exception). Until there
>> is a
>> >>    different type for exceptions that are totally fatal (for example a
>> >>    FatalStreamsException or some sort), maintaining a list of
>> exceptions that
>> >>    you can intercept with this handler and exceptions you cannot would
>> be
>> >>    really error-prone and hard to keep correct.
>>
>> I understand what you are saying, however, consider that Streams needs
>> to die for a fatal exception. Thus, if you call the handler for a fatal
>> exception, we would  need to ignore the return value and fail -- this
>> seems to be rather intuitive. Furthermore, users can register an
>> uncaught-exception-handler and side effects for fatal errors can be
>> triggered there.
>>
>> Btw: an AuthorizationException is fatal -- not sure what you mean by an
>> "totally fatal" exception -- there is no superlative to fatal from my
>> understanding.
>>
>> About maintaining a list of exceptions: I don't think this is too hard,
>> and users also don't need to worry about it IMHO. We would only exclude
>> exception Streams can handle itself (like ProducerFencedException) --
>> thus, if the handler has code to react to this, it would not be bad, as
>> this code is just never called. In case that there is an exception
>> Streams could actually handle but we still call the handler (ie, bug),
>> there should be no harm either -- the handler needs to return either
>> CONTINUE or FAIL and we would obey. It could only happen, that Streams
>> dies---as request by the user(!)---even if Streams could actually handle
>> the error and move on. But this seems to be not a issue from my point of
>> view.
>>
>> Thus, I am still in favor of not calling the ProductionExceptionHandler
>> for fatal exception.
>>
>>
>>
>> About the "always continue" case. Sounds good to me to remove it (not
>> sure why we need it in test package?) and to rename the "failing"
>> handler to "Default" (even if "default" is less descriptive and I would
>> still prefer "Fail" in the name).
>>
>>
>> Last question:
>>
>> >>       - Continue to *only* invoke it on the first exception that we
>> >>       encounter (before sendException is set)
>>
>>
>> What is there reasoning for invoking the handler only for the first
>> exception?
>>
>>
>>
>>
>> -Matthias
>>
>> On 11/20/17 10:43 AM, Matt Farmer wrote:
>> > Alright, here are some updates I'm planning to make after thinking on
>> this
>> > for awhile:
>> >
>> >    - Given that the "always continue" handler isn't something I'd
>> recommend
>> >    for production use as is, I'm going to move it into the test
>> namespace and
>> >    remove it from mention in the public API.
>> >    - I'm going to rename the "AlwaysFailProductionExceptionHandler" to
>> >    "DefaultProductionExceptionHandler"
>> >    - Given that the API for the exception handler involves being
>> invoked by
>> >    streams to make a decision about whether or not to continue — I
>> think that
>> >    we should:
>> >       - Continue to *only* invoke it on the first exception that we
>> >       encounter (before sendException is set)
>> >       - Stop invoking it for the self-healing fenced exceptions.
>> >    - I think I would rather invoke it for all exceptions that could
>> occur
>> >    from attempting to produce - even those exceptions were returning
>> CONTINUE
>> >    may not be a good idea (e.g. Authorization exception). Until there
>> is a
>> >    different type for exceptions that are totally fatal (for example a
>> >    FatalStreamsException or some sort), maintaining a list of
>> exceptions that
>> >    you can intercept with this handler and exceptions you cannot would
>> be
>> >    really error-prone and hard to keep correct.
>> >       - I'm happy to file a KIP for the creation of this new Exception
>> type
>> >       if there is interest.
>> >
>> > @ Matthias — What do you think about the above?
>> >
>> > On Tue, Nov 14, 2017 at 9:44 AM Matt Farmer <matt@frmr.me> wrote:
>> >
>> >> I responded before reading your code review and didn't see the bit
>> about
>> >> how ProducerFencedException is self-healing. This error handling logic
>> is
>> >> *quite* confusing to reason about... I think I'm going to sit down with
>> >> the code a bit more today, but I'm inclined to think that if the fenced
>> >> exceptions are, indeed, self healing that we still invoke the handler
>> but
>> >> ignore its result for those exceptions.
>> >>
>> >> On Tue, Nov 14, 2017 at 9:37 AM Matt Farmer <matt@frmr.me> wrote:
>> >>
>> >>> Hi there,
>> >>>
>> >>> Following up here...
>> >>>
>> >>>> One tiny comment: I would prefer to remove the "Always" from the
>> >>> handler implementation names -- it sounds "cleaner" to me without it.
>> >>>
>> >>> Let me think on this. I generally prefer expressiveness to clean-ness,
>> >>> and I think that comes out in the names I chose for things. The straw
>> man
>> >>> in my head is the person that tried to substitute in the
>> "AlwaysContinue"
>> >>> variant and the "Always" is a trigger to really consider whether or
>> not
>> >>> they *always* want to try to continue.
>> >>>
>> >>> To be truthful, after some thought, using the "AlwaysContinue" variant
>> >>> isn't something I'd recommend generally in a production system.
>> Ideally
>> >>> these handlers are targeted at handling a specific series of
>> exceptions
>> >>> that a user wants to ignore, and not ignoring all exceptions. More on
>> this
>> >>> in a minute.
>> >>>
>> >>>> For the first category, it seems to not make sense to call the
>> handle but
>> >>> Streams should always fail. If we follow this design, the KIP should
>> list
>> >>> all exceptions for which the handler is not called.
>> >>>
>> >>> I strongly disagree here. The purpose of this handler isn't *just* to
>> >>> make a decision for streams. There may also be desirable side effects
>> that
>> >>> users wish to cause when production exceptions occur. There may be
>> >>> side-effects that they wish to cause when AuthenticationExceptions
>> occur,
>> >>> as well. We should still give them the hooks to preform those side
>> effects.
>> >>>
>> >>> In light of the above, I'm thinking that the
>> >>> "AlwaysContinueProductionExceptionHandler" variant should probably be
>> >>> removed from the public API and moved into tests since that's where
>> it's
>> >>> most useful. The more I think on it, the more I feel that having that
>> in
>> >>> the public API is a landmine. If you agree, then, we can rename the
>> >>> "AlwaysFailProductionExceptionHandler" to
>> >>> "DefaultProductionExceptionHandler".
>> >>>
>> >>> Thoughts?
>> >>>
>> >>> On Fri, Nov 10, 2017 at 6:13 PM Matthias J. Sax <
>> matthias@confluent.io>
>> >>> wrote:
>> >>>
>> >>>> I just review the PR, and there is one thing we should discuss.
>> >>>>
>> >>>> There are different types of exceptions that could occur. Some apply
>> to
>> >>>> all records (like Authorization exception) while others are for
>> single
>> >>>> records only (like record too large).
>> >>>>
>> >>>> For the first category, it seems to not make sense to call the handle
>> >>>> but Streams should always fail. If we follow this design, the KIP
>> should
>> >>>> list all exceptions for which the handler is not called.
>> >>>>
>> >>>> WDYT?
>> >>>>
>> >>>>
>> >>>> -Matthias
>> >>>>
>> >>>>
>> >>>> On 11/10/17 2:56 PM, Matthias J. Sax wrote:
>> >>>>> Just catching up on this KIP.
>> >>>>>
>> >>>>> One tiny comment: I would prefer to remove the "Always" from the
>> >>>> handler
>> >>>>> implementation names -- it sounds "cleaner" to me without it.
>> >>>>>
>> >>>>>
>> >>>>> -Matthias
>> >>>>>
>> >>>>> On 11/5/17 12:57 PM, Matt Farmer wrote:
>> >>>>>> It is agreed, then. I've updated the pull request. I'm trying to
>> also
>> >>>>>> update the KIP accordingly, but cwiki is being slow and dropping
>> >>>>>> connections..... I'll try again a bit later but please consider the
>> >>>> KIP
>> >>>>>> updated for all intents and purposes. heh.
>> >>>>>>
>> >>>>>> On Sun, Nov 5, 2017 at 3:45 PM Guozhang Wang <wangguoz@gmail.com>
>> >>>> wrote:
>> >>>>>>
>> >>>>>>> That makes sense.
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> Guozhang
>> >>>>>>>
>> >>>>>>> On Sun, Nov 5, 2017 at 12:33 PM, Matt Farmer <matt@frmr.me>
>> wrote:
>> >>>>>>>
>> >>>>>>>> Interesting. I'm not sure I agree. I've been bitten many times by
>> >>>>>>>> unintentionally shipping code that fails to properly implement
>> >>>> logging. I
>> >>>>>>>> always discover this at the exact *worst* moment, too. (Normally
>> at
>> >>>> 3 AM
>> >>>>>>>> during an on-call shift. Hah.) However, if others feel the same
>> way
>> >>>> I
>> >>>>>>> could
>> >>>>>>>> probably be convinced to remove it.
>> >>>>>>>>
>> >>>>>>>> We could also meet halfway and say that when a customized
>> >>>>>>>> ProductionExceptionHandler instructs Streams to CONTINUE, we log
>> at
>> >>>> DEBUG
>> >>>>>>>> level instead of WARN level. Would that alternative be appealing
>> to
>> >>>> you?
>> >>>>>>>>
>> >>>>>>>> On Sun, Nov 5, 2017 at 12:32 PM Guozhang Wang <
>> wangguoz@gmail.com>
>> >>>>>>> wrote:
>> >>>>>>>>
>> >>>>>>>>> Thanks for the updates. I made a pass over the wiki again and it
>> >>>> looks
>> >>>>>>>>> good.
>> >>>>>>>>>
>> >>>>>>>>> About whether record collector should still internally log the
>> >>>> error in
>> >>>>>>>>> addition to what the customized ProductionExceptionHandler
>> does. I
>> >>>>>>>>> personally would prefer only to log if the returned value is
>> FAIL
>> >>>> to
>> >>>>>>>>> indicate that this thread is going to shutdown and trigger the
>> >>>>>>> exception
>> >>>>>>>>> handler.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> Guozhang
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Sun, Nov 5, 2017 at 6:09 AM, Matt Farmer <matt@frmr.me>
>> wrote:
>> >>>>>>>>>
>> >>>>>>>>>> Hello, a bit later than I'd anticipated, but I've updated this
>> >>>> KIP as
>> >>>>>>>>>> outlined above. The updated KIP is now ready for review again!
>> >>>>>>>>>>
>> >>>>>>>>>> On Sat, Nov 4, 2017 at 1:03 PM Matt Farmer <matt@frmr.me>
>> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>> Ah. I actually created both of those in the PR and forgot to
>> >>>>>>> mention
>> >>>>>>>>> them
>> >>>>>>>>>>> by name in the KIP! Thanks for pointing out the oversight.
>> >>>>>>>>>>>
>> >>>>>>>>>>> I’ll revise the KIP this afternoon accordingly.
>> >>>>>>>>>>>
>> >>>>>>>>>>> The logging is actually provided for in the record collector.
>> >>>>>>>> Whenever
>> >>>>>>>>> a
>> >>>>>>>>>>> handler continues it’ll log a warning to ensure that it’s
>> >>>>>>>> *impossible*
>> >>>>>>>>> to
>> >>>>>>>>>>> write a handler that totally suppresses production exceptions
>> >>>> from
>> >>>>>>>> the
>> >>>>>>>>>> log.
>> >>>>>>>>>>> As such, the default continue handler just returns the
>> continue
>> >>>>>>>> value.
>> >>>>>>>>> I
>> >>>>>>>>>>> can add details about those semantics to the KIP as well.
>> >>>>>>>>>>> On Sat, Nov 4, 2017 at 12:46 PM Matthias J. Sax <
>> >>>>>>>> matthias@confluent.io
>> >>>>>>>>>>
>> >>>>>>>>>>> wrote:
>> >>>>>>>>>>>
>> >>>>>>>>>>>> One more comment.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> You mention a default implementation for the handler that
>> >>>> fails. I
>> >>>>>>>>>>>> think, this should be part of the public API and thus should
>> >>>> have
>> >>>>>>> a
>> >>>>>>>>>>>> proper defined named that is mentioned in the KIP.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> We could also add a second implementation for the
>> >>>> log-and-move-on
>> >>>>>>>>>>>> strategy, as both are the two most common cases. This class
>> >>>> should
>> >>>>>>>>> also
>> >>>>>>>>>>>> be part of public API (so users can just set in the config)
>> >>>> with a
>> >>>>>>>>>>>> proper name.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Otherwise, I like the KIP a lot! Thanks.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> -Matthias
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> On 11/1/17 12:23 AM, Matt Farmer wrote:
>> >>>>>>>>>>>>> Thanks for the heads up. Yes, I think my changes are
>> compatible
>> >>>>>>>> with
>> >>>>>>>>>>>> that
>> >>>>>>>>>>>>> PR, but there will be a merge conflict that happens whenever
>> >>>> one
>> >>>>>>>> of
>> >>>>>>>>>> the
>> >>>>>>>>>>>> PRs
>> >>>>>>>>>>>>> is merged. Happy to reconcile the changes in my PR if 4148
>> goes
>> >>>>>>> in
>> >>>>>>>>>>>> first. :)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Tue, Oct 31, 2017 at 6:44 PM Guozhang Wang <
>> >>>>>>> wangguoz@gmail.com
>> >>>>>>>>>
>> >>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>> That sounds reasonable, thanks Matt.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> As for the implementation, please note that there is
>> another
>> >>>>>>>>> ongoing
>> >>>>>>>>>> PR
>> >>>>>>>>>>>>>> that may touch the same classes that you are working on:
>> >>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4148
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> So it may help if you can also take a look at that PR and
>> see
>> >>>>>>> if
>> >>>>>>>> it
>> >>>>>>>>>> is
>> >>>>>>>>>>>>>> compatible with your changes.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Guozhang
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> On Tue, Oct 31, 2017 at 10:59 AM, Matt Farmer <
>> matt@frmr.me>
>> >>>>>>>>> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> I've opened this pull request to implement the KIP as
>> >>>>>>> currently
>> >>>>>>>>>>>> written:
>> >>>>>>>>>>>>>>> https://github.com/apache/kafka/pull/4165. It still needs
>> >>>>>>> some
>> >>>>>>>>>> tests
>> >>>>>>>>>>>>>>> added,
>> >>>>>>>>>>>>>>> but largely represents the shape I was going for.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> If there are more points that folks would like to discuss,
>> >>>>>>>> please
>> >>>>>>>>>> let
>> >>>>>>>>>>>> me
>> >>>>>>>>>>>>>>> know. If I don't hear anything by tomorrow afternoon I'll
>> >>>>>>>> probably
>> >>>>>>>>>>>> start
>> >>>>>>>>>>>>>> a
>> >>>>>>>>>>>>>>> [VOTE] thread.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Thanks,
>> >>>>>>>>>>>>>>> Matt
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:33 PM Matt Farmer <matt@frmr.me
>> >
>> >>>>>>>> wrote:
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> I can’t think of a reason that would be problematic.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Most of the time I would write a handler like this, I
>> either
>> >>>>>>>> want
>> >>>>>>>>>> to
>> >>>>>>>>>>>>>>>> ignore the error or fail and bring everything down so
>> that I
>> >>>>>>>> can
>> >>>>>>>>>> spin
>> >>>>>>>>>>>>>> it
>> >>>>>>>>>>>>>>>> back up later and resume from earlier offsets. When we
>> start
>> >>>>>>> up
>> >>>>>>>>>> after
>> >>>>>>>>>>>>>>>> crashing we’ll eventually try to process the message we
>> >>>>>>> failed
>> >>>>>>>> to
>> >>>>>>>>>>>>>> produce
>> >>>>>>>>>>>>>>>> again.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> I’m concerned that “putting in a queue for later” opens
>> you
>> >>>>>>> up
>> >>>>>>>> to
>> >>>>>>>>>>>>>> putting
>> >>>>>>>>>>>>>>>> messages into the destination topic in an unexpected
>> order.
>> >>>>>>>>> However
>> >>>>>>>>>>>> if
>> >>>>>>>>>>>>>>>> others feel differently, I’m happy to talk about it.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> On Fri, Oct 27, 2017 at 7:10 PM Guozhang Wang <
>> >>>>>>>>> wangguoz@gmail.com>
>> >>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is
>> >>>>>>> that
>> >>>>>>>>> the
>> >>>>>>>>>>>>>>> record
>> >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while
>> >>>>>>> trying
>> >>>>>>>>> to
>> >>>>>>>>>>>>>>>>> produce.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> That is right. Thanks.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> I looked at the example code, and one thing I realized
>> that
>> >>>>>>>>> since
>> >>>>>>>>>> we
>> >>>>>>>>>>>>>> are
>> >>>>>>>>>>>>>>>>> not passing the context in the handle function, we may
>> not
>> >>>>>>> be
>> >>>>>>>>>>>>>> implement
>> >>>>>>>>>>>>>>>>> the
>> >>>>>>>>>>>>>>>>> logic to send the fail records into another queue for
>> >>>> future
>> >>>>>>>>>>>>>> processing.
>> >>>>>>>>>>>>>>>>> Would people think that would be a big issue?
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Guozhang
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> On Thu, Oct 26, 2017 at 12:14 PM, Matt Farmer <
>> >>>> matt@frmr.me
>> >>>>>>>>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Hello all,
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I've updated the KIP based on this conversation, and
>> made
>> >>>>>>> it
>> >>>>>>>> so
>> >>>>>>>>>>>> that
>> >>>>>>>>>>>>>>> its
>> >>>>>>>>>>>>>>>>>> interface, config setting, and parameters line up more
>> >>>>>>>> closely
>> >>>>>>>>>> with
>> >>>>>>>>>>>>>>> the
>> >>>>>>>>>>>>>>>>>> interface in KIP-161 (deserialization handler).
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I believe there are a few specific questions I need to
>> >>>>>>> reply
>> >>>>>>>>>>>> to.....
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are
>> >>>> around
>> >>>>>>>> the
>> >>>>>>>>>>>>>>> record,
>> >>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be
>> >>>>>>>> generics
>> >>>>>>>>> of
>> >>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or
>> >>>>>>>>> `ProducerRecord<?
>> >>>>>>>>>>>>>>>>> extends
>> >>>>>>>>>>>>>>>>>>> Object, ? extends Object>`?
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> At this point in the code we're guaranteed that this
>> is a
>> >>>>>>>>>>>>>>>>>> ProducerRecord<byte[], byte[]>, so the generics would
>> just
>> >>>>>>>> make
>> >>>>>>>>>> it
>> >>>>>>>>>>>>>>>>> harder
>> >>>>>>>>>>>>>>>>>> to work with the key and value.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Also, should the handle function include the
>> >>>>>>>> `RecordMetadata`
>> >>>>>>>>> as
>> >>>>>>>>>>>>>>> well
>> >>>>>>>>>>>>>>>>> in
>> >>>>>>>>>>>>>>>>>>> case it is not null?
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Please correct me if I'm wrong, but my understanding is
>> >>>>>>> that
>> >>>>>>>>> the
>> >>>>>>>>>>>>>>> record
>> >>>>>>>>>>>>>>>>>> metadata is always null if an exception occurred while
>> >>>>>>> trying
>> >>>>>>>>> to
>> >>>>>>>>>>>>>>>>> produce.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> We may probably try to write down at least the
>> following
>> >>>>>>>>>> handling
>> >>>>>>>>>>>>>>>>> logic
>> >>>>>>>>>>>>>>>>>> and
>> >>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I've added some examples to the KIP. Let me know what
>> you
>> >>>>>>>>> think.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>>>>>>>> Matt
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 9:00 PM Matt Farmer <
>> matt@frmr.me
>> >>>>>
>> >>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Thanks for this feedback. I’m at a conference right
>> now
>> >>>>>>> and
>> >>>>>>>> am
>> >>>>>>>>>>>>>>>>> planning
>> >>>>>>>>>>>>>>>>>> on
>> >>>>>>>>>>>>>>>>>>> updating the KIP again with details from this
>> >>>> conversation
>> >>>>>>>>> later
>> >>>>>>>>>>>>>>> this
>> >>>>>>>>>>>>>>>>>> week.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> I’ll shoot you a more detailed response then! :)
>> >>>>>>>>>>>>>>>>>>> On Mon, Oct 23, 2017 at 8:16 PM Guozhang Wang <
>> >>>>>>>>>> wangguoz@gmail.com
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Thanks for the KIP Matt.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Regarding the handle interface of
>> >>>>>>>>>> ProductionExceptionHandlerResp
>> >>>>>>>>>>>>>>> onse,
>> >>>>>>>>>>>>>>>>>>>> could
>> >>>>>>>>>>>>>>>>>>>> you write it on the wiki also, along with the actual
>> >>>>>>> added
>> >>>>>>>>>> config
>> >>>>>>>>>>>>>>>>> names
>> >>>>>>>>>>>>>>>>>>>> (e.g. what
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >>>>>>>> 161%3A+streams+
>> >>>>>>>>>>>>>>>>>> deserialization+exception+handlers
>> >>>>>>>>>>>>>>>>>>>> described).
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> The question I had about then handle parameters are
>> >>>>>>> around
>> >>>>>>>>> the
>> >>>>>>>>>>>>>>>>> record,
>> >>>>>>>>>>>>>>>>>>>> should it be `ProducerRecord<byte[], byte[]>`, or be
>> >>>>>>>> generics
>> >>>>>>>>>> of
>> >>>>>>>>>>>>>>>>>>>> `ProducerRecord<? extends K, ? extends V>` or
>> >>>>>>>>> `ProducerRecord<?
>> >>>>>>>>>>>>>>>>> extends
>> >>>>>>>>>>>>>>>>>>>> Object, ? extends Object>`?
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Also, should the handle function include the
>> >>>>>>>> `RecordMetadata`
>> >>>>>>>>>> as
>> >>>>>>>>>>>>>>>>> well in
>> >>>>>>>>>>>>>>>>>>>> case it is not null?
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> We may probably try to write down at least the
>> following
>> >>>>>>>>>> handling
>> >>>>>>>>>>>>>>>>> logic
>> >>>>>>>>>>>>>>>>>>>> and
>> >>>>>>>>>>>>>>>>>>>> see if the given API is sufficient for it: 1) throw
>> >>>>>>>> exception
>> >>>>>>>>>>>>>>>>>> immediately
>> >>>>>>>>>>>>>>>>>>>> to fail fast and stop the world, 2) log the error and
>> >>>>>>> drop
>> >>>>>>>>>> record
>> >>>>>>>>>>>>>>> and
>> >>>>>>>>>>>>>>>>>>>> proceed silently, 3) send such errors to a specific
>> >>>>>>> "error"
>> >>>>>>>>>> Kafka
>> >>>>>>>>>>>>>>>>> topic,
>> >>>>>>>>>>>>>>>>>>>> or
>> >>>>>>>>>>>>>>>>>>>> record it as an app-level metrics (
>> >>>>>>>>>>>>>>>>>>>>
>> https://kafka.apache.org/documentation/#kafka_streams_
>> >>>>>>>>>> monitoring
>> >>>>>>>>>>>>>> )
>> >>>>>>>>>>>>>>>>> for
>> >>>>>>>>>>>>>>>>>>>> monitoring.
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> Guozhang
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 5:47 PM, Matt Farmer <
>> >>>>>>> matt@frmr.me
>> >>>>>>>>>
>> >>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> I did some more digging tonight.
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> @Ted: It looks like the deserialization handler uses
>> >>>>>>>>>>>>>>>>>>>>> "default.deserialization.exception.handler" for the
>> >>>>>>>> config
>> >>>>>>>>>>>>>>> name. No
>> >>>>>>>>>>>>>>>>>>>>> ".class" on the end. I'm inclined to think this
>> should
>> >>>>>>> use
>> >>>>>>>>>>>>>>>>>>>>> "default.production.exception.handler".
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> On Fri, Oct 20, 2017 at 8:22 PM Matt Farmer <
>> >>>>>>> matt@frmr.me
>> >>>>>>>>>
>> >>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> Okay, I've dug into this a little bit.
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> I think getting access to the serialized record is
>> >>>>>>>>> possible,
>> >>>>>>>>>>>>>>> and
>> >>>>>>>>>>>>>>>>>>>> changing
>> >>>>>>>>>>>>>>>>>>>>>> the naming and return type is certainly doable.
>> >>>>>>> However,
>> >>>>>>>>>>>>>>> because
>> >>>>>>>>>>>>>>>>>> we're
>> >>>>>>>>>>>>>>>>>>>>>> hooking into the onCompletion callback we have no
>> >>>>>>>> guarantee
>> >>>>>>>>>>>>>>> that
>> >>>>>>>>>>>>>>>>> the
>> >>>>>>>>>>>>>>>>>>>>>> ProcessorContext state hasn't changed by the time
>> this
>> >>>>>>>>>>>>>>> particular
>> >>>>>>>>>>>>>>>>>>>> handler
>> >>>>>>>>>>>>>>>>>>>>>> runs. So I think the signature would change to
>> >>>>>>> something
>> >>>>>>>>>>>>>> like:
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> ProductionExceptionHandlerResponse handle(final
>> >>>>>>>>>>>>>>>>> ProducerRecord<..>
>> >>>>>>>>>>>>>>>>>>>>> record,
>> >>>>>>>>>>>>>>>>>>>>>> final Exception exception)
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> Would this be acceptable?
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:33 PM Matt Farmer <
>> >>>>>>>> matt@frmr.me>
>> >>>>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>> Ah good idea. Hmmm. I can line up the naming and
>> >>>>>>> return
>> >>>>>>>>> type
>> >>>>>>>>>>>>>>> but
>> >>>>>>>>>>>>>>>>>> I’m
>> >>>>>>>>>>>>>>>>>>>> not
>> >>>>>>>>>>>>>>>>>>>>>>> sure if I can get my hands on the context and the
>> >>>>>>> record
>> >>>>>>>>>>>>>>> itself
>> >>>>>>>>>>>>>>>>>>>> without
>> >>>>>>>>>>>>>>>>>>>>>>> other changes.
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>> Let me dig in and follow up here tomorrow.
>> >>>>>>>>>>>>>>>>>>>>>>> On Thu, Oct 19, 2017 at 7:14 PM Matthias J. Sax <
>> >>>>>>>>>>>>>>>>>>>> matthias@confluent.io>
>> >>>>>>>>>>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> Thanks for the KIP.
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> Are you familiar with KIP-161?
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >>>>>>>> 161%3A+streams+
>> >>>>>>>>>>>>>>>>>>>>> deserialization+exception+handlers
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> I thinks, we should align the design (parameter
>> >>>>>>> naming,
>> >>>>>>>>>>>>>>> return
>> >>>>>>>>>>>>>>>>>>>> types,
>> >>>>>>>>>>>>>>>>>>>>>>>> class names etc) of KIP-210 to KIP-161 to get a
>> >>>>>>> unified
>> >>>>>>>>>>>>>> user
>> >>>>>>>>>>>>>>>>>>>>> experience.
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> -Matthias
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> On 10/18/17 4:20 PM, Matt Farmer wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>>> I’ll create the JIRA ticket.
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> I think that config name will work. I’ll update
>> the
>> >>>>>>>> KIP
>> >>>>>>>>>>>>>>>>>>>> accordingly.
>> >>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 6:09 PM Ted Yu <
>> >>>>>>>>>>>>>>> yuzhihong@gmail.com>
>> >>>>>>>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> Can you create JIRA that corresponds to the
>> KIP ?
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> For the new config, how about naming it
>> >>>>>>>>>>>>>>>>>>>>>>>>>> production.exception.processor.class
>> >>>>>>>>>>>>>>>>>>>>>>>>>> ? This way it is clear that class name should
>> be
>> >>>>>>>>>>>>>>> specified.
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> Cheers
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Oct 18, 2017 at 2:40 PM, Matt Farmer <
>> >>>>>>>>>>>>>>> matt@frmr.me>
>> >>>>>>>>>>>>>>>>>>>> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Hello everyone,
>> >>>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> This is the discussion thread for the KIP
>> that I
>> >>>>>>>> just
>> >>>>>>>>>>>>>>> filed
>> >>>>>>>>>>>>>>>>>>>> here:
>> >>>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> 210+-+Provide+for+custom+
>> >>>>>>>> error+handling++when+Kafka+
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Streams+fails+to+produce
>> >>>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Looking forward to getting some feedback from
>> >>>>>>> folks
>> >>>>>>>>>>>>>> about
>> >>>>>>>>>>>>>>>>> this
>> >>>>>>>>>>>>>>>>>>>> idea
>> >>>>>>>>>>>>>>>>>>>>>>>> and
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> working toward a solution we can contribute
>> back.
>> >>>>>>> :)
>> >>>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Cheers,
>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Matt Farmer
>> >>>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>>>>> -- Guozhang
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>>>>> -- Guozhang
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> --
>> >>>>>>>>>>>>>> -- Guozhang
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> --
>> >>>>>>>>> -- Guozhang
>> >>>>>>>>>
>> >>>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> --
>> >>>>>>> -- Guozhang
>> >>>>>>>
>> >>>>>>
>> >>>>>
>> >>>>
>> >>>>
>> >
>>
>>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message