kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Guozhang Wang <wangg...@gmail.com>
Subject Re: [DISCUSS] KIP-93: Improve invalid timestamp handling in Kafka Streams
Date Fri, 02 Dec 2016 01:07:44 GMT
You mean "it is a backward incompatible change" right?

On Wed, Nov 30, 2016 at 4:28 PM, Matthias J. Sax <matthias@confluent.io>
wrote:

> Thanks for this clarification (and your +1)
>
> I completely agree and just want to add my thoughts:
>
> 1. Yes, it is a backward compatible change but as I discusses with
> others, we want accept this for now. All previous releases did contain
> non-compatible changes for Kafka Streams, too. And as Kafka Streams API
> is not guaranteed to be stable at this point, we better do breaking
> changes now than later.
>
> 2. At some point, we need to be more conservative with breaking chances
> and only allow them for major releases.
>
> 3. As we expect that most people do use default timestamp extractor,
> they will not be effected anyway. Only if custom extractors are used,
> the application needs to be recompiled. Thus, the effort to make the
> change backward compatible seems not to be worth the effort.
>
>
> -Matthias
>
> On 11/29/16 9:57 PM, Ewen Cheslack-Postava wrote:
> > I think this looks reasonable, but just a more general note on
> > compatibility -- I think it's worth trying to clarify what types of
> > compatibility we're trying to achieve. Guozhang's 1 & 2 give an important
> > breakdown (compile vs runtime compatibility). For the type of change
> > described here, I think it makes sense to clarify the compatibility
> goals.
> > The (pure) compile time compatibility vs (pure) runtime compatibility
> > aren't the only options -- you have some additional intermediate choices
> as
> > well.
> >
> > The proposal describes a change which requires recompiling the plugin
> > (TimestampExtractor) *and* substituting a runtime library (kafka-streams
> > jar) to get correct updated behavior. This can get interesting if you
> > already have N streams apps sharing the same TimestampExtractor. You now
> > *must* update all of them to the new streams jar if any are to be updated
> > for the new TimestampExtractor API. For folks with a monolithic
> > repo/versioning setup, this could potentially be painful since they're
> > forced to update all apps at once. It's at least not too bad since it can
> > be isolated to a single commit (without deployment needing to be
> > coordinated, for example), but if the # of apps gets > 4 or 5, these
> types
> > of updates start to be a real pain.
> >
> > I think this API change is an acceptable (albeit annoying) API
> > incompatibility right now, but wanted to raise this in the discussion of
> > this KIP so we consider this moving forward. There definitely are
> > alternatives that add the new functionality but maintain compatibility
> > better. In particular, it's possible to define the new interface to
> require
> > both APIs:
> >
> > // new interface
> > public interface TimestampExtractor {
> >     long extract(ConsumerRecord<Object, Object> record);
> >     long extract(ConsumerRecord<Object, Object> record, long
> > previousTimestamp);
> > }
> >
> > which requires more effort for the implementor of the new API, but
> > maintains compatibility if you want to use a new jar including the
> > TimestampExtractor even with the old version of streams/the
> > TimestampExtractor interface (since it will correctly dispatch to the old
> > implementation). It requires more effort on the part of the framework
> since
> > it needs to catch runtime exceptions when the second version of extract()
> > is missing and fall back to the first version. But in some cases that
> might
> > be warranted for the sake of compatibility.
> >
> > I suspect this update won't cause too much pain right now just because
> the
> > number of streams app any user has won't be too large quite yet, but this
> > seems important to consider moving forward. I think we had some similar
> > concerns & discussion around the changes to the consumer APIs when trying
> > to generalize the collection types used in those APIs.
> >
> > -Ewen
> >
> >
> > On Mon, Nov 28, 2016 at 10:46 AM, Matthias J. Sax <matthias@confluent.io
> >
> > wrote:
> >
> >> Done.
> >>
> >> If there is no further comments, I would like to start a voting thread
> >> in a separate email.
> >>
> >> -Matthias
> >>
> >> On 11/28/16 9:08 AM, Guozhang Wang wrote:
> >>> Yes it does not include these, again in my previous previous email I
> >> meant
> >>> when you say "This is a breaking, incompatible change" people may
> >> interpret
> >>> it differently. So better explain it more clearly.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Thu, Nov 24, 2016 at 10:31 PM, Matthias J. Sax <
> matthias@confluent.io
> >>>
> >>> wrote:
> >>>
> >>>> That does make sense. But KIP-93 does not change anything like this,
> so
> >>>> there is nothing to mention, IMHO.
> >>>>
> >>>> Or do you mean, the KIP should include that the change is backward
> >>>> compatible with this regard?
> >>>>
> >>>> -Matthias
> >>>>
> >>>>
> >>>>
> >>>> On 11/24/16 5:31 PM, Guozhang Wang wrote:
> >>>>> What I meant is that, for some changes (e.g. say we change the
> >>>>> auto-repartition behavior that caused using different name
> conventions,
> >>>> or
> >>>>> some changes that involve changing the underlying state store names,
> >> etc)
> >>>>> the existing internal state including the stores and topics will
> >> probably
> >>>>> not valid. Some users consider this also as a "backward incompatible
> >>>>> change" since they cannot just swipe in the new jar and restart.
> >>>>>
> >>>>>
> >>>>> Guozhang
> >>>>>
> >>>>>
> >>>>> On Wed, Nov 23, 2016 at 3:20 PM, Matthias J. Sax <
> >> matthias@confluent.io>
> >>>>> wrote:
> >>>>>
> >>>>>> Thanks for the feedback. I updated the KIP for (1) and (2).
> >>>>>>
> >>>>>> However not for (3): Why should it be required to reset an
> >> application?
> >>>>>> If user processed "good" data with valid timestamps, behavior
does
> not
> >>>>>> change. If user tried to process "bad" data with invalid timestamps,
> >> the
> >>>>>> application does fail currently anyway, so there is nothing
to
> reset.
> >>>>>>
> >>>>>>
> >>>>>> -Matthias
> >>>>>>
> >>>>>> On 11/22/16 9:53 AM, Guozhang Wang wrote:
> >>>>>>> Regarding the "compatibility" section, I would suggest being
a bit
> >> more
> >>>>>>> specific about why it is a breaking change. For Streams,
it could
> >> mean
> >>>>>>> different things:
> >>>>>>>
> >>>>>>> 1. User need code change when switching library dependency
on the
> new
> >>>>>>> version, otherwise it won't compile(I think this is the
case for
> this
> >>>>>> KIP).
> >>>>>>> 2. User need code change when switching library dependency
on the
> new
> >>>>>>> version, otherwise runtime exception will be thrown.
> >>>>>>> 3. Existing application state as well as internal topics
need to be
> >>>>>> swiped
> >>>>>>> and the program need to restart from zero.
> >>>>>>>
> >>>>>>>
> >>>>>>> Guozhang
> >>>>>>>
> >>>>>>> On Fri, Nov 18, 2016 at 12:27 PM, Matthias J. Sax <
> >>>> matthias@confluent.io
> >>>>>>>
> >>>>>>> wrote:
> >>>>>>>
> >>>>>>>> Hi all,
> >>>>>>>>
> >>>>>>>> I want to start a discussion about KIP-93:
> >>>>>>>>
> >>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>>>> 93%3A+Improve+invalid+timestamp+handling+in+Kafka+Streams
> >>>>>>>>
> >>>>>>>> Looking forward to your feedback.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> -Matthias
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>>
> >>
> >>
> >
> >
>
>


-- 
-- Guozhang

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message