kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-93: Improve invalid timestamp handling in Kafka Streams
Date Thu, 01 Dec 2016 00:28:14 GMT
Thanks for this clarification (and your +1)

I completely agree and just want to add my thoughts:

1. Yes, it is a backward compatible change but as I discusses with
others, we want accept this for now. All previous releases did contain
non-compatible changes for Kafka Streams, too. And as Kafka Streams API
is not guaranteed to be stable at this point, we better do breaking
changes now than later.

2. At some point, we need to be more conservative with breaking chances
and only allow them for major releases.

3. As we expect that most people do use default timestamp extractor,
they will not be effected anyway. Only if custom extractors are used,
the application needs to be recompiled. Thus, the effort to make the
change backward compatible seems not to be worth the effort.


-Matthias

On 11/29/16 9:57 PM, Ewen Cheslack-Postava wrote:
> I think this looks reasonable, but just a more general note on
> compatibility -- I think it's worth trying to clarify what types of
> compatibility we're trying to achieve. Guozhang's 1 & 2 give an important
> breakdown (compile vs runtime compatibility). For the type of change
> described here, I think it makes sense to clarify the compatibility goals.
> The (pure) compile time compatibility vs (pure) runtime compatibility
> aren't the only options -- you have some additional intermediate choices as
> well.
> 
> The proposal describes a change which requires recompiling the plugin
> (TimestampExtractor) *and* substituting a runtime library (kafka-streams
> jar) to get correct updated behavior. This can get interesting if you
> already have N streams apps sharing the same TimestampExtractor. You now
> *must* update all of them to the new streams jar if any are to be updated
> for the new TimestampExtractor API. For folks with a monolithic
> repo/versioning setup, this could potentially be painful since they're
> forced to update all apps at once. It's at least not too bad since it can
> be isolated to a single commit (without deployment needing to be
> coordinated, for example), but if the # of apps gets > 4 or 5, these types
> of updates start to be a real pain.
> 
> I think this API change is an acceptable (albeit annoying) API
> incompatibility right now, but wanted to raise this in the discussion of
> this KIP so we consider this moving forward. There definitely are
> alternatives that add the new functionality but maintain compatibility
> better. In particular, it's possible to define the new interface to require
> both APIs:
> 
> // new interface
> public interface TimestampExtractor {
>     long extract(ConsumerRecord<Object, Object> record);
>     long extract(ConsumerRecord<Object, Object> record, long
> previousTimestamp);
> }
> 
> which requires more effort for the implementor of the new API, but
> maintains compatibility if you want to use a new jar including the
> TimestampExtractor even with the old version of streams/the
> TimestampExtractor interface (since it will correctly dispatch to the old
> implementation). It requires more effort on the part of the framework since
> it needs to catch runtime exceptions when the second version of extract()
> is missing and fall back to the first version. But in some cases that might
> be warranted for the sake of compatibility.
> 
> I suspect this update won't cause too much pain right now just because the
> number of streams app any user has won't be too large quite yet, but this
> seems important to consider moving forward. I think we had some similar
> concerns & discussion around the changes to the consumer APIs when trying
> to generalize the collection types used in those APIs.
> 
> -Ewen
> 
> 
> On Mon, Nov 28, 2016 at 10:46 AM, Matthias J. Sax <matthias@confluent.io>
> wrote:
> 
>> Done.
>>
>> If there is no further comments, I would like to start a voting thread
>> in a separate email.
>>
>> -Matthias
>>
>> On 11/28/16 9:08 AM, Guozhang Wang wrote:
>>> Yes it does not include these, again in my previous previous email I
>> meant
>>> when you say "This is a breaking, incompatible change" people may
>> interpret
>>> it differently. So better explain it more clearly.
>>>
>>>
>>> Guozhang
>>>
>>> On Thu, Nov 24, 2016 at 10:31 PM, Matthias J. Sax <matthias@confluent.io
>>>
>>> wrote:
>>>
>>>> That does make sense. But KIP-93 does not change anything like this, so
>>>> there is nothing to mention, IMHO.
>>>>
>>>> Or do you mean, the KIP should include that the change is backward
>>>> compatible with this regard?
>>>>
>>>> -Matthias
>>>>
>>>>
>>>>
>>>> On 11/24/16 5:31 PM, Guozhang Wang wrote:
>>>>> What I meant is that, for some changes (e.g. say we change the
>>>>> auto-repartition behavior that caused using different name conventions,
>>>> or
>>>>> some changes that involve changing the underlying state store names,
>> etc)
>>>>> the existing internal state including the stores and topics will
>> probably
>>>>> not valid. Some users consider this also as a "backward incompatible
>>>>> change" since they cannot just swipe in the new jar and restart.
>>>>>
>>>>>
>>>>> Guozhang
>>>>>
>>>>>
>>>>> On Wed, Nov 23, 2016 at 3:20 PM, Matthias J. Sax <
>> matthias@confluent.io>
>>>>> wrote:
>>>>>
>>>>>> Thanks for the feedback. I updated the KIP for (1) and (2).
>>>>>>
>>>>>> However not for (3): Why should it be required to reset an
>> application?
>>>>>> If user processed "good" data with valid timestamps, behavior does
not
>>>>>> change. If user tried to process "bad" data with invalid timestamps,
>> the
>>>>>> application does fail currently anyway, so there is nothing to reset.
>>>>>>
>>>>>>
>>>>>> -Matthias
>>>>>>
>>>>>> On 11/22/16 9:53 AM, Guozhang Wang wrote:
>>>>>>> Regarding the "compatibility" section, I would suggest being
a bit
>> more
>>>>>>> specific about why it is a breaking change. For Streams, it could
>> mean
>>>>>>> different things:
>>>>>>>
>>>>>>> 1. User need code change when switching library dependency on
the new
>>>>>>> version, otherwise it won't compile(I think this is the case
for this
>>>>>> KIP).
>>>>>>> 2. User need code change when switching library dependency on
the new
>>>>>>> version, otherwise runtime exception will be thrown.
>>>>>>> 3. Existing application state as well as internal topics need
to be
>>>>>> swiped
>>>>>>> and the program need to restart from zero.
>>>>>>>
>>>>>>>
>>>>>>> Guozhang
>>>>>>>
>>>>>>> On Fri, Nov 18, 2016 at 12:27 PM, Matthias J. Sax <
>>>> matthias@confluent.io
>>>>>>>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi all,
>>>>>>>>
>>>>>>>> I want to start a discussion about KIP-93:
>>>>>>>>
>>>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
>>>>>>>> 93%3A+Improve+invalid+timestamp+handling+in+Kafka+Streams
>>>>>>>>
>>>>>>>> Looking forward to your feedback.
>>>>>>>>
>>>>>>>>
>>>>>>>> -Matthias
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
> 
> 


Mime
View raw message