kafka-dev mailing list archives

From "Matthias J. Sax" <matth...@confluent.io>
Subject Re: [DISCUSS] KIP-372: Naming Joins and Grouping
Date Thu, 13 Sep 2018 13:45:34 GMT
I don't know what Samza does, however, Flink requires users to specify
names similar to this proposal to be able to re-identify state in case
the topology gets altered between deployments.

Flink only has to worry about state. For Kafka Streams, it's
state plus repartition topics.


-Matthias
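
To illustrate why explicit names help re-identify state when a topology is altered, here is a minimal sketch. It is a simplified model and not Kafka Streams or Flink code: the `NamingSketch` class, the `Op` type, and the operator names are hypothetical, and the counter-based format is only loosely modeled on generated names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Simplified model (an assumption for illustration, not framework code):
// every operator takes the next value of a shared counter, and an unnamed
// operator embeds that value in its generated name.
final class NamingSketch {
    static final class Op {
        final String type;
        final String userName; // null when the user did not name the operator

        Op(String type, String userName) {
            this.type = type;
            this.userName = userName;
        }
    }

    // Returns the name each operator ends up with, in topology order.
    static List<String> assignNames(List<Op> ops) {
        List<String> names = new ArrayList<>();
        int counter = 0;
        for (Op op : ops) {
            String generated = String.format("%s-%010d", op.type, counter);
            counter++;
            names.add(op.userName != null ? op.userName : generated);
        }
        return names;
    }

    public static void main(String[] args) {
        // Deployment 1: an unnamed aggregate and a named join.
        List<Op> v1 = Arrays.asList(
                new Op("FILTER", null),
                new Op("AGGREGATE", null),
                new Op("JOIN", "orders-join"));
        // Deployment 2: the same program with one extra operator at the front.
        List<Op> v2 = Arrays.asList(
                new Op("MAP", null),
                new Op("FILTER", null),
                new Op("AGGREGATE", null),
                new Op("JOIN", "orders-join"));
        System.out.println(assignNames(v1));
        System.out.println(assignNames(v2));
    }
}
```

In this model the unnamed aggregate moves from `AGGREGATE-0000000001` to `AGGREGATE-0000000002` once an operator is added upstream, so anything keyed by that name (state, or in Kafka Streams also a repartition topic) would no longer be found, while the explicitly named join keeps `orders-join` in both deployments.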

On 9/13/18 1:48 AM, Eno Thereska wrote:
> Hi folks,
> 
> I know we don't normally have a "Related work" section in KIPs, but
> sometimes I find it useful to see what others have done in similar cases.
> Since this will be important for rolling re-deployments, I wonder what
> other frameworks like Flink (or Samza) have done in these cases. Perhaps
> they have done nothing, in which case it's fine to do this from first
> principles, but IMO it would be good to know just to make sure we're
> heading in the right direction.
> 
> Also I don't get a good feel for how much work this will be for an end user
> who is doing the rolling deployment; perhaps an end-to-end example would
> help.
> 
> Thanks
> Eno
> 
> On Thu, Sep 13, 2018 at 6:22 AM, Matthias J. Sax <matthias@confluent.io>
> wrote:
> 
>> Follow up comments:
>>
>> 1) We should either use `[app-id]-this|other-[join-name]-repartition` or
>> `[app-id]-[join-name]-left|right-repartition`, but we should not change
>> the pattern depending on whether the user specifies a name or not. I am
>> fine with both patterns; I just want to make sure we stick with one.
>>
>> 2) I didn't see why we would need to do this in this KIP. KIP-307 seems
>> to be orthogonal, and thus KIP-372 should not change any processor
>> names, but KIP-307 should define a holistic strategy for all processors.
>> Otherwise, we might end up with different strategies or revert what we
>> decide in this KIP if it's not compatible with KIP-307.
>>
>>
>> -Matthias
>>
>>
>> On 9/12/18 6:28 PM, Guozhang Wang wrote:
>>> Hello Bill,
>>>
>>> I made a pass over your proposal and here are some questions:
>>>
>>> 1. For Joined names, the current proposal is to define the repartition
>>> topic names as
>>>
>>> * [app-id]-this-[join-name]-repartition
>>>
>>> * [app-id]-other-[join-name]-repartition
>>>
>>>
>>> And if [join-name] is not specified, the names stay the same, which is:
>>>
>>> * [previous-processor-name]-repartition for both Stream-Stream (S-S) join
>>> and S-T join
>>>
>>> I think it is more natural to rename it to
>>>
>>> * [app-id]-[join-name]-left-repartition
>>>
>>> * [app-id]-[join-name]-right-repartition
>>>
>>>
>>> 2. I'd suggest using the name to also define the corresponding processor
>>> names accordingly, in addition to the repartition topic names. Note that
>>> for joins, this may overlap with KIP-307
>>> <https://cwiki.apache.org/confluence/display/KAFKA/KIP-307%3A+Allow+to+define+custom+processor+names+with+KStreams+DSL>,
>>> as it also has proposals for defining processor names for join operators.
>>>
>>> 3. Could you also specify how this would affect the optimization for
>>> merging multiple repartition topics?
>>>
>>> 4. In the "Compatibility, Deprecation, and Migration Plan" section, could
>>> you also mention the following scenarios, if any of the upgrade paths
>>> would be changed:
>>>
>>>  a) changing user DSL code: under which scenarios users can now do a
>>> rolling bounce instead of resetting applications.
>>>
>>>  b) upgrading from an older version to a new version, with all the names
>>> specified, and with optimization turned on. E.g. say we have the code
>>> written in 2.1 with all names specified, and now upgrading to 2.2 with
>>> new optimizations that may potentially change the repartition topics. Is
>>> that always safe to do?
>>>
>>>
>>>
>>> Guozhang
>>>
>>>
>>> On Wed, Sep 12, 2018 at 4:52 PM, Bill Bejeck <bbejeck@gmail.com> wrote:
>>>
>>>> All, I'd like to start a discussion on KIP-372 for the naming of joins
>>>> and grouping operations in Kafka Streams.
>>>>
>>>> The KIP page can be found here:
>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-372%3A+Naming+Joins+and+Grouping
>>>>
>>>> I look forward to feedback and comments.
>>>>
>>>> Thanks,
>>>> Bill
>>>>
>>>
>>>
>>>
>>
>>
> 
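
For reference, the two candidate repartition topic naming patterns discussed in the thread can be written out as a small sketch. The `RepartitionTopicNames` class, its method names, and the sample values `my-app` and `orders-join` are hypothetical and only illustrate the string patterns; this is not the Kafka Streams implementation.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper that spells out the two patterns from the thread.
final class RepartitionTopicNames {
    // Pattern from the KIP draft: [app-id]-this|other-[join-name]-repartition
    static List<String> thisOther(String appId, String joinName) {
        return Arrays.asList(
                appId + "-this-" + joinName + "-repartition",
                appId + "-other-" + joinName + "-repartition");
    }

    // Alternative suggested in review: [app-id]-[join-name]-left|right-repartition
    static List<String> leftRight(String appId, String joinName) {
        return Arrays.asList(
                appId + "-" + joinName + "-left-repartition",
                appId + "-" + joinName + "-right-repartition");
    }

    public static void main(String[] args) {
        // Sample values are assumptions chosen for illustration.
        System.out.println(thisOther("my-app", "orders-join"));
        System.out.println(leftRight("my-app", "orders-join"));
    }
}
```

Either way, both sides of a join derive their topic names from the application id and the user-supplied join name alone, which is what allows the names to stay stable when other parts of the topology change.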

