kafka-dev mailing list archives

From Jan Filipiak <Jan.Filip...@trivago.com>
Subject Re: [DISCUSS] KIP-221: Repartition Topic Hints in Streams
Date Mon, 06 Nov 2017 20:18:20 GMT
Sorry for not being 100% up to date.
Back then we had the discussion that when an operation puts a >Sink<
into the topology, a >Produced< parameter is added. This Produced
parameter could be marked internal or external. If internal, I think
the name would still make a great suffix for the topic name.

Is this plan still around? Otherwise, having the name as a suffix is
probably always good: it helps users more quickly identify hot topics
that need more partitions when they have many of these internal
repartition topics.
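
To make the suffix idea concrete, here is a minimal sketch in plain Java. It assumes a user-supplied name is placed between the `<application.id>-` prefix and the `-repartition` suffix mentioned further down the thread for the reset tool. The class `InternalTopicNames`, both of its methods, and the sample names are hypothetical and are not part of Kafka Streams.

```java
// Hypothetical helper for illustration only; not a Kafka Streams API.
final class InternalTopicNames {
    private InternalTopicNames() {}

    // Assumed layout: <application.id>-<name>-repartition
    static String repartitionTopic(String applicationId, String name) {
        return applicationId + "-" + name + "-repartition";
    }

    // Rule quoted in the thread: internal topics start with
    // "<application.id>-" and end with "-repartition" or "-changelog".
    static boolean isInternal(String applicationId, String topic) {
        return topic.startsWith(applicationId + "-")
                && (topic.endsWith("-repartition") || topic.endsWith("-changelog"));
    }

    public static void main(String[] args) {
        String appId = "orders-app"; // sample application.id
        String[] topics = {
            repartitionTopic(appId, "by-customer"), // named internal topic
            "user-topic"                            // manually created topic
        };
        for (String topic : topics) {
            System.out.println(topic + " internal=" + isInternal(appId, topic));
        }
    }
}
```

Under these assumptions a named repartition topic would still match the prefix/suffix rule, so it could keep being treated as internal while carrying a readable name.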

Best Jan


On 06.11.2017 20:13, Matthias J. Sax wrote:
> I absolutely agree with what you say. It's not a requirement to specify a
> topic name -- and this was the idea -- if a user does specify a name, we
> treat it as is -- if a user does not specify a name, Streams creates an
> internal topic.
>
> The goal of the Jira is to allow a simplified way to control
> repartitioning (atm, user needs to manually create a topic and use via
> through()).
>
> Thus, the idea is to make the topic name parameter of through optional.
>
> It's of course just an idea. Happy to have another API design. The goal
> was to avoid too many new overloads.
>
>>> Could you clarify exactly what you mean by keeping the current distinction?
> Current distinction is: user topics are created manually and user
> specifies the name -- internal topics are created by Kafka Streams and
> a name is generated automatically.
>
> -> through("user-topic")
> -> through(TopicConfig.withNumberOfPartitions(5)) // Streams creates an internal topic
>
>
> -Matthias
>
>
> On 11/6/17 6:56 PM, Thomas Becker wrote:
>> Could you clarify exactly what you mean by keeping the current distinction?
>>
>> Actually, re-reading the KIP and JIRA, it's not clear that being able to
>> specify a custom name is actually a requirement. If the goal is to control
>> repartitioning and tune parallelism, maybe we can just sidestep this issue
>> altogether by removing the ability to set a different name.
>>
>> On Mon, 2017-11-06 at 16:51 +0100, Matthias J. Sax wrote:
>>
>> That's a good point. In the current design, we strictly distinguish both.
>> For example, the reset tool deletes internal topics (starting with
>> prefix `<application.id>-` and ending with either `-repartition` or
>> `-changelog`).
>>
>> Thus, from my point of view, it would make sense to keep the current
>> distinction.
>>
>> -Matthias
>>
>> On 11/6/17 4:45 PM, Thomas Becker wrote:
>>
>>
>> I think this sounds good as well. It's worth clarifying whether topics that
>> are named by the user but created by Streams are considered "internal"
>> topics also.
>>
>> On Sun, 2017-11-05 at 23:02 +0100, Matthias J. Sax wrote:
>>
>> My idea was to relax the requirement for through() that a topic must be
>> created manually before startup.
>>
>> Thus, if no through() call is made, an (internal) topic is created the
>> same way we do it currently.
>>
>> If one uses `through(String topicName)` we keep the current behavior and
>> require users to create the topic manually.
>>
>> The reasoning is as follows: if a user creates a topic manually, a user
>> can just use it for repartitioning. As the topic is already there, there
>> is no need to specify any topic configs.
>>
>> We add a new `through()` overload (details TBD) that allows users to
>> specify topic configs, and Streams creates the topic with those configs.
>>
>> Reasoning: users don't want to manage the topic manually; thus, it's still
>> an internal topic and Streams creates the topic name automatically as for
>> all other internal topics. However, users get some more control over
>> topic parameters like the number of partitions (we should discuss what
>> other configs would be useful).
>>
>>
>> Does this make sense?
>>
>>
>> -Matthias
>>
>>
>> On 11/5/17 1:21 AM, Jan Filipiak wrote:
>>
>>
>> Hi.
>>
>>
>> I'm not 100% up to date on what the version 1.0 DSL looks like ATM.
>> I would just argue that repartitioning should be its own API call, like
>> through or something.
>> One can already use through or to to get this. I would argue one should
>> look there instead of adding overloads.
>>
>> Best Jan
>>
>> On 04.11.2017 16:01, Jeyhun Karimov wrote:
>>
>>
>> Dear community,
>>
>> I would like to initiate discussion on KIP-221 [1] based on issue [2].
>> Please feel free to comment.
>>
>> [1]
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-221%3A+Repartition+Topic+Hints+in+Streams
>>
>> [2] https://issues.apache.org/jira/browse/KAFKA-6037
>>
>>
>>
>> Cheers,
>> Jeyhun
>>
>>
>>
>>
>>
>>
>>
>>
>>
>> ________________________________
>>
>> This email and any attachments may contain confidential and privileged material for
the sole use of the intended recipient. Any review, copying, or distribution of this email
(or any attachments) by others is prohibited. If you are not the intended recipient, please
contact the sender immediately and permanently delete this email and any attachments. No employee
or agent of TiVo Inc. is authorized to conclude any binding agreement on behalf of TiVo Inc.
by email. Binding agreements with TiVo Inc. may only be made by a signed written agreement.

