kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tom Bentley <t.j.bent...@gmail.com>
Subject Re: [DISCUSS] KIP-179: Change ReassignPartitionsCommand to use AdminClient
Date Tue, 01 Aug 2017 09:19:22 GMT
I have added the timeout I mentioned before, because it makes the
implementation of topic alteration more symmetric with the topic creation
APIs.

I have also added a section ("Policy") on retrofitting the
CreateTopicPolicy's rules to topic alteration requests, and made a few
other minor fixes.

I've still not factored out the topic name from the ReplicaStatusRequest
(Ismael, do you have an opinion about that?)

If any one else has some feedback on this KIP that would be great.
Otherwise, I would like to start a vote on this KIP before the end of the
week.

Thanks,

Tom

On 26 July 2017 at 11:45, Tom Bentley <t.j.bentley@gmail.com> wrote:

> I've updated the KIP to fix those niggles, but I've not factored out the
> topic name from the ReplicaStatusRequest, yet.
>
> Looking at the topic creation APIs in more detail, the CreateTopicsOptions
> has
>
> * `shouldValidateOnly()`, which would make a lot of sense for the alter
> topic APIs
> * `timeoutMs()`, which I'm not sure sure about...
>
> Topic creation doesn't require shifting replicas between brokers so it's
> reasonable to support timeout, because we don't expect it to take very long.
>
> Topic alteration usually takes a while because we are going to have to
> move replicas. Since we're adding a whole API to track the progress of that
> replication, I'm inclined to think that having a timeout is a bit pointless.
>
> But should the replicaStatus() API have a timeout? I suppose it probably
> should.
>
>
> On 26 July 2017 at 10:58, Tom Bentley <t.j.bentley@gmail.com> wrote:
>
>> Thanks Paolo,
>>
>>   *   in the "Public Interfaces" section you wrote
>>> alterTopics(Set<AlterTopics>) but then a collection is used (instead of
a
>>> set) in the Proposed Changes section. I'm ok with collection.
>>>
>>
>> Agree it should be Collection.
>>
>>
>>>   *   in the summary of the alterTopics method you say "The request can
>>> change the number of partitions, replication factor and/or the partition
>>> assignments." I think that the "and/or" is misleading (at least for me).
>>> For the TopicCommand tool you can specify partitions AND replication factor
>>> ... OR partition assignments (but not for example partitions AND
>>> replication factor AND partition assignments). Maybe it could be "The
>>> request can change the number of partitions and the related replication
>>> factor or specifying a partition assignments."
>>>
>>
>> Is there a reason why we can't support changing all three at once? It
>> certainly makes conceptual sense to, say, increase the number of partitions
>> and replication factor, and specify how you want the partitions assigned.
>> And doing two separate calls would be less efficient as we sync new
>> replicas on brokers only to then move them somewhere else.
>>
>> If there is a reason we don't want to support changing all three, then we
>> can return the error INVALID_REQUEST (42). That would allow us to support
>> changing all three at some time in the future, without having to change the
>> API.
>>
>>
>>>   *   I know that it would be a breaking change in the Admin Client API
>>> but why having an AlteredTopic class which is quite similar to the already
>>> existing NewTopic class ? I know that using NewTopic for the alter method
>>> could be misleading. What about renaming NewTopic in something like
>>> AdminTopic ? At same time it means that the TopicDetails class (which you
>>> can get from the current NewTopic) should be outside the
>>> CreateTopicsRequest because it could be used in the AlterTopicsRequest as
>>> well.
>>>
>>
>> One problem with this is it tends to inhibit future API changes for
>> either newTopics() or alterTopics(), because any common class needs to make
>> sense in both contexts.
>>
>> For createTopics() we get to specify some configs (the
>> Map<String,String>), but since the AdminClient already has alterConfigs()
>> for changing topic configs after topic creation I don't think it's right to
>> also support changing those configs via alterTopics() as well. But having
>> them in a common AdminTopic class would suggest that that was supported.
>> Yes, alterTopics could return INVALID_REQUEST if it was given topic
>> configs, but this is just making the API harder to use since it is
>> weakening of the type safety of the API.
>>
>> I suppose we *could* write a common TopicDetails class and make the
>> existing nested one extend the common one, with deprecations, to eventually
>> remove the nested one.
>>
>>
>>
>>>   *   A typo in the ReplicaStatus : gpartition() instead of partition()
>>>   *   In the AlterTopicRequets
>>>      *   the replication factor should be INT16
>>>
>>
>> Ah, thanks!
>>
>>
>>>      *   I would use same fields name as CreateTopicsRequest (they are
>>> quite similar)
>>>   *   What's the broker id in the ReplicaStatusRequest ?
>>>
>>
>> It's the broker, which is expected to have a replica of the given
>> partition, that we're querying the status of. It is necessary because we're
>> asking the _leader_ for the partition about (a subset of) the status of the
>> followers. Put another way, to identify the replica of a particular
>> partition on a particular broker we need the tuple (topic, partition,
>> broker).
>>
>> If we were always interested in the status of the partition across all
>> brokers we could omit the broker part. But for reassignment we actually
>> only care about a subset of the brokers.
>>
>>
>>>   *   Thinking aloud, could make sense having "Partition" in the
>>> ReplicaStatusRequest as an array so that I can specify in only one request
>>> the status for partitions I'm interested in, in order to avoid e request
>>> for each partition for the same topic. Maybe empty array could mean ..
>>> "give me status for all partitions of this topic". Of course it means that
>>> the ReplicaStatusResponse should change because we should have an array
>>> with partition, broker, lag and so on
>>>
>>
>> You already can specify in one request the status for all the partitions
>> you're interested in (ReplicaStatus can be repeated/is an array field).
>>
>> We could factor out the topic to avoid repeating it, which would be more
>> efficient when we're querying the status of many partitions of a topic
>> and/or there are many brokers holding replicas. In other words, we could
>> factor it to look like this:
>>
>> ReplicaStatusRequest => [TopicReplicas]
>>   TopicReplicas => Topic [PartitionReplica]
>>     Topic => string
>>     PartitionReplica => Partition Broker
>>       Partition => int32
>>       Broker => int32
>>
>> Does this make sense to you?
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message