Return-Path: X-Original-To: archive-asf-public-internal@cust-asf2.ponee.io Delivered-To: archive-asf-public-internal@cust-asf2.ponee.io Received: from cust-asf.ponee.io (cust-asf.ponee.io [163.172.22.183]) by cust-asf2.ponee.io (Postfix) with ESMTP id DD9BE200CD7 for ; Tue, 1 Aug 2017 11:19:35 +0200 (CEST) Received: by cust-asf.ponee.io (Postfix) id DC09D166C34; Tue, 1 Aug 2017 09:19:35 +0000 (UTC) Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by cust-asf.ponee.io (Postfix) with SMTP id 06639166C33 for ; Tue, 1 Aug 2017 11:19:34 +0200 (CEST) Received: (qmail 24670 invoked by uid 500); 1 Aug 2017 09:19:34 -0000 Mailing-List: contact dev-help@kafka.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@kafka.apache.org Delivered-To: mailing list dev@kafka.apache.org Received: (qmail 24658 invoked by uid 99); 1 Aug 2017 09:19:33 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 01 Aug 2017 09:19:33 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id 3CC7AC0221 for ; Tue, 1 Aug 2017 09:19:33 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.379 X-Spam-Level: ** X-Spam-Status: No, score=2.379 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, RCVD_IN_SORBS_SPAM=0.5, SPF_PASS=-0.001] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id 0MpXsM_7Xd5B for ; Tue, 1 Aug 2017 09:19:30 +0000 (UTC) Received: from mail-wm0-f47.google.com (mail-wm0-f47.google.com [74.125.82.47]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 355ED5F476 for ; Tue, 1 Aug 2017 09:19:30 +0000 (UTC) Received: by mail-wm0-f47.google.com with SMTP id m85so8901056wma.1 for ; Tue, 01 Aug 2017 02:19:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to; bh=tvnTvHN11ssV4o04Vg/fiNsvmRlynvqOV4rJvq+LmyQ=; b=kfSePBzMfFpmkeo4gfCKZt/fKMv46Uysi1tcqF4GdTZkRKta7s6VxY1RodoFkGLdQC eXcGgg1yG4m/X4xuc3hkdKrWsBumtJceRXi+LjBP5dNJXiiDjCb0HaBNR7b37JUgBFRg pcx2n/DLiSPR4gx2lsxLx9uxWIdJjuxjqBzwBzfktUmqYUznNveUctsxYHZ6Tjn1Gh9F KsukjcQPJaslGIXHfv8o3dse7Bcc6erAtN+EuqPdg9EhnNA4H/EcDPE7/+NHB50Ln4bP xH5gIKQu7nGo6DzwFfqBJ2GCrpIKIOubnqoLAI7/GzsHTYv6TWrX8PxjwFXJQEHgxxd1 YZcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to; bh=tvnTvHN11ssV4o04Vg/fiNsvmRlynvqOV4rJvq+LmyQ=; b=IYv7yqWfG2+YrNQVpwtg7jWI7DJdseRhFCFvRZS8Ldmk42eyXztIMby4YmPcCnFWGT ZDISfb8eBvIVvXMmh7lMSk2Pm4mqMGlkt9ym4r2ICbFvp9fzP3KV5x03wdyatn/oHd7c ewNJcqR+L8rSavOdQa0Dq/fobqRP/quDn1eLvTrhJqkWz5hwZ6sqI2oCVMPn+hHGc6mX /yc+aNNh7f1SEyPchg7w75iFbP3TIE5OX5ehIXSxVFPhV7SaQmb7oeDFi7cSo+abZKPg Iu28YXfhRIjfDwKwgJqOiVaOyok/Shq0TVXrmYEZFQE/xy7GnnMTBSnl7uPIyBINmCH0 3FLQ== X-Gm-Message-State: AIVw111s+uo7+QXUUK47gCR3HMwsUiXnChQ1UBXmNJxJvz0iSNgqGXu4 R8/BcvWtkrDgvtMkXIkQBKg+wgn8/ZUsfRo= X-Received: by 10.28.157.81 with SMTP id g78mr952279wme.76.1501579163605; Tue, 01 Aug 2017 02:19:23 -0700 (PDT) MIME-Version: 1.0 Received: by 10.28.84.94 with HTTP; Tue, 1 Aug 2017 02:19:22 -0700 (PDT) In-Reply-To: References: From: Tom Bentley Date: Tue, 1 Aug 2017 10:19:22 +0100 Message-ID: Subject: Re: [DISCUSS] KIP-179: Change ReassignPartitionsCommand to use AdminClient To: dev@kafka.apache.org Content-Type: multipart/alternative; boundary="001a114ba16aa61f1e0555ada49e" archived-at: Tue, 01 Aug 2017 09:19:36 -0000 --001a114ba16aa61f1e0555ada49e Content-Type: text/plain; charset="UTF-8" I have added the timeout I mentioned before, because it makes the implementation of topic alteration more symmetric with the topic creation APIs. I have also added a section ("Policy") on retrofitting the CreateTopicPolicy's rules to topic alteration requests, and made a few other minor fixes. I've still not factored out the topic name from the ReplicaStatusRequest (Ismael, do you have an opinion about that?) If any one else has some feedback on this KIP that would be great. Otherwise, I would like to start a vote on this KIP before the end of the week. Thanks, Tom On 26 July 2017 at 11:45, Tom Bentley wrote: > I've updated the KIP to fix those niggles, but I've not factored out the > topic name from the ReplicaStatusRequest, yet. > > Looking at the topic creation APIs in more detail, the CreateTopicsOptions > has > > * `shouldValidateOnly()`, which would make a lot of sense for the alter > topic APIs > * `timeoutMs()`, which I'm not sure sure about... > > Topic creation doesn't require shifting replicas between brokers so it's > reasonable to support timeout, because we don't expect it to take very long. > > Topic alteration usually takes a while because we are going to have to > move replicas. Since we're adding a whole API to track the progress of that > replication, I'm inclined to think that having a timeout is a bit pointless. > > But should the replicaStatus() API have a timeout? I suppose it probably > should. > > > On 26 July 2017 at 10:58, Tom Bentley wrote: > >> Thanks Paolo, >> >> * in the "Public Interfaces" section you wrote >>> alterTopics(Set) but then a collection is used (instead of a >>> set) in the Proposed Changes section. I'm ok with collection. >>> >> >> Agree it should be Collection. >> >> >>> * in the summary of the alterTopics method you say "The request can >>> change the number of partitions, replication factor and/or the partition >>> assignments." I think that the "and/or" is misleading (at least for me). >>> For the TopicCommand tool you can specify partitions AND replication factor >>> ... OR partition assignments (but not for example partitions AND >>> replication factor AND partition assignments). Maybe it could be "The >>> request can change the number of partitions and the related replication >>> factor or specifying a partition assignments." >>> >> >> Is there a reason why we can't support changing all three at once? It >> certainly makes conceptual sense to, say, increase the number of partitions >> and replication factor, and specify how you want the partitions assigned. >> And doing two separate calls would be less efficient as we sync new >> replicas on brokers only to then move them somewhere else. >> >> If there is a reason we don't want to support changing all three, then we >> can return the error INVALID_REQUEST (42). That would allow us to support >> changing all three at some time in the future, without having to change the >> API. >> >> >>> * I know that it would be a breaking change in the Admin Client API >>> but why having an AlteredTopic class which is quite similar to the already >>> existing NewTopic class ? I know that using NewTopic for the alter method >>> could be misleading. What about renaming NewTopic in something like >>> AdminTopic ? At same time it means that the TopicDetails class (which you >>> can get from the current NewTopic) should be outside the >>> CreateTopicsRequest because it could be used in the AlterTopicsRequest as >>> well. >>> >> >> One problem with this is it tends to inhibit future API changes for >> either newTopics() or alterTopics(), because any common class needs to make >> sense in both contexts. >> >> For createTopics() we get to specify some configs (the >> Map), but since the AdminClient already has alterConfigs() >> for changing topic configs after topic creation I don't think it's right to >> also support changing those configs via alterTopics() as well. But having >> them in a common AdminTopic class would suggest that that was supported. >> Yes, alterTopics could return INVALID_REQUEST if it was given topic >> configs, but this is just making the API harder to use since it is >> weakening of the type safety of the API. >> >> I suppose we *could* write a common TopicDetails class and make the >> existing nested one extend the common one, with deprecations, to eventually >> remove the nested one. >> >> >> >>> * A typo in the ReplicaStatus : gpartition() instead of partition() >>> * In the AlterTopicRequets >>> * the replication factor should be INT16 >>> >> >> Ah, thanks! >> >> >>> * I would use same fields name as CreateTopicsRequest (they are >>> quite similar) >>> * What's the broker id in the ReplicaStatusRequest ? >>> >> >> It's the broker, which is expected to have a replica of the given >> partition, that we're querying the status of. It is necessary because we're >> asking the _leader_ for the partition about (a subset of) the status of the >> followers. Put another way, to identify the replica of a particular >> partition on a particular broker we need the tuple (topic, partition, >> broker). >> >> If we were always interested in the status of the partition across all >> brokers we could omit the broker part. But for reassignment we actually >> only care about a subset of the brokers. >> >> >>> * Thinking aloud, could make sense having "Partition" in the >>> ReplicaStatusRequest as an array so that I can specify in only one request >>> the status for partitions I'm interested in, in order to avoid e request >>> for each partition for the same topic. Maybe empty array could mean .. >>> "give me status for all partitions of this topic". Of course it means that >>> the ReplicaStatusResponse should change because we should have an array >>> with partition, broker, lag and so on >>> >> >> You already can specify in one request the status for all the partitions >> you're interested in (ReplicaStatus can be repeated/is an array field). >> >> We could factor out the topic to avoid repeating it, which would be more >> efficient when we're querying the status of many partitions of a topic >> and/or there are many brokers holding replicas. In other words, we could >> factor it to look like this: >> >> ReplicaStatusRequest => [TopicReplicas] >> TopicReplicas => Topic [PartitionReplica] >> Topic => string >> PartitionReplica => Partition Broker >> Partition => int32 >> Broker => int32 >> >> Does this make sense to you? >> >> > --001a114ba16aa61f1e0555ada49e--