kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Kowshik Prakasam <kpraka...@confluent.io>
Subject Re: [DISCUSS] KIP-584: Versioning scheme for features
Date Thu, 16 Apr 2020 01:28:25 GMT
Hi Jun,

Sorry the links were broken in my last response, here are the right links:

200. https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioning
Scheme For Features-Validations
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Validations>
110. https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-When
To Use Versioned Feature Flags?
<https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Whentouseversionedfeatureflags?>


Cheers,
Kowshik

On Wed, Apr 15, 2020 at 6:24 PM Kowshik Prakasam <kprakasam@confluent.io>
wrote:

>
> Hi Jun,
>
> Thanks for the feedback! I have addressed the comments in the KIP.
>
> > 200. In the validation section, there is still the text  "*from*
> > {"max_version_level":
> > X} *to* {"max_version_level": X’}". It seems that it should say "from X
> to
> > Y"?
>
> (Kowshik): Done. I have reworded it a bit to make it clearer now in this
> section:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Validations
>
> > 110. Could we add that we need to document the bumped version of each
> > feature in the upgrade section of a release?
>
> (Kowshik): Great point! Done, I have mentioned it in #3 this section:
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584
> <https://issues.apache.org/jira/browse/KIP-584>
> %3A+Versioning+scheme+for+features#KIP-584
> <https://issues.apache.org/jira/browse/KIP-584>
> :Versioningschemeforfeatures-Whentouseversionedfeatureflags?
>
>
> Cheers,
> Kowshik
>
> On Wed, Apr 15, 2020 at 4:00 PM Jun Rao <jun@confluent.io> wrote:
>
>> Hi, Kowshik,
>>
>> Looks good to me now. Just a couple of minor things below.
>>
>> 200. In the validation section, there is still the text  "*from*
>> {"max_version_level":
>> X} *to* {"max_version_level": X’}". It seems that it should say "from X to
>> Y"?
>>
>> 110. Could we add that we need to document the bumped version of each
>> feature in the upgrade section of a release?
>>
>> Thanks,
>>
>> Jun
>>
>> On Wed, Apr 15, 2020 at 1:08 PM Kowshik Prakasam <kprakasam@confluent.io>
>> wrote:
>>
>> > Hi Jun,
>> >
>> > Thank you for the suggestion! I have updated the KIP, please find my
>> > response below.
>> >
>> > > 200. I guess you are saying only when the allowDowngrade field is set,
>> > the
>> > > finalized feature version can go backward. Otherwise, it can only go
>> up.
>> > > That makes sense. It would be useful to make that clear when
>> explaining
>> > > the usage of the allowDowngrade field. In the validation section, we
>> > have  "
>> > > /features' from {"max_version_level": X} to {"max_version_level":
>> X’}",
>> > it
>> > > seems that we need to mention Y there.
>> >
>> > (Kowshik): Great point! Yes, that is correct. Done, I have updated the
>> > validations
>> > section explaining the above. Here is a link to this section:
>> >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Validations
>> >
>> >
>> > Cheers,
>> > Kowshik
>> >
>> >
>> >
>> >
>> > On Wed, Apr 15, 2020 at 11:05 AM Jun Rao <jun@confluent.io> wrote:
>> >
>> > > Hi, Kowshik,
>> > >
>> > > 200. I guess you are saying only when the allowDowngrade field is set,
>> > the
>> > > finalized feature version can go backward. Otherwise, it can only go
>> up.
>> > > That makes sense. It would be useful to make that clear when
>> explaining
>> > > the usage of the allowDowngrade field. In the validation section, we
>> have
>> > > "
>> > > /features' from {"max_version_level": X} to {"max_version_level":
>> X’}",
>> > it
>> > > seems that we need to mention Y there.
>> > >
>> > > Thanks,
>> > >
>> > > Jun
>> > >
>> > > On Wed, Apr 15, 2020 at 10:44 AM Kowshik Prakasam <
>> > kprakasam@confluent.io>
>> > > wrote:
>> > >
>> > > > Hi Jun,
>> > > >
>> > > > Great question! Please find my response below.
>> > > >
>> > > > > 200. My understanding is that If the CLI tool passes the
>> > > > > '--allow-downgrade' flag when updating a specific feature, then a
>> > > future
>> > > > > downgrade is possible. Otherwise, the feature is now
>> downgradable. If
>> > > so,
>> > > > I
>> > > > > was wondering how the controller remembers this since it can be
>> > > restarted
>> > > > > over time?
>> > > >
>> > > > (Kowshik): The purpose of the flag was to just restrict the user
>> intent
>> > > for
>> > > > a specific request.
>> > > > It seems to me that to avoid confusion, I could call the flag as
>> > > > `--try-downgrade` instead.
>> > > > Then this makes it clear, that, the controller just has to consider
>> the
>> > > ask
>> > > > from
>> > > > the user as an explicit request to attempt a downgrade.
>> > > >
>> > > > The flag does not act as an override on controller's decision making
>> > that
>> > > > decides whether
>> > > > a flag is downgradable (these decisions on whether to allow a flag
>> to
>> > be
>> > > > downgraded
>> > > > from a specific version level, can be embedded in the controller
>> code).
>> > > >
>> > > > Please let me know what you think.
>> > > > Sorry if I misunderstood the original question.
>> > > >
>> > > >
>> > > > Cheers,
>> > > > Kowshik
>> > > >
>> > > >
>> > > > On Wed, Apr 15, 2020 at 9:40 AM Jun Rao <jun@confluent.io> wrote:
>> > > >
>> > > > > Hi, Kowshik,
>> > > > >
>> > > > > Thanks for the reply. Makes sense. Just one more question.
>> > > > >
>> > > > > 200. My understanding is that If the CLI tool passes the
>> > > > > '--allow-downgrade' flag when updating a specific feature, then a
>> > > future
>> > > > > downgrade is possible. Otherwise, the feature is now
>> downgradable. If
>> > > > so, I
>> > > > > was wondering how the controller remembers this since it can be
>> > > restarted
>> > > > > over time?
>> > > > >
>> > > > > Jun
>> > > > >
>> > > > >
>> > > > > On Tue, Apr 14, 2020 at 6:49 PM Kowshik Prakasam <
>> > > kprakasam@confluent.io
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > Hi Jun,
>> > > > > >
>> > > > > > Thanks a lot for the feedback and the questions!
>> > > > > > Please find my response below.
>> > > > > >
>> > > > > > > 200. The UpdateFeaturesRequest includes an AllowDowngrade
>> field.
>> > It
>> > > > > seems
>> > > > > > > that field needs to be persisted somewhere in ZK?
>> > > > > >
>> > > > > > (Kowshik): Great question! Below is my explanation. Please help
>> me
>> > > > > > understand,
>> > > > > > if you feel there are cases where we would need to still
>> persist it
>> > > in
>> > > > > ZK.
>> > > > > >
>> > > > > > Firstly I have updated my thoughts into the KIP now, under the
>> > > > > 'guidelines'
>> > > > > > section:
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guidelinesonfeatureversionsandworkflows
>> > > > > >
>> > > > > > The allowDowngrade boolean field is just to restrict the user
>> > intent,
>> > > > and
>> > > > > > to remind
>> > > > > > them to double check their intent before proceeding. It should
>> be
>> > set
>> > > > to
>> > > > > > true
>> > > > > > by the user in a request, only when the user intent is to
>> > forcefully
>> > > > > > "attempt" a
>> > > > > > downgrade of a specific feature's max version level, to the
>> > provided
>> > > > > value
>> > > > > > in
>> > > > > > the request.
>> > > > > >
>> > > > > > We can extend this safeguard. The controller (on it's end) can
>> > > maintain
>> > > > > > rules in the code, that, for safety reasons would outright
>> reject
>> > > > certain
>> > > > > > downgrades
>> > > > > > from a specific max_version_level for a specific feature. Such
>> > > > rejections
>> > > > > > may
>> > > > > > happen depending on the feature being downgraded, and from what
>> > > version
>> > > > > > level.
>> > > > > >
>> > > > > > The CLI tool only allows a downgrade attempt in conjunction with
>> > > > specific
>> > > > > > flags and sub-commands. For example, in the CLI tool, if the
>> user
>> > > uses
>> > > > > the
>> > > > > > 'downgrade-all' command, or passes '--allow-downgrade' flag when
>> > > > > updating a
>> > > > > > specific feature, only then the tool will translate this ask to
>> > > setting
>> > > > > > 'allowDowngrade' field in the request to the server.
>> > > > > >
>> > > > > > > 201. UpdateFeaturesResponse has the following top level
>> fields.
>> > > > Should
>> > > > > > > those fields be per feature?
>> > > > > > >
>> > > > > > >   "fields": [
>> > > > > > >     { "name": "ErrorCode", "type": "int16", "versions": "0+",
>> > > > > > >       "about": "The error code, or 0 if there was no error."
>> },
>> > > > > > >     { "name": "ErrorMessage", "type": "string", "versions":
>> "0+",
>> > > > > > >       "about": "The error message, or null if there was no
>> > error."
>> > > }
>> > > > > > >   ]
>> > > > > >
>> > > > > > (Kowshik): Great question!
>> > > > > > As such, the API is transactional, as explained in the sections
>> > > linked
>> > > > > > below.
>> > > > > > Either all provided FeatureUpdate was applied, or none.
>> > > > > > It's the reason I felt we can have just one error code +
>> message.
>> > > > > > Happy to extend this if you feel otherwise. Please let me know.
>> > > > > >
>> > > > > > Link to sections:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-ChangestoKafkaController
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guarantees
>> > > > > >
>> > > > > > > 202. The /features path in ZK has a field min_version_level.
>> > Which
>> > > > API
>> > > > > > and
>> > > > > > > tool can change that value?
>> > > > > >
>> > > > > > (Kowshik): Great question! Currently this cannot be modified by
>> > using
>> > > > the
>> > > > > > API or the tool.
>> > > > > > Feature version deprecation (by raising min_version_level) can
>> be
>> > > done
>> > > > > only
>> > > > > > by the Controller directly. The rationale is explained in this
>> > > section:
>> > > > > >
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Featureversiondeprecation
>> > > > > >
>> > > > > >
>> > > > > > Cheers,
>> > > > > > Kowshik
>> > > > > >
>> > > > > > On Tue, Apr 14, 2020 at 5:33 PM Jun Rao <jun@confluent.io>
>> wrote:
>> > > > > >
>> > > > > > > Hi, Kowshik,
>> > > > > > >
>> > > > > > > Thanks for addressing those comments. Just a few more minor
>> > > comments.
>> > > > > > >
>> > > > > > > 200. The UpdateFeaturesRequest includes an AllowDowngrade
>> field.
>> > It
>> > > > > seems
>> > > > > > > that field needs to be persisted somewhere in ZK?
>> > > > > > >
>> > > > > > > 201. UpdateFeaturesResponse has the following top level
>> fields.
>> > > > Should
>> > > > > > > those fields be per feature?
>> > > > > > >
>> > > > > > >   "fields": [
>> > > > > > >     { "name": "ErrorCode", "type": "int16", "versions": "0+",
>> > > > > > >       "about": "The error code, or 0 if there was no error."
>> },
>> > > > > > >     { "name": "ErrorMessage", "type": "string", "versions":
>> "0+",
>> > > > > > >       "about": "The error message, or null if there was no
>> > error."
>> > > }
>> > > > > > >   ]
>> > > > > > >
>> > > > > > > 202. The /features path in ZK has a field min_version_level.
>> > Which
>> > > > API
>> > > > > > and
>> > > > > > > tool can change that value?
>> > > > > > >
>> > > > > > > Jun
>> > > > > > >
>> > > > > > > On Mon, Apr 13, 2020 at 5:12 PM Kowshik Prakasam <
>> > > > > kprakasam@confluent.io
>> > > > > > >
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi Jun,
>> > > > > > > >
>> > > > > > > > Thanks for the feedback! I have updated the KIP-584
>> addressing
>> > > your
>> > > > > > > > comments.
>> > > > > > > > Please find my response below.
>> > > > > > > >
>> > > > > > > > > 100.6 You can look for the sentence "This operation
>> requires
>> > > > ALTER
>> > > > > on
>> > > > > > > > > CLUSTER." in KIP-455. Also, you can check its usage in
>> > > > > > > > > KafkaApis.authorize().
>> > > > > > > >
>> > > > > > > > (Kowshik): Done. Great point! For the newly introduced
>> > > > > UPDATE_FEATURES
>> > > > > > > api,
>> > > > > > > > I have added a
>> > > > > > > > requirement that AclOperation.ALTER is required on
>> > > > > > ResourceType.CLUSTER.
>> > > > > > > >
>> > > > > > > > > 110. Keeping the feature version as int is probably fine.
>> I
>> > > just
>> > > > > felt
>> > > > > > > > that
>> > > > > > > > > for some of the common user interactions, it's more
>> > convenient
>> > > to
>> > > > > > > > > relate that to a release version. For example, if a user
>> > wants
>> > > to
>> > > > > > > > downgrade
>> > > > > > > > > to a release 2.5, it's easier for the user to use the tool
>> > like
>> > > > > "tool
>> > > > > > > > > --downgrade 2.5" instead of "tool --downgrade --feature X
>> > > > --version
>> > > > > > 6".
>> > > > > > > >
>> > > > > > > > (Kowshik): Great point. Generally, maximum feature version
>> > levels
>> > > > are
>> > > > > > not
>> > > > > > > > downgradable after
>> > > > > > > > they are finalized in the cluster. This is because, as a
>> > > guideline
>> > > > > > > bumping
>> > > > > > > > feature version level usually is used mainly to convey
>> > important
>> > > > > > breaking
>> > > > > > > > changes.
>> > > > > > > > Despite the above, there may be some extreme/rare cases
>> where a
>> > > > user
>> > > > > > > wants
>> > > > > > > > to downgrade
>> > > > > > > > all features to a specific previous release. The user may
>> want
>> > to
>> > > > do
>> > > > > > this
>> > > > > > > > just
>> > > > > > > > prior to rolling back a Kafka cluster to a previous release.
>> > > > > > > >
>> > > > > > > > To support the above, I have made a change to the KIP
>> > explaining
>> > > > that
>> > > > > > the
>> > > > > > > > CLI tool is versioned.
>> > > > > > > > The CLI tool internally has knowledge about a map of
>> features
>> > to
>> > > > > their
>> > > > > > > > respective max
>> > > > > > > > versions supported by the Broker. The tool's knowledge of
>> > > features
>> > > > > and
>> > > > > > > > their version values,
>> > > > > > > > is limited to the version of the CLI tool itself i.e. the
>> > > > information
>> > > > > > is
>> > > > > > > > packaged into the CLI tool
>> > > > > > > > when it is released. Whenever a Kafka release introduces a
>> new
>> > > > > feature
>> > > > > > > > version, or modifies
>> > > > > > > > an existing feature version, the CLI tool shall also be
>> updated
>> > > > with
>> > > > > > this
>> > > > > > > > information,
>> > > > > > > > Newer versions of the CLI tool will be released as part of
>> the
>> > > > Kafka
>> > > > > > > > releases.
>> > > > > > > >
>> > > > > > > > Therefore, to achieve the downgrade need, the user just
>> needs
>> > to
>> > > > run
>> > > > > > the
>> > > > > > > > version of
>> > > > > > > > the CLI tool that's part of the particular previous release
>> > that
>> > > > > he/she
>> > > > > > > is
>> > > > > > > > downgrading to.
>> > > > > > > > To help the user with this, there is a new command added to
>> the
>> > > CLI
>> > > > > > tool
>> > > > > > > > called `downgrade-all`.
>> > > > > > > > This essentially downgrades max version levels of all
>> features
>> > in
>> > > > the
>> > > > > > > > cluster to the versions
>> > > > > > > > known to the CLI tool internally.
>> > > > > > > >
>> > > > > > > > I have explained the above in the KIP under these sections:
>> > > > > > > >
>> > > > > > > > Tooling support (have explained that the CLI tool is
>> > versioned):
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Toolingsupport
>> > > > > > > >
>> > > > > > > > Regular CLI tool usage (please refer to point #3, and see
>> the
>> > > > tooling
>> > > > > > > > example)
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-RegularCLItoolusage
>> > > > > > > >
>> > > > > > > > > 110. Similarly, if the client library finds a feature
>> > mismatch
>> > > > with
>> > > > > > the
>> > > > > > > > broker,
>> > > > > > > > > the client likely needs to log some error message for the
>> > user
>> > > to
>> > > > > > take
>> > > > > > > > some
>> > > > > > > > > actions. It's much more actionable if the error message is
>> > > > "upgrade
>> > > > > > the
>> > > > > > > > > broker to release version 2.6" than just "upgrade the
>> broker
>> > to
>> > > > > > feature
>> > > > > > > > > version 7".
>> > > > > > > >
>> > > > > > > > (Kowshik): That's a really good point! If we use ints for
>> > feature
>> > > > > > > versions,
>> > > > > > > > the best
>> > > > > > > > message that client can print for debugging is "broker
>> doesn't
>> > > > > support
>> > > > > > > > feature version 7", and alongside that print the supported
>> > > version
>> > > > > > range
>> > > > > > > > returned
>> > > > > > > > by the broker. Then, does it sound reasonable that the user
>> > could
>> > > > > then
>> > > > > > > > reference
>> > > > > > > > Kafka release logs to figure out which version of the broker
>> > > > release
>> > > > > is
>> > > > > > > > required
>> > > > > > > > be deployed, to support feature version 7? I couldn't think
>> of
>> > a
>> > > > > better
>> > > > > > > > strategy here.
>> > > > > > > >
>> > > > > > > > > 120. When should a developer bump up the version of a
>> > feature?
>> > > > > > > >
>> > > > > > > > (Kowshik): Great question! In the KIP, I have added a
>> section:
>> > > > > > > 'Guidelines
>> > > > > > > > on feature versions and workflows'
>> > > > > > > > providing some guidelines on when to use the versioned
>> feature
>> > > > flags,
>> > > > > > and
>> > > > > > > > what
>> > > > > > > > are the regular workflows with the CLI tool.
>> > > > > > > >
>> > > > > > > > Link to the relevant sections:
>> > > > > > > > Guidelines:
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Guidelinesonfeatureversionsandworkflows
>> > > > > > > >
>> > > > > > > > Regular CLI tool usage:
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-RegularCLItoolusage
>> > > > > > > >
>> > > > > > > > Advanced CLI tool usage:
>> > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-AdvancedCLItoolusage
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Cheers,
>> > > > > > > > Kowshik
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, Apr 10, 2020 at 4:25 PM Jun Rao <jun@confluent.io>
>> > > wrote:
>> > > > > > > >
>> > > > > > > > > Hi, Kowshik,
>> > > > > > > > >
>> > > > > > > > > Thanks for the reply. A few more comments.
>> > > > > > > > >
>> > > > > > > > > 110. Keeping the feature version as int is probably fine.
>> I
>> > > just
>> > > > > felt
>> > > > > > > > that
>> > > > > > > > > for some of the common user interactions, it's more
>> > convenient
>> > > to
>> > > > > > > > > relate that to a release version. For example, if a user
>> > wants
>> > > to
>> > > > > > > > downgrade
>> > > > > > > > > to a release 2.5, it's easier for the user to use the tool
>> > like
>> > > > > "tool
>> > > > > > > > > --downgrade 2.5" instead of "tool --downgrade --feature X
>> > > > --version
>> > > > > > 6".
>> > > > > > > > > Similarly, if the client library finds a feature mismatch
>> > with
>> > > > the
>> > > > > > > > broker,
>> > > > > > > > > the client likely needs to log some error message for the
>> > user
>> > > to
>> > > > > > take
>> > > > > > > > some
>> > > > > > > > > actions. It's much more actionable if the error message is
>> > > > "upgrade
>> > > > > > the
>> > > > > > > > > broker to release version 2.6" than just "upgrade the
>> broker
>> > to
>> > > > > > feature
>> > > > > > > > > version 7".
>> > > > > > > > >
>> > > > > > > > > 111. Sounds good.
>> > > > > > > > >
>> > > > > > > > > 120. When should a developer bump up the version of a
>> > feature?
>> > > > > > > > >
>> > > > > > > > > Jun
>> > > > > > > > >
>> > > > > > > > > On Tue, Apr 7, 2020 at 12:26 AM Kowshik Prakasam <
>> > > > > > > kprakasam@confluent.io
>> > > > > > > > >
>> > > > > > > > > wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi Jun,
>> > > > > > > > > >
>> > > > > > > > > > I have updated the KIP for the item 111.
>> > > > > > > > > > I'm in the process of addressing 100.6, and will
>> provide an
>> > > > > update
>> > > > > > > > soon.
>> > > > > > > > > > I think item 110 is still under discussion given we are
>> now
>> > > > > > > providing a
>> > > > > > > > > way
>> > > > > > > > > > to finalize
>> > > > > > > > > > all features to their latest version levels. In any
>> case,
>> > > > please
>> > > > > > let
>> > > > > > > us
>> > > > > > > > > > know
>> > > > > > > > > > how you feel in response to Colin's comments on this
>> topic.
>> > > > > > > > > >
>> > > > > > > > > > > 111. To put this in context, when we had IBP, the
>> default
>> > > > value
>> > > > > > is
>> > > > > > > > the
>> > > > > > > > > > > current released version. So, if you are a brand new
>> > user,
>> > > > you
>> > > > > > > don't
>> > > > > > > > > need
>> > > > > > > > > > > to configure IBP and all new features will be
>> immediately
>> > > > > > available
>> > > > > > > > in
>> > > > > > > > > > the
>> > > > > > > > > > > new cluster. If you are upgrading from an old version,
>> > you
>> > > do
>> > > > > > need
>> > > > > > > to
>> > > > > > > > > > > understand and configure IBP. I see a similar pattern
>> > here
>> > > > for
>> > > > > > > > > > > features. From the ease of use perspective, ideally,
>> we
>> > > > > shouldn't
>> > > > > > > > > require
>> > > > > > > > > > a
>> > > > > > > > > > > new user to have an extra step such as running a
>> > bootstrap
>> > > > > script
>> > > > > > > > > unless
>> > > > > > > > > > > it's truly necessary. If someone has a special need
>> (all
>> > > the
>> > > > > > cases
>> > > > > > > > you
>> > > > > > > > > > > mentioned seem special cases?), they can configure a
>> mode
>> > > > such
>> > > > > > that
>> > > > > > > > > > > features are enabled/disabled manually.
>> > > > > > > > > >
>> > > > > > > > > > (Kowshik): That makes sense, thanks for the idea! Sorry
>> if
>> > I
>> > > > > didn't
>> > > > > > > > > > understand
>> > > > > > > > > > this need earlier. I have updated the KIP with the
>> approach
>> > > > that
>> > > > > > > > whenever
>> > > > > > > > > > the '/features' node is absent, the controller by
>> default
>> > > will
>> > > > > > > > bootstrap
>> > > > > > > > > > the node
>> > > > > > > > > > to contain the latest feature levels. Here is the new
>> > section
>> > > > in
>> > > > > > the
>> > > > > > > > KIP
>> > > > > > > > > > describing
>> > > > > > > > > > the same:
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Controller:ZKnodebootstrapwithdefaultvalues
>> > > > > > > > > >
>> > > > > > > > > > Next, as I explained in my response to Colin's
>> suggestions,
>> > > we
>> > > > > are
>> > > > > > > now
>> > > > > > > > > > providing a `--finalize-latest-features` flag with the
>> > > tooling.
>> > > > > > This
>> > > > > > > > lets
>> > > > > > > > > > the sysadmin finalize all features known to the
>> controller
>> > to
>> > > > > their
>> > > > > > > > > latest
>> > > > > > > > > > version
>> > > > > > > > > > levels. Please look at this section (point #3 and the
>> > tooling
>> > > > > > example
>> > > > > > > > > > later):
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Toolingsupport
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Do you feel this addresses your comment/concern?
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > Cheers,
>> > > > > > > > > > Kowshik
>> > > > > > > > > >
>> > > > > > > > > > On Mon, Apr 6, 2020 at 12:06 PM Jun Rao <
>> jun@confluent.io>
>> > > > > wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Hi, Kowshik,
>> > > > > > > > > > >
>> > > > > > > > > > > Thanks for the reply. A few more replies below.
>> > > > > > > > > > >
>> > > > > > > > > > > 100.6 You can look for the sentence "This operation
>> > > requires
>> > > > > > ALTER
>> > > > > > > on
>> > > > > > > > > > > CLUSTER." in KIP-455. Also, you can check its usage in
>> > > > > > > > > > > KafkaApis.authorize().
>> > > > > > > > > > >
>> > > > > > > > > > > 110. From the external client/tooling perspective,
>> it's
>> > > more
>> > > > > > > natural
>> > > > > > > > to
>> > > > > > > > > > use
>> > > > > > > > > > > the release version for features. If we can use the
>> same
>> > > > > release
>> > > > > > > > > version
>> > > > > > > > > > > for internal representation, it seems simpler (easier
>> to
>> > > > > > > understand,
>> > > > > > > > no
>> > > > > > > > > > > mapping overhead, etc). Is there a benefit with
>> separate
>> > > > > external
>> > > > > > > and
>> > > > > > > > > > > internal versioning schemes?
>> > > > > > > > > > >
>> > > > > > > > > > > 111. To put this in context, when we had IBP, the
>> default
>> > > > value
>> > > > > > is
>> > > > > > > > the
>> > > > > > > > > > > current released version. So, if you are a brand new
>> > user,
>> > > > you
>> > > > > > > don't
>> > > > > > > > > need
>> > > > > > > > > > > to configure IBP and all new features will be
>> immediately
>> > > > > > available
>> > > > > > > > in
>> > > > > > > > > > the
>> > > > > > > > > > > new cluster. If you are upgrading from an old version,
>> > you
>> > > do
>> > > > > > need
>> > > > > > > to
>> > > > > > > > > > > understand and configure IBP. I see a similar pattern
>> > here
>> > > > for
>> > > > > > > > > > > features. From the ease of use perspective, ideally,
>> we
>> > > > > shouldn't
>> > > > > > > > > > require a
>> > > > > > > > > > > new user to have an extra step such as running a
>> > bootstrap
>> > > > > script
>> > > > > > > > > unless
>> > > > > > > > > > > it's truly necessary. If someone has a special need
>> (all
>> > > the
>> > > > > > cases
>> > > > > > > > you
>> > > > > > > > > > > mentioned seem special cases?), they can configure a
>> mode
>> > > > such
>> > > > > > that
>> > > > > > > > > > > features are enabled/disabled manually.
>> > > > > > > > > > >
>> > > > > > > > > > > Jun
>> > > > > > > > > > >
>> > > > > > > > > > > On Fri, Apr 3, 2020 at 5:54 PM Kowshik Prakasam <
>> > > > > > > > > kprakasam@confluent.io>
>> > > > > > > > > > > wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi Jun,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks for the feedback and suggestions. Please
>> find my
>> > > > > > response
>> > > > > > > > > below.
>> > > > > > > > > > > >
>> > > > > > > > > > > > > 100.6 For every new request, the admin needs to
>> > control
>> > > > who
>> > > > > > is
>> > > > > > > > > > allowed
>> > > > > > > > > > > to
>> > > > > > > > > > > > > issue that request if security is enabled. So, we
>> > need
>> > > to
>> > > > > > > assign
>> > > > > > > > > the
>> > > > > > > > > > > new
>> > > > > > > > > > > > > request a ResourceType and possible AclOperations.
>> > See
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment
>> > > > > > > > > > > > > as an example.
>> > > > > > > > > > > >
>> > > > > > > > > > > > (Kowshik): I don't see any reference to the words
>> > > > > ResourceType
>> > > > > > or
>> > > > > > > > > > > > AclOperations
>> > > > > > > > > > > > in the KIP. Please let me know how I can use the KIP
>> > that
>> > > > you
>> > > > > > > > linked
>> > > > > > > > > to
>> > > > > > > > > > > > know how to
>> > > > > > > > > > > > setup the appropriate ResourceType and/or
>> > > ClusterOperation?
>> > > > > > > > > > > >
>> > > > > > > > > > > > > 105. If we change delete to disable, it's better
>> to
>> > do
>> > > > this
>> > > > > > > > > > > consistently
>> > > > > > > > > > > > in
>> > > > > > > > > > > > > request protocol and admin api as well.
>> > > > > > > > > > > >
>> > > > > > > > > > > > (Kowshik): The API shouldn't be called 'disable'
>> when
>> > it
>> > > is
>> > > > > > > > deleting
>> > > > > > > > > a
>> > > > > > > > > > > > feature.
>> > > > > > > > > > > > I've just changed the KIP to use 'delete'. I don't
>> > have a
>> > > > > > strong
>> > > > > > > > > > > > preference.
>> > > > > > > > > > > >
>> > > > > > > > > > > > > 110. The minVersion/maxVersion for features use
>> > int64.
>> > > > > > > Currently,
>> > > > > > > > > our
>> > > > > > > > > > > > > release version schema is major.minor.bugfix (e.g.
>> > > > 2.5.0).
>> > > > > > It's
>> > > > > > > > > > > possible
>> > > > > > > > > > > > > for new features to be included in minor releases
>> > too.
>> > > > > Should
>> > > > > > > we
>> > > > > > > > > make
>> > > > > > > > > > > the
>> > > > > > > > > > > > > feature versioning match the release versioning?
>> > > > > > > > > > > >
>> > > > > > > > > > > > (Kowshik): The release version can be mapped to a
>> set
>> > of
>> > > > > > feature
>> > > > > > > > > > > versions,
>> > > > > > > > > > > > and this can be done, for example in the tool (or
>> even
>> > > > > external
>> > > > > > > to
>> > > > > > > > > the
>> > > > > > > > > > > > tool).
>> > > > > > > > > > > > Can you please clarify what I'm missing?
>> > > > > > > > > > > >
>> > > > > > > > > > > > > 111. "During regular operations, the data in the
>> ZK
>> > > node
>> > > > > can
>> > > > > > be
>> > > > > > > > > > mutated
>> > > > > > > > > > > > > only via a specific admin API served only by the
>> > > > > > controller." I
>> > > > > > > > am
>> > > > > > > > > > > > > wondering why can't the controller auto finalize a
>> > > > feature
>> > > > > > > > version
>> > > > > > > > > > > after
>> > > > > > > > > > > > > all brokers are upgraded? For new users who
>> download
>> > > the
>> > > > > > latest
>> > > > > > > > > > version
>> > > > > > > > > > > > to
>> > > > > > > > > > > > > build a new cluster, it's inconvenient for them to
>> > have
>> > > > to
>> > > > > > > > manually
>> > > > > > > > > > > > enable
>> > > > > > > > > > > > > each feature.
>> > > > > > > > > > > >
>> > > > > > > > > > > > (Kowshik): I agree that there is a trade-off here,
>> but
>> > it
>> > > > > will
>> > > > > > > help
>> > > > > > > > > > > > to decide whether the automation can be thought
>> through
>> > > in
>> > > > > the
>> > > > > > > > future
>> > > > > > > > > > > > in a follow up KIP, or right now in this KIP. We may
>> > > invest
>> > > > > > > > > > > > in automation, but we have to decide whether we
>> should
>> > do
>> > > > it
>> > > > > > > > > > > > now or later.
>> > > > > > > > > > > >
>> > > > > > > > > > > > For the inconvenience that you mentioned, do you
>> think
>> > > the
>> > > > > > > problem
>> > > > > > > > > that
>> > > > > > > > > > > you
>> > > > > > > > > > > > mentioned can be  overcome by asking for the cluster
>> > > > operator
>> > > > > > to
>> > > > > > > > run
>> > > > > > > > > a
>> > > > > > > > > > > > bootstrap script  when he/she knows that a specific
>> AK
>> > > > > release
>> > > > > > > has
>> > > > > > > > > been
>> > > > > > > > > > > > almost completely deployed in a cluster for the
>> first
>> > > time?
>> > > > > > Idea
>> > > > > > > is
>> > > > > > > > > > that
>> > > > > > > > > > > > the
>> > > > > > > > > > > > bootstrap script will know how to map a specific AK
>> > > release
>> > > > > to
>> > > > > > > > > > finalized
>> > > > > > > > > > > > feature versions, and run the `kafka-features.sh`
>> tool
>> > > > > > > > appropriately
>> > > > > > > > > > > > against
>> > > > > > > > > > > > the cluster.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Now, coming back to your automation
>> proposal/question.
>> > > > > > > > > > > > I do see the value of automated feature version
>> > > > finalization,
>> > > > > > > but I
>> > > > > > > > > > also
>> > > > > > > > > > > > see
>> > > > > > > > > > > > that this will open up several questions and some
>> > risks,
>> > > as
>> > > > > > > > explained
>> > > > > > > > > > > > below.
>> > > > > > > > > > > > The answers to these depend on the definition of the
>> > > > > automation
>> > > > > > > we
>> > > > > > > > > > choose
>> > > > > > > > > > > > to build, and how well does it fit into a kafka
>> > > deployment.
>> > > > > > > > > > > > Basically, it can be unsafe for the controller to
>> > > finalize
>> > > > > > > feature
>> > > > > > > > > > > version
>> > > > > > > > > > > > upgrades automatically, without learning about the
>> > intent
>> > > > of
>> > > > > > the
>> > > > > > > > > > cluster
>> > > > > > > > > > > > operator.
>> > > > > > > > > > > > 1. We would sometimes want to lock feature versions
>> > only
>> > > > when
>> > > > > > we
>> > > > > > > > have
>> > > > > > > > > > > > externally verified
>> > > > > > > > > > > > the stability of the broker binary.
>> > > > > > > > > > > > 2. Sometimes only the cluster operator knows that a
>> > > cluster
>> > > > > > > upgrade
>> > > > > > > > > is
>> > > > > > > > > > > > complete,
>> > > > > > > > > > > > and new brokers are highly unlikely to join the
>> > cluster.
>> > > > > > > > > > > > 3. Only the cluster operator knows that the intent
>> is
>> > to
>> > > > > deploy
>> > > > > > > the
>> > > > > > > > > > same
>> > > > > > > > > > > > version
>> > > > > > > > > > > > of the new broker release across the entire cluster
>> > (i.e.
>> > > > the
>> > > > > > > > latest
>> > > > > > > > > > > > downloaded version).
>> > > > > > > > > > > > 4. For downgrades, it appears the controller still
>> > needs
>> > > > some
>> > > > > > > > > external
>> > > > > > > > > > > > input
>> > > > > > > > > > > > (such as the proposed tool) to finalize a feature
>> > version
>> > > > > > > > downgrade.
>> > > > > > > > > > > >
>> > > > > > > > > > > > If we have automation, that automation can end up
>> > failing
>> > > > in
>> > > > > > some
>> > > > > > > > of
>> > > > > > > > > > the
>> > > > > > > > > > > > cases
>> > > > > > > > > > > > above. Then, we need a way to declare that the
>> cluster
>> > is
>> > > > > "not
>> > > > > > > > ready"
>> > > > > > > > > > if
>> > > > > > > > > > > > the
>> > > > > > > > > > > > controller cannot automatically finalize some basic
>> > > > required
>> > > > > > > > feature
>> > > > > > > > > > > > version
>> > > > > > > > > > > > upgrades across the cluster. We need to make the
>> > cluster
>> > > > > > operator
>> > > > > > > > > aware
>> > > > > > > > > > > in
>> > > > > > > > > > > > such a scenario (raise an alert or alike).
>> > > > > > > > > > > >
>> > > > > > > > > > > > > 112. DeleteFeaturesResponse: It seems the apiKey
>> > should
>> > > > be
>> > > > > 49
>> > > > > > > > > instead
>> > > > > > > > > > > of
>> > > > > > > > > > > > 48.
>> > > > > > > > > > > >
>> > > > > > > > > > > > (Kowshik): Done.
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > Kowshik
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Fri, Apr 3, 2020 at 11:24 AM Jun Rao <
>> > > jun@confluent.io>
>> > > > > > > wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Hi, Kowshik,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Thanks for the reply. A few more comments below.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 100.6 For every new request, the admin needs to
>> > control
>> > > > who
>> > > > > > is
>> > > > > > > > > > allowed
>> > > > > > > > > > > to
>> > > > > > > > > > > > > issue that request if security is enabled. So, we
>> > need
>> > > to
>> > > > > > > assign
>> > > > > > > > > the
>> > > > > > > > > > > new
>> > > > > > > > > > > > > request a ResourceType and possible AclOperations.
>> > See
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-455%3A+Create+an+Administrative+API+for+Replica+Reassignment
>> > > > > > > > > > > > > as
>> > > > > > > > > > > > > an example.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 105. If we change delete to disable, it's better
>> to
>> > do
>> > > > this
>> > > > > > > > > > > consistently
>> > > > > > > > > > > > in
>> > > > > > > > > > > > > request protocol and admin api as well.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 110. The minVersion/maxVersion for features use
>> > int64.
>> > > > > > > Currently,
>> > > > > > > > > our
>> > > > > > > > > > > > > release version schema is major.minor.bugfix (e.g.
>> > > > 2.5.0).
>> > > > > > It's
>> > > > > > > > > > > possible
>> > > > > > > > > > > > > for new features to be included in minor releases
>> > too.
>> > > > > Should
>> > > > > > > we
>> > > > > > > > > make
>> > > > > > > > > > > the
>> > > > > > > > > > > > > feature versioning match the release versioning?
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 111. "During regular operations, the data in the
>> ZK
>> > > node
>> > > > > can
>> > > > > > be
>> > > > > > > > > > mutated
>> > > > > > > > > > > > > only via a specific admin API served only by the
>> > > > > > controller." I
>> > > > > > > > am
>> > > > > > > > > > > > > wondering why can't the controller auto finalize a
>> > > > feature
>> > > > > > > > version
>> > > > > > > > > > > after
>> > > > > > > > > > > > > all brokers are upgraded? For new users who
>> download
>> > > the
>> > > > > > latest
>> > > > > > > > > > version
>> > > > > > > > > > > > to
>> > > > > > > > > > > > > build a new cluster, it's inconvenient for them to
>> > have
>> > > > to
>> > > > > > > > manually
>> > > > > > > > > > > > enable
>> > > > > > > > > > > > > each feature.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > 112. DeleteFeaturesResponse: It seems the apiKey
>> > should
>> > > > be
>> > > > > 49
>> > > > > > > > > instead
>> > > > > > > > > > > of
>> > > > > > > > > > > > > 48.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Jun
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > On Fri, Apr 3, 2020 at 1:27 AM Kowshik Prakasam <
>> > > > > > > > > > > kprakasam@confluent.io>
>> > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > Hey Jun,
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Thanks a lot for the great feedback! Please note
>> > that
>> > > > the
>> > > > > > > > design
>> > > > > > > > > > > > > > has changed a little bit on the KIP, and we now
>> > > > propagate
>> > > > > > the
>> > > > > > > > > > > finalized
>> > > > > > > > > > > > > > features metadata only via ZK watches (instead
>> of
>> > > > > > > > > > > UpdateMetadataRequest
>> > > > > > > > > > > > > > from the controller).
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Please find below my response to your
>> > > > questions/feedback,
>> > > > > > > with
>> > > > > > > > > the
>> > > > > > > > > > > > prefix
>> > > > > > > > > > > > > > "(Kowshik):".
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.
>> UpdateFeaturesRequest/UpdateFeaturesResponse
>> > > > > > > > > > > > > > > 100.1 Since this request waits for responses
>> from
>> > > > > > brokers,
>> > > > > > > > > should
>> > > > > > > > > > > we
>> > > > > > > > > > > > > add
>> > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > timeout in the request (like
>> createTopicRequest)?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Great point! Done. I have added a
>> > timeout
>> > > > > field.
>> > > > > > > > Note:
>> > > > > > > > > > we
>> > > > > > > > > > > no
>> > > > > > > > > > > > > > longer
>> > > > > > > > > > > > > > wait for responses from brokers, since the
>> design
>> > has
>> > > > > been
>> > > > > > > > > changed
>> > > > > > > > > > so
>> > > > > > > > > > > > > that
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > features information is propagated via ZK.
>> > > > Nevertheless,
>> > > > > it
>> > > > > > > is
>> > > > > > > > > > right
>> > > > > > > > > > > to
>> > > > > > > > > > > > > > have a timeout
>> > > > > > > > > > > > > > for the request.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.2 The response schema is a bit weird.
>> > > Typically,
>> > > > > the
>> > > > > > > > > response
>> > > > > > > > > > > > just
>> > > > > > > > > > > > > > > shows an error code and an error message,
>> instead
>> > > of
>> > > > > > > echoing
>> > > > > > > > > the
>> > > > > > > > > > > > > request.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Great point! Yeah, I have modified
>> it to
>> > > > just
>> > > > > > > return
>> > > > > > > > > an
>> > > > > > > > > > > > error
>> > > > > > > > > > > > > > code and a message.
>> > > > > > > > > > > > > > Previously it was not echoing the "request",
>> rather
>> > > it
>> > > > > was
>> > > > > > > > > > returning
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > latest set of
>> > > > > > > > > > > > > > cluster-wide finalized features (after applying
>> the
>> > > > > > updates).
>> > > > > > > > But
>> > > > > > > > > > you
>> > > > > > > > > > > > are
>> > > > > > > > > > > > > > right,
>> > > > > > > > > > > > > > the additional info is not required, so I have
>> > > removed
>> > > > it
>> > > > > > > from
>> > > > > > > > > the
>> > > > > > > > > > > > > response
>> > > > > > > > > > > > > > schema.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.3 Should we add a separate request to
>> > > > list/describe
>> > > > > > the
>> > > > > > > > > > > existing
>> > > > > > > > > > > > > > > features?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): This is already present in the KIP
>> via
>> > the
>> > > > > > > > > > > > 'DescribeFeatures'
>> > > > > > > > > > > > > > Admin API,
>> > > > > > > > > > > > > > which, underneath covers uses the
>> > ApiVersionsRequest
>> > > to
>> > > > > > > > > > list/describe
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > existing features. Please read the 'Tooling
>> > support'
>> > > > > > section.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.4 We are mixing ADD_OR_UPDATE and DELETE
>> in a
>> > > > > single
>> > > > > > > > > request.
>> > > > > > > > > > > For
>> > > > > > > > > > > > > > > DELETE, the version field doesn't make sense.
>> > So, I
>> > > > > guess
>> > > > > > > the
>> > > > > > > > > > > broker
>> > > > > > > > > > > > > just
>> > > > > > > > > > > > > > > ignores this? An alternative way is to have a
>> > > > separate
>> > > > > > > > > > > > > > DeleteFeaturesRequest
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Great point! I have modified the KIP
>> now
>> > > to
>> > > > > > have 2
>> > > > > > > > > > > separate
>> > > > > > > > > > > > > > controller APIs
>> > > > > > > > > > > > > > serving these different purposes:
>> > > > > > > > > > > > > > 1. updateFeatures
>> > > > > > > > > > > > > > 2. deleteFeatures
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.5 In UpdateFeaturesResponse, we have "The
>> > > > > > monotonically
>> > > > > > > > > > > > increasing
>> > > > > > > > > > > > > > > version of the metadata for finalized
>> features."
>> > I
>> > > am
>> > > > > > > > wondering
>> > > > > > > > > > why
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > ordering is important?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): In the latest KIP write-up, it is
>> called
>> > > > epoch
>> > > > > > > > > (instead
>> > > > > > > > > > of
>> > > > > > > > > > > > > > version), and
>> > > > > > > > > > > > > > it is just the ZK node version. Basically, this
>> is
>> > > the
>> > > > > > epoch
>> > > > > > > > for
>> > > > > > > > > > the
>> > > > > > > > > > > > > > cluster-wide
>> > > > > > > > > > > > > > finalized feature version metadata. This
>> metadata
>> > is
>> > > > > served
>> > > > > > > to
>> > > > > > > > > > > clients
>> > > > > > > > > > > > > via
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > ApiVersionsResponse (for reads). We propagate
>> > updates
>> > > > > from
>> > > > > > > the
>> > > > > > > > > > > > > '/features'
>> > > > > > > > > > > > > > ZK node
>> > > > > > > > > > > > > > to all brokers, via ZK watches setup by each
>> broker
>> > > on
>> > > > > the
>> > > > > > > > > > > '/features'
>> > > > > > > > > > > > > > node.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Now here is why the ordering is important:
>> > > > > > > > > > > > > > ZK watches don't propagate at the same time. As
>> a
>> > > > result,
>> > > > > > the
>> > > > > > > > > > > > > > ApiVersionsResponse
>> > > > > > > > > > > > > > is eventually consistent across brokers. This
>> can
>> > > > > introduce
>> > > > > > > > cases
>> > > > > > > > > > > > > > where clients see an older lower epoch of the
>> > > features
>> > > > > > > > metadata,
>> > > > > > > > > > > after
>> > > > > > > > > > > > a
>> > > > > > > > > > > > > > more recent
>> > > > > > > > > > > > > > higher epoch was returned at a previous point in
>> > > time.
>> > > > We
>> > > > > > > > expect
>> > > > > > > > > > > > clients
>> > > > > > > > > > > > > > to always employ the rule that the latest
>> received
>> > > > higher
>> > > > > > > epoch
>> > > > > > > > > of
>> > > > > > > > > > > > > metadata
>> > > > > > > > > > > > > > always trumps an older smaller epoch. Those
>> clients
>> > > > that
>> > > > > > are
>> > > > > > > > > > external
>> > > > > > > > > > > > to
>> > > > > > > > > > > > > > Kafka should strongly consider discovering the
>> > latest
>> > > > > > > metadata
>> > > > > > > > > once
>> > > > > > > > > > > > > during
>> > > > > > > > > > > > > > startup from the brokers, and if required
>> refresh
>> > the
>> > > > > > > metadata
>> > > > > > > > > > > > > periodically
>> > > > > > > > > > > > > > (to get the latest metadata).
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.6 Could you specify the required ACL for
>> this
>> > > new
>> > > > > > > > request?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): What is ACL, and how could I find out
>> > > which
>> > > > > one
>> > > > > > to
>> > > > > > > > > > > specify?
>> > > > > > > > > > > > > > Please could you provide me some pointers? I'll
>> be
>> > > glad
>> > > > > to
>> > > > > > > > update
>> > > > > > > > > > the
>> > > > > > > > > > > > > > KIP once I know the next steps.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 101. For the broker registration ZK node,
>> should
>> > we
>> > > > > bump
>> > > > > > up
>> > > > > > > > the
>> > > > > > > > > > > > version
>> > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > the json?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Great point! Done. I've increased the
>> > > > version
>> > > > > in
>> > > > > > > the
>> > > > > > > > > > > broker
>> > > > > > > > > > > > > json
>> > > > > > > > > > > > > > by 1.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 102. For the /features ZK node, not sure if we
>> > need
>> > > > the
>> > > > > > > epoch
>> > > > > > > > > > > field.
>> > > > > > > > > > > > > Each
>> > > > > > > > > > > > > > > ZK node has an internal version field that is
>> > > > > incremented
>> > > > > > > on
>> > > > > > > > > > every
>> > > > > > > > > > > > > > update.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Great point! Done. I'm using the ZK
>> node
>> > > > > version
>> > > > > > > > now,
>> > > > > > > > > > > > instead
>> > > > > > > > > > > > > of
>> > > > > > > > > > > > > > explicitly
>> > > > > > > > > > > > > > incremented epoch.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 103. "Enabling the actual semantics of a
>> feature
>> > > > > version
>> > > > > > > > > > > cluster-wide
>> > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > left to the discretion of the logic
>> implementing
>> > > the
>> > > > > > > feature
>> > > > > > > > > (ex:
>> > > > > > > > > > > can
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > done via dynamic broker config)." Does that
>> mean
>> > > the
>> > > > > > broker
>> > > > > > > > > > > > > registration
>> > > > > > > > > > > > > > ZK
>> > > > > > > > > > > > > > > node will be updated dynamically when this
>> > happens?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): Not really. The text was just
>> conveying
>> > > > that a
>> > > > > > > > broker
>> > > > > > > > > > > could
>> > > > > > > > > > > > > > "know" of
>> > > > > > > > > > > > > > a new feature version, but it does not mean the
>> > > broker
>> > > > > > should
>> > > > > > > > > have
>> > > > > > > > > > > also
>> > > > > > > > > > > > > > activated the effects of the feature version.
>> > Knowing
>> > > > vs
>> > > > > > > > > activation
>> > > > > > > > > > > > are 2
>> > > > > > > > > > > > > > separate things,
>> > > > > > > > > > > > > > and the latter can be achieved by dynamic
>> config. I
>> > > > have
>> > > > > > > > reworded
>> > > > > > > > > > the
>> > > > > > > > > > > > > text
>> > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > make this clear to the reader.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 104. UpdateMetadataRequest
>> > > > > > > > > > > > > > > 104.1 It would be useful to describe when the
>> > > feature
>> > > > > > > > metadata
>> > > > > > > > > is
>> > > > > > > > > > > > > > included
>> > > > > > > > > > > > > > > in the request. My understanding is that it's
>> > only
>> > > > > > included
>> > > > > > > > if
>> > > > > > > > > > (1)
>> > > > > > > > > > > > > there
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > a change to the finalized feature; (2) broker
>> > > > restart;
>> > > > > > (3)
>> > > > > > > > > > > controller
>> > > > > > > > > > > > > > > failover.
>> > > > > > > > > > > > > > > 104.2 The new fields have the following
>> versions.
>> > > Why
>> > > > > are
>> > > > > > > the
>> > > > > > > > > > > > versions
>> > > > > > > > > > > > > 3+
>> > > > > > > > > > > > > > > when the top version is bumped to 6?
>> > > > > > > > > > > > > > >       "fields":  [
>> > > > > > > > > > > > > > >         {"name": "Name", "type":  "string",
>> > > > "versions":
>> > > > > > > > "3+",
>> > > > > > > > > > > > > > >           "about": "The name of the
>> feature."},
>> > > > > > > > > > > > > > >         {"name":  "Version", "type":  "int64",
>> > > > > > "versions":
>> > > > > > > > > "3+",
>> > > > > > > > > > > > > > >           "about": "The finalized version for
>> the
>> > > > > > > feature."}
>> > > > > > > > > > > > > > >       ]
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): With the new improved design, we have
>> > > > > completely
>> > > > > > > > > > > eliminated
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > need to
>> > > > > > > > > > > > > > use UpdateMetadataRequest. This is because we
>> now
>> > > rely
>> > > > on
>> > > > > > ZK
>> > > > > > > to
>> > > > > > > > > > > deliver
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > notifications for changes to the '/features' ZK
>> > node.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 105. kafka-features.sh: Instead of using
>> > > > update/delete,
>> > > > > > > > perhaps
>> > > > > > > > > > > it's
>> > > > > > > > > > > > > > better
>> > > > > > > > > > > > > > > to use enable/disable?
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > (Kowshik): For delete, yes, I have changed it so
>> > that
>> > > > we
>> > > > > > > > instead
>> > > > > > > > > > call
>> > > > > > > > > > > > it
>> > > > > > > > > > > > > > 'disable'.
>> > > > > > > > > > > > > > However for 'update', it can now also refer to
>> > either
>> > > > an
>> > > > > > > > upgrade
>> > > > > > > > > > or a
>> > > > > > > > > > > > > > forced downgrade.
>> > > > > > > > > > > > > > Therefore, I have left it the way it is, just
>> > calling
>> > > > it
>> > > > > as
>> > > > > > > > just
>> > > > > > > > > > > > > 'update'.
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > Kowshik
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > On Tue, Mar 31, 2020 at 6:51 PM Jun Rao <
>> > > > > jun@confluent.io>
>> > > > > > > > > wrote:
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Hi, Kowshik,
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Thanks for the KIP. Looks good overall. A few
>> > > > comments
>> > > > > > > below.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 100.
>> UpdateFeaturesRequest/UpdateFeaturesResponse
>> > > > > > > > > > > > > > > 100.1 Since this request waits for responses
>> from
>> > > > > > brokers,
>> > > > > > > > > should
>> > > > > > > > > > > we
>> > > > > > > > > > > > > add
>> > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > timeout in the request (like
>> createTopicRequest)?
>> > > > > > > > > > > > > > > 100.2 The response schema is a bit weird.
>> > > Typically,
>> > > > > the
>> > > > > > > > > response
>> > > > > > > > > > > > just
>> > > > > > > > > > > > > > > shows an error code and an error message,
>> instead
>> > > of
>> > > > > > > echoing
>> > > > > > > > > the
>> > > > > > > > > > > > > request.
>> > > > > > > > > > > > > > > 100.3 Should we add a separate request to
>> > > > list/describe
>> > > > > > the
>> > > > > > > > > > > existing
>> > > > > > > > > > > > > > > features?
>> > > > > > > > > > > > > > > 100.4 We are mixing ADD_OR_UPDATE and DELETE
>> in a
>> > > > > single
>> > > > > > > > > request.
>> > > > > > > > > > > For
>> > > > > > > > > > > > > > > DELETE, the version field doesn't make sense.
>> > So, I
>> > > > > guess
>> > > > > > > the
>> > > > > > > > > > > broker
>> > > > > > > > > > > > > just
>> > > > > > > > > > > > > > > ignores this? An alternative way is to have a
>> > > > separate
>> > > > > > > > > > > > > > > DeleteFeaturesRequest
>> > > > > > > > > > > > > > > 100.5 In UpdateFeaturesResponse, we have "The
>> > > > > > monotonically
>> > > > > > > > > > > > increasing
>> > > > > > > > > > > > > > > version of the metadata for finalized
>> features."
>> > I
>> > > am
>> > > > > > > > wondering
>> > > > > > > > > > why
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > ordering is important?
>> > > > > > > > > > > > > > > 100.6 Could you specify the required ACL for
>> this
>> > > new
>> > > > > > > > request?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 101. For the broker registration ZK node,
>> should
>> > we
>> > > > > bump
>> > > > > > up
>> > > > > > > > the
>> > > > > > > > > > > > version
>> > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > the json?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 102. For the /features ZK node, not sure if we
>> > need
>> > > > the
>> > > > > > > epoch
>> > > > > > > > > > > field.
>> > > > > > > > > > > > > Each
>> > > > > > > > > > > > > > > ZK node has an internal version field that is
>> > > > > incremented
>> > > > > > > on
>> > > > > > > > > > every
>> > > > > > > > > > > > > > update.
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 103. "Enabling the actual semantics of a
>> feature
>> > > > > version
>> > > > > > > > > > > cluster-wide
>> > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > left to the discretion of the logic
>> implementing
>> > > the
>> > > > > > > feature
>> > > > > > > > > (ex:
>> > > > > > > > > > > can
>> > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > done via dynamic broker config)." Does that
>> mean
>> > > the
>> > > > > > broker
>> > > > > > > > > > > > > registration
>> > > > > > > > > > > > > > ZK
>> > > > > > > > > > > > > > > node will be updated dynamically when this
>> > happens?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 104. UpdateMetadataRequest
>> > > > > > > > > > > > > > > 104.1 It would be useful to describe when the
>> > > feature
>> > > > > > > > metadata
>> > > > > > > > > is
>> > > > > > > > > > > > > > included
>> > > > > > > > > > > > > > > in the request. My understanding is that it's
>> > only
>> > > > > > included
>> > > > > > > > if
>> > > > > > > > > > (1)
>> > > > > > > > > > > > > there
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > a change to the finalized feature; (2) broker
>> > > > restart;
>> > > > > > (3)
>> > > > > > > > > > > controller
>> > > > > > > > > > > > > > > failover.
>> > > > > > > > > > > > > > > 104.2 The new fields have the following
>> versions.
>> > > Why
>> > > > > are
>> > > > > > > the
>> > > > > > > > > > > > versions
>> > > > > > > > > > > > > 3+
>> > > > > > > > > > > > > > > when the top version is bumped to 6?
>> > > > > > > > > > > > > > >       "fields":  [
>> > > > > > > > > > > > > > >         {"name": "Name", "type":  "string",
>> > > > "versions":
>> > > > > > > > "3+",
>> > > > > > > > > > > > > > >           "about": "The name of the
>> feature."},
>> > > > > > > > > > > > > > >         {"name":  "Version", "type":  "int64",
>> > > > > > "versions":
>> > > > > > > > > "3+",
>> > > > > > > > > > > > > > >           "about": "The finalized version for
>> the
>> > > > > > > feature."}
>> > > > > > > > > > > > > > >       ]
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > 105. kafka-features.sh: Instead of using
>> > > > update/delete,
>> > > > > > > > perhaps
>> > > > > > > > > > > it's
>> > > > > > > > > > > > > > better
>> > > > > > > > > > > > > > > to use enable/disable?
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > Jun
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > On Tue, Mar 31, 2020 at 5:29 PM Kowshik
>> Prakasam
>> > <
>> > > > > > > > > > > > > kprakasam@confluent.io
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Hey Boyang,
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Thanks for the great feedback! I have
>> updated
>> > the
>> > > > KIP
>> > > > > > > based
>> > > > > > > > > on
>> > > > > > > > > > > your
>> > > > > > > > > > > > > > > > feedback.
>> > > > > > > > > > > > > > > > Please find my response below for your
>> > comments,
>> > > > look
>> > > > > > for
>> > > > > > > > > > > sentences
>> > > > > > > > > > > > > > > > starting
>> > > > > > > > > > > > > > > > with "(Kowshik)" below.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 1. "When is it safe for the brokers to
>> begin
>> > > > > handling
>> > > > > > > EOS
>> > > > > > > > > > > > traffic"
>> > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > converted as "When is it safe for the
>> brokers
>> > > to
>> > > > > > start
>> > > > > > > > > > serving
>> > > > > > > > > > > > new
>> > > > > > > > > > > > > > > > > Exactly-Once(EOS) semantics" since EOS is
>> not
>> > > > > > explained
>> > > > > > > > > > earlier
>> > > > > > > > > > > > in
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > context.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! Done.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 2. In the *Explanation *section, the
>> metadata
>> > > > > version
>> > > > > > > > > number
>> > > > > > > > > > > part
>> > > > > > > > > > > > > > > seems a
>> > > > > > > > > > > > > > > > > bit blurred. Could you point a reference
>> to
>> > > later
>> > > > > > > section
>> > > > > > > > > > that
>> > > > > > > > > > > we
>> > > > > > > > > > > > > > going
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > store it in Zookeeper and update it every
>> > time
>> > > > when
>> > > > > > > there
>> > > > > > > > > is
>> > > > > > > > > > a
>> > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > change?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! Done. I've added a
>> > > > reference
>> > > > > in
>> > > > > > > the
>> > > > > > > > > > KIP.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 3. For the feature downgrade, although
>> it's a
>> > > > > > Non-goal
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > > KIP,
>> > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > features such as group coordinator
>> semantics,
>> > > > there
>> > > > > > is
>> > > > > > > no
>> > > > > > > > > > legal
>> > > > > > > > > > > > > > > scenario
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > perform a downgrade at all. So having
>> > downgrade
>> > > > > door
>> > > > > > > open
>> > > > > > > > > is
>> > > > > > > > > > > > pretty
>> > > > > > > > > > > > > > > > > error-prone as human faults happen all the
>> > > time.
>> > > > > I'm
>> > > > > > > > > assuming
>> > > > > > > > > > > as
>> > > > > > > > > > > > > new
>> > > > > > > > > > > > > > > > > features are implemented, it's not very
>> hard
>> > to
>> > > > > add a
>> > > > > > > > flag
>> > > > > > > > > > > during
>> > > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > creation to indicate whether this feature
>> is
>> > > > > > > > > "downgradable".
>> > > > > > > > > > > > Could
>> > > > > > > > > > > > > > you
>> > > > > > > > > > > > > > > > > explain a bit more on the extra
>> engineering
>> > > > effort
>> > > > > > for
>> > > > > > > > > > shipping
>> > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > KIP
>> > > > > > > > > > > > > > > > > with downgrade protection in place?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! I'd agree and
>> disagree
>> > > > here.
>> > > > > > > While
>> > > > > > > > I
>> > > > > > > > > > > agree
>> > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > accidental
>> > > > > > > > > > > > > > > > downgrades can cause problems, I also think
>> > > > sometimes
>> > > > > > > > > > downgrades
>> > > > > > > > > > > > > should
>> > > > > > > > > > > > > > > > be allowed for emergency reasons (not all
>> > > > downgrades
>> > > > > > > cause
>> > > > > > > > > > > issues).
>> > > > > > > > > > > > > > > > It is just subjective to the feature being
>> > > > > downgraded.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > To be more strict about feature version
>> > > > downgrades, I
>> > > > > > > have
>> > > > > > > > > > > modified
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > KIP
>> > > > > > > > > > > > > > > > proposing that we mandate a
>> `--force-downgrade`
>> > > > flag
>> > > > > be
>> > > > > > > > used
>> > > > > > > > > in
>> > > > > > > > > > > the
>> > > > > > > > > > > > > > > > UPDATE_FEATURES api
>> > > > > > > > > > > > > > > > and the tooling, whenever the human is
>> > > downgrading
>> > > > a
>> > > > > > > > > finalized
>> > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > version.
>> > > > > > > > > > > > > > > > Hopefully this should cover the requirement,
>> > > until
>> > > > we
>> > > > > > > find
>> > > > > > > > > the
>> > > > > > > > > > > need
>> > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > advanced downgrade support.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 4. "Each broker’s supported dictionary of
>> > > feature
>> > > > > > > > versions
>> > > > > > > > > > will
>> > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > defined
>> > > > > > > > > > > > > > > > > in the broker code." So this means in
>> order
>> > to
>> > > > > > > restrict a
>> > > > > > > > > > > certain
>> > > > > > > > > > > > > > > > feature,
>> > > > > > > > > > > > > > > > > we need to start the broker first and then
>> > > send a
>> > > > > > > feature
>> > > > > > > > > > > gating
>> > > > > > > > > > > > > > > request
>> > > > > > > > > > > > > > > > > immediately, which introduces a time gap
>> and
>> > > the
>> > > > > > > > > > > > intended-to-close
>> > > > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > could actually serve request during this
>> > phase.
>> > > > Do
>> > > > > > you
>> > > > > > > > > think
>> > > > > > > > > > we
>> > > > > > > > > > > > > > should
>> > > > > > > > > > > > > > > > also
>> > > > > > > > > > > > > > > > > support configurations as well so that
>> admin
>> > > user
>> > > > > > could
>> > > > > > > > > > freely
>> > > > > > > > > > > > roll
>> > > > > > > > > > > > > > up
>> > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > cluster with all nodes complying the same
>> > > feature
>> > > > > > > gating,
>> > > > > > > > > > > without
>> > > > > > > > > > > > > > > > worrying
>> > > > > > > > > > > > > > > > > about the turnaround time to propagate the
>> > > > message
>> > > > > > only
>> > > > > > > > > after
>> > > > > > > > > > > the
>> > > > > > > > > > > > > > > cluster
>> > > > > > > > > > > > > > > > > starts up?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): This is a great point/question.
>> One
>> > of
>> > > > the
>> > > > > > > > > > > expectations
>> > > > > > > > > > > > > out
>> > > > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > this KIP, which is
>> > > > > > > > > > > > > > > > already followed in the broker, is the
>> > following.
>> > > > > > > > > > > > > > > >  - Imagine at time T1 the broker starts up
>> and
>> > > > > > registers
>> > > > > > > > it’s
>> > > > > > > > > > > > > presence
>> > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > ZK,
>> > > > > > > > > > > > > > > >    along with advertising it’s supported
>> > > features.
>> > > > > > > > > > > > > > > >  - Imagine at a future time T2 the broker
>> > > receives
>> > > > > the
>> > > > > > > > > > > > > > > > UpdateMetadataRequest
>> > > > > > > > > > > > > > > >    from the controller, which contains the
>> > latest
>> > > > > > > finalized
>> > > > > > > > > > > > features
>> > > > > > > > > > > > > as
>> > > > > > > > > > > > > > > > seen by
>> > > > > > > > > > > > > > > >    the controller. The broker validates this
>> > data
>> > > > > > against
>> > > > > > > > > it’s
>> > > > > > > > > > > > > > supported
>> > > > > > > > > > > > > > > > features to
>> > > > > > > > > > > > > > > >    make sure there is no mismatch (it will
>> > > shutdown
>> > > > > if
>> > > > > > > > there
>> > > > > > > > > is
>> > > > > > > > > > > an
>> > > > > > > > > > > > > > > > incompatibility).
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > It is expected that during the time between
>> > the 2
>> > > > > > events
>> > > > > > > T1
>> > > > > > > > > and
>> > > > > > > > > > > T2,
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > broker is
>> > > > > > > > > > > > > > > > almost a silent entity in the cluster. It
>> does
>> > > not
>> > > > > add
>> > > > > > > any
>> > > > > > > > > > value
>> > > > > > > > > > > to
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > cluster, or carry
>> > > > > > > > > > > > > > > > out any important broker activities. By
>> > > > “important”,
>> > > > > I
>> > > > > > > mean
>> > > > > > > > > it
>> > > > > > > > > > is
>> > > > > > > > > > > > not
>> > > > > > > > > > > > > > > doing
>> > > > > > > > > > > > > > > > mutations
>> > > > > > > > > > > > > > > > on it’s persistence, not mutating critical
>> > > > in-memory
>> > > > > > > state,
>> > > > > > > > > > won’t
>> > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > serving
>> > > > > > > > > > > > > > > > produce/fetch requests. Note it doesn’t even
>> > know
>> > > > > it’s
>> > > > > > > > > assigned
>> > > > > > > > > > > > > > > partitions
>> > > > > > > > > > > > > > > > until
>> > > > > > > > > > > > > > > > it receives UpdateMetadataRequest from
>> > > controller.
>> > > > > > > Anything
>> > > > > > > > > the
>> > > > > > > > > > > > > broker
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > doing up
>> > > > > > > > > > > > > > > > until this point is not damaging/useful.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > I’ve clarified the above in the KIP, see
>> this
>> > new
>> > > > > > > section:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features#KIP-584:Versioningschemeforfeatures-Incompatiblebrokerlifetime
>> > > > > > > > > > > > > > > > .
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 5. "adding a new Feature, updating or
>> > deleting
>> > > an
>> > > > > > > > existing
>> > > > > > > > > > > > > Feature",
>> > > > > > > > > > > > > > > may
>> > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > I misunderstood something, I thought the
>> > > features
>> > > > > are
>> > > > > > > > > defined
>> > > > > > > > > > > in
>> > > > > > > > > > > > > > broker
>> > > > > > > > > > > > > > > > > code, so admin could not really create a
>> new
>> > > > > feature?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! You understood this
>> > > right.
>> > > > > Here
>> > > > > > > > > adding
>> > > > > > > > > > a
>> > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > means we are
>> > > > > > > > > > > > > > > > adding a cluster-wide finalized *max*
>> version
>> > > for a
>> > > > > > > feature
>> > > > > > > > > > that
>> > > > > > > > > > > > was
>> > > > > > > > > > > > > > > > previously never finalized.
>> > > > > > > > > > > > > > > > I have clarified this in the KIP now.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 6. I think we need a separate error code
>> like
>> > > > > > > > > > > > > > > FEATURE_UPDATE_IN_PROGRESS
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > reject a concurrent feature update
>> request.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! I have modified the
>> KIP
>> > > > > adding
>> > > > > > > the
>> > > > > > > > > > above
>> > > > > > > > > > > > (see
>> > > > > > > > > > > > > > > > 'Tooling support -> Admin API changes').
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 7. I think we haven't discussed the
>> > alternative
>> > > > > > > solution
>> > > > > > > > to
>> > > > > > > > > > > pass
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > feature information through Zookeeper. Is
>> > that
>> > > > > > > mentioned
>> > > > > > > > in
>> > > > > > > > > > the
>> > > > > > > > > > > > KIP
>> > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > justify why using UpdateMetadata is more
>> > > > favorable?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Nice question! The broker reads
>> > > > finalized
>> > > > > > > > feature
>> > > > > > > > > > info
>> > > > > > > > > > > > > > stored
>> > > > > > > > > > > > > > > in
>> > > > > > > > > > > > > > > > ZK,
>> > > > > > > > > > > > > > > > only during startup when it does a
>> validation.
>> > > When
>> > > > > > > serving
>> > > > > > > > > > > > > > > > `ApiVersionsRequest`, the
>> > > > > > > > > > > > > > > > broker does not read this info from ZK
>> > directly.
>> > > > I'd
>> > > > > > > > imagine
>> > > > > > > > > > the
>> > > > > > > > > > > > risk
>> > > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > > that it can increase
>> > > > > > > > > > > > > > > > the ZK read QPS which can be a bottleneck
>> for
>> > the
>> > > > > > system.
>> > > > > > > > > > Today,
>> > > > > > > > > > > in
>> > > > > > > > > > > > > > Kafka
>> > > > > > > > > > > > > > > > we use the
>> > > > > > > > > > > > > > > > controller to fan out ZK updates to brokers
>> and
>> > > we
>> > > > > want
>> > > > > > > to
>> > > > > > > > > > stick
>> > > > > > > > > > > to
>> > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > pattern to avoid
>> > > > > > > > > > > > > > > > the ZK read bottleneck when serving
>> > > > > > `ApiVersionsRequest`.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 8. I was under the impression that user
>> could
>> > > > > > > configure a
>> > > > > > > > > > range
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > supported versions, what's the trade-off
>> for
>> > > > > allowing
>> > > > > > > > > single
>> > > > > > > > > > > > > > finalized
>> > > > > > > > > > > > > > > > > version only?
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great question! The finalized
>> > version
>> > > > of a
>> > > > > > > > feature
>> > > > > > > > > > > > > basically
>> > > > > > > > > > > > > > > > refers to
>> > > > > > > > > > > > > > > > the cluster-wide finalized feature "maximum"
>> > > > version.
>> > > > > > For
>> > > > > > > > > > > example,
>> > > > > > > > > > > > if
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > 'group_coordinator' feature
>> > > > > > > > > > > > > > > > has the finalized version set to 10, then,
>> it
>> > > means
>> > > > > > that
>> > > > > > > > > > > > cluster-wide
>> > > > > > > > > > > > > > all
>> > > > > > > > > > > > > > > > versions upto v10 are
>> > > > > > > > > > > > > > > > supported for this feature. However, note
>> that
>> > if
>> > > > > some
>> > > > > > > > > version
>> > > > > > > > > > > (ex:
>> > > > > > > > > > > > > v0)
>> > > > > > > > > > > > > > > > gets deprecated
>> > > > > > > > > > > > > > > > for this feature, then we don’t convey that
>> > using
>> > > > > this
>> > > > > > > > scheme
>> > > > > > > > > > > (also
>> > > > > > > > > > > > > > > > supporting deprecation is a non-goal).
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): I’ve now modified the KIP at all
>> > > points,
>> > > > > > > > refering
>> > > > > > > > > to
>> > > > > > > > > > > > > > finalized
>> > > > > > > > > > > > > > > > feature "maximum" versions.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 9. One minor syntax fix: Note that here
>> the
>> > > > > "client"
>> > > > > > > here
>> > > > > > > > > may
>> > > > > > > > > > > be
>> > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > producer
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > (Kowshik): Great point! Done.
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > Kowshik
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > On Tue, Mar 31, 2020 at 1:17 PM Boyang Chen
>> <
>> > > > > > > > > > > > > > reluctanthero104@gmail.com>
>> > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Hey Kowshik,
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > thanks for the revised KIP. Got a couple
>> of
>> > > > > > questions:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 1. "When is it safe for the brokers to
>> begin
>> > > > > handling
>> > > > > > > EOS
>> > > > > > > > > > > > traffic"
>> > > > > > > > > > > > > > > could
>> > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > converted as "When is it safe for the
>> brokers
>> > > to
>> > > > > > start
>> > > > > > > > > > serving
>> > > > > > > > > > > > new
>> > > > > > > > > > > > > > > > > Exactly-Once(EOS) semantics" since EOS is
>> not
>> > > > > > explained
>> > > > > > > > > > earlier
>> > > > > > > > > > > > in
>> > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > context.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 2. In the *Explanation *section, the
>> metadata
>> > > > > version
>> > > > > > > > > number
>> > > > > > > > > > > part
>> > > > > > > > > > > > > > > seems a
>> > > > > > > > > > > > > > > > > bit blurred. Could you point a reference
>> to
>> > > later
>> > > > > > > section
>> > > > > > > > > > that
>> > > > > > > > > > > we
>> > > > > > > > > > > > > > going
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > store it in Zookeeper and update it every
>> > time
>> > > > when
>> > > > > > > there
>> > > > > > > > > is
>> > > > > > > > > > a
>> > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > change?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 3. For the feature downgrade, although
>> it's a
>> > > > > > Non-goal
>> > > > > > > of
>> > > > > > > > > the
>> > > > > > > > > > > > KIP,
>> > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > features such as group coordinator
>> semantics,
>> > > > there
>> > > > > > is
>> > > > > > > no
>> > > > > > > > > > legal
>> > > > > > > > > > > > > > > scenario
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > perform a downgrade at all. So having
>> > downgrade
>> > > > > door
>> > > > > > > open
>> > > > > > > > > is
>> > > > > > > > > > > > pretty
>> > > > > > > > > > > > > > > > > error-prone as human faults happen all the
>> > > time.
>> > > > > I'm
>> > > > > > > > > assuming
>> > > > > > > > > > > as
>> > > > > > > > > > > > > new
>> > > > > > > > > > > > > > > > > features are implemented, it's not very
>> hard
>> > to
>> > > > > add a
>> > > > > > > > flag
>> > > > > > > > > > > during
>> > > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > creation to indicate whether this feature
>> is
>> > > > > > > > > "downgradable".
>> > > > > > > > > > > > Could
>> > > > > > > > > > > > > > you
>> > > > > > > > > > > > > > > > > explain a bit more on the extra
>> engineering
>> > > > effort
>> > > > > > for
>> > > > > > > > > > shipping
>> > > > > > > > > > > > > this
>> > > > > > > > > > > > > > > KIP
>> > > > > > > > > > > > > > > > > with downgrade protection in place?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 4. "Each broker’s supported dictionary of
>> > > feature
>> > > > > > > > versions
>> > > > > > > > > > will
>> > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > defined
>> > > > > > > > > > > > > > > > > in the broker code." So this means in
>> order
>> > to
>> > > > > > > restrict a
>> > > > > > > > > > > certain
>> > > > > > > > > > > > > > > > feature,
>> > > > > > > > > > > > > > > > > we need to start the broker first and then
>> > > send a
>> > > > > > > feature
>> > > > > > > > > > > gating
>> > > > > > > > > > > > > > > request
>> > > > > > > > > > > > > > > > > immediately, which introduces a time gap
>> and
>> > > the
>> > > > > > > > > > > > intended-to-close
>> > > > > > > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > > could actually serve request during this
>> > phase.
>> > > > Do
>> > > > > > you
>> > > > > > > > > think
>> > > > > > > > > > we
>> > > > > > > > > > > > > > should
>> > > > > > > > > > > > > > > > also
>> > > > > > > > > > > > > > > > > support configurations as well so that
>> admin
>> > > user
>> > > > > > could
>> > > > > > > > > > freely
>> > > > > > > > > > > > roll
>> > > > > > > > > > > > > > up
>> > > > > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > > cluster with all nodes complying the same
>> > > feature
>> > > > > > > gating,
>> > > > > > > > > > > without
>> > > > > > > > > > > > > > > > worrying
>> > > > > > > > > > > > > > > > > about the turnaround time to propagate the
>> > > > message
>> > > > > > only
>> > > > > > > > > after
>> > > > > > > > > > > the
>> > > > > > > > > > > > > > > cluster
>> > > > > > > > > > > > > > > > > starts up?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 5. "adding a new Feature, updating or
>> > deleting
>> > > an
>> > > > > > > > existing
>> > > > > > > > > > > > > Feature",
>> > > > > > > > > > > > > > > may
>> > > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > I misunderstood something, I thought the
>> > > features
>> > > > > are
>> > > > > > > > > defined
>> > > > > > > > > > > in
>> > > > > > > > > > > > > > broker
>> > > > > > > > > > > > > > > > > code, so admin could not really create a
>> new
>> > > > > feature?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 6. I think we need a separate error code
>> like
>> > > > > > > > > > > > > > > FEATURE_UPDATE_IN_PROGRESS
>> > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > reject a concurrent feature update
>> request.
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 7. I think we haven't discussed the
>> > alternative
>> > > > > > > solution
>> > > > > > > > to
>> > > > > > > > > > > pass
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > feature information through Zookeeper. Is
>> > that
>> > > > > > > mentioned
>> > > > > > > > in
>> > > > > > > > > > the
>> > > > > > > > > > > > KIP
>> > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > justify why using UpdateMetadata is more
>> > > > favorable?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 8. I was under the impression that user
>> could
>> > > > > > > configure a
>> > > > > > > > > > range
>> > > > > > > > > > > > of
>> > > > > > > > > > > > > > > > > supported versions, what's the trade-off
>> for
>> > > > > allowing
>> > > > > > > > > single
>> > > > > > > > > > > > > > finalized
>> > > > > > > > > > > > > > > > > version only?
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > 9. One minor syntax fix: Note that here
>> the
>> > > > > "client"
>> > > > > > > here
>> > > > > > > > > may
>> > > > > > > > > > > be
>> > > > > > > > > > > > a
>> > > > > > > > > > > > > > > > producer
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > Boyang
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > On Mon, Mar 30, 2020 at 4:53 PM Colin
>> McCabe
>> > <
>> > > > > > > > > > > cmccabe@apache.org
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > On Thu, Mar 26, 2020, at 19:24, Kowshik
>> > > > Prakasam
>> > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > Hi Colin,
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Thanks for the feedback! I've changed
>> the
>> > > KIP
>> > > > > to
>> > > > > > > > > address
>> > > > > > > > > > > your
>> > > > > > > > > > > > > > > > > > > suggestions.
>> > > > > > > > > > > > > > > > > > > Please find below my explanation. Here
>> > is a
>> > > > > link
>> > > > > > to
>> > > > > > > > KIP
>> > > > > > > > > > > 584:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
>> > > > > > > > > > > > > > > > > > > .
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > 1. '__data_version__' is the version
>> of
>> > the
>> > > > > > > finalized
>> > > > > > > > > > > feature
>> > > > > > > > > > > > > > > > metadata
>> > > > > > > > > > > > > > > > > > > (i.e. actual ZK node contents), while
>> the
>> > > > > > > > > > > > '__schema_version__'
>> > > > > > > > > > > > > is
>> > > > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > > version of the schema of the data
>> > persisted
>> > > > in
>> > > > > > ZK.
>> > > > > > > > > These
>> > > > > > > > > > > > serve
>> > > > > > > > > > > > > > > > > different
>> > > > > > > > > > > > > > > > > > > purposes. '__data_version__' is is
>> useful
>> > > > > mainly
>> > > > > > to
>> > > > > > > > > > clients
>> > > > > > > > > > > > > > during
>> > > > > > > > > > > > > > > > > reads,
>> > > > > > > > > > > > > > > > > > > to differentiate between the 2
>> versions
>> > of
>> > > > > > > eventually
>> > > > > > > > > > > > > consistent
>> > > > > > > > > > > > > > > > > > 'finalized
>> > > > > > > > > > > > > > > > > > > features' metadata (i.e. larger
>> metadata
>> > > > > version
>> > > > > > is
>> > > > > > > > > more
>> > > > > > > > > > > > > recent).
>> > > > > > > > > > > > > > > > > > > '__schema_version__' provides an
>> > additional
>> > > > > > degree
>> > > > > > > of
>> > > > > > > > > > > > > > flexibility,
>> > > > > > > > > > > > > > > > > where
>> > > > > > > > > > > > > > > > > > if
>> > > > > > > > > > > > > > > > > > > we decide to change the schema for
>> > > > '/features'
>> > > > > > node
>> > > > > > > > in
>> > > > > > > > > ZK
>> > > > > > > > > > > (in
>> > > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > > > future),
>> > > > > > > > > > > > > > > > > > > then we can manage broker roll outs
>> > > suitably
>> > > > > > (i.e.
>> > > > > > > > > > > > > > > > > > > serialization/deserialization of the
>> ZK
>> > > data
>> > > > > can
>> > > > > > be
>> > > > > > > > > > handled
>> > > > > > > > > > > > > > > safely).
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Hi Kowshik,
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > If you're talking about a number that
>> lets
>> > > you
>> > > > > know
>> > > > > > > if
>> > > > > > > > > data
>> > > > > > > > > > > is
>> > > > > > > > > > > > > more
>> > > > > > > > > > > > > > > or
>> > > > > > > > > > > > > > > > > > less recent, we would typically call
>> that
>> > an
>> > > > > epoch,
>> > > > > > > and
>> > > > > > > > > > not a
>> > > > > > > > > > > > > > > version.
>> > > > > > > > > > > > > > > > > For
>> > > > > > > > > > > > > > > > > > the ZK data structures, the word
>> "version"
>> > is
>> > > > > > > typically
>> > > > > > > > > > > > reserved
>> > > > > > > > > > > > > > for
>> > > > > > > > > > > > > > > > > > describing changes to the overall
>> schema of
>> > > the
>> > > > > > data
>> > > > > > > > that
>> > > > > > > > > > is
>> > > > > > > > > > > > > > written
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > ZooKeeper.  We don't even really change
>> the
>> > > > > > "version"
>> > > > > > > > of
>> > > > > > > > > > > those
>> > > > > > > > > > > > > > > schemas
>> > > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > much, since most changes are
>> > > > > backwards-compatible.
>> > > > > > > But
>> > > > > > > > > we
>> > > > > > > > > > do
>> > > > > > > > > > > > > > include
>> > > > > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > > > version field just in case.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > I don't think we really need an epoch
>> here,
>> > > > > though,
>> > > > > > > > since
>> > > > > > > > > > we
>> > > > > > > > > > > > can
>> > > > > > > > > > > > > > just
>> > > > > > > > > > > > > > > > > look
>> > > > > > > > > > > > > > > > > > at the broker epoch.  Whenever the
>> broker
>> > > > > > registers,
>> > > > > > > > its
>> > > > > > > > > > > epoch
>> > > > > > > > > > > > > will
>> > > > > > > > > > > > > > > be
>> > > > > > > > > > > > > > > > > > greater than the previous broker epoch.
>> > And
>> > > > the
>> > > > > > > newly
>> > > > > > > > > > > > registered
>> > > > > > > > > > > > > > > data
>> > > > > > > > > > > > > > > > > will
>> > > > > > > > > > > > > > > > > > take priority.  This will be a lot
>> simpler
>> > > than
>> > > > > > > adding
>> > > > > > > > a
>> > > > > > > > > > > > separate
>> > > > > > > > > > > > > > > epoch
>> > > > > > > > > > > > > > > > > > system, I think.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > 2. Regarding admin client needing min
>> and
>> > > max
>> > > > > > > > > > information -
>> > > > > > > > > > > > you
>> > > > > > > > > > > > > > are
>> > > > > > > > > > > > > > > > > > right!
>> > > > > > > > > > > > > > > > > > > I've changed the KIP such that the
>> Admin
>> > > API
>> > > > > also
>> > > > > > > > > allows
>> > > > > > > > > > > the
>> > > > > > > > > > > > > user
>> > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > read
>> > > > > > > > > > > > > > > > > > > 'supported features' from a specific
>> > > broker.
>> > > > > > Please
>> > > > > > > > > look
>> > > > > > > > > > at
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > > > section
>> > > > > > > > > > > > > > > > > > > "Admin API changes".
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Thanks.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > 3. Regarding the use of `long` vs
>> `Long`
>> > -
>> > > it
>> > > > > was
>> > > > > > > not
>> > > > > > > > > > > > > deliberate.
>> > > > > > > > > > > > > > > > I've
>> > > > > > > > > > > > > > > > > > > improved the KIP to just use `long` at
>> > all
>> > > > > > places.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Sounds good.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > 4. Regarding
>> kafka.admin.FeatureCommand
>> > > tool
>> > > > -
>> > > > > > you
>> > > > > > > > are
>> > > > > > > > > > > right!
>> > > > > > > > > > > > > > I've
>> > > > > > > > > > > > > > > > > > updated
>> > > > > > > > > > > > > > > > > > > the KIP sketching the functionality
>> > > provided
>> > > > by
>> > > > > > > this
>> > > > > > > > > > tool,
>> > > > > > > > > > > > with
>> > > > > > > > > > > > > > > some
>> > > > > > > > > > > > > > > > > > > examples. Please look at the section
>> > > "Tooling
>> > > > > > > support
>> > > > > > > > > > > > > examples".
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Thank you!
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > Thanks, Kowshik.
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > cheers,
>> > > > > > > > > > > > > > > > > > Colin
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > > Kowshik
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > On Wed, Mar 25, 2020 at 11:31 PM Colin
>> > > > McCabe <
>> > > > > > > > > > > > > > cmccabe@apache.org>
>> > > > > > > > > > > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Thanks, Kowshik, this looks good.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > In the "Schema" section, do we
>> really
>> > > need
>> > > > > both
>> > > > > > > > > > > > > > > __schema_version__
>> > > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > > > __data_version__?  Can we just have
>> a
>> > > > single
>> > > > > > > > version
>> > > > > > > > > > > field
>> > > > > > > > > > > > > > here?
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > Shouldn't the Admin(Client) function
>> > have
>> > > > > some
>> > > > > > > way
>> > > > > > > > to
>> > > > > > > > > > get
>> > > > > > > > > > > > the
>> > > > > > > > > > > > > > min
>> > > > > > > > > > > > > > > > and
>> > > > > > > > > > > > > > > > > > max
>> > > > > > > > > > > > > > > > > > > > information that we're exposing as
>> > > well?  I
>> > > > > > guess
>> > > > > > > > we
>> > > > > > > > > > > could
>> > > > > > > > > > > > > have
>> > > > > > > > > > > > > > > > min,
>> > > > > > > > > > > > > > > > > > max,
>> > > > > > > > > > > > > > > > > > > > and current.  Unrelated: is the use
>> of
>> > > Long
>> > > > > > > rather
>> > > > > > > > > than
>> > > > > > > > > > > > long
>> > > > > > > > > > > > > > > > > deliberate
>> > > > > > > > > > > > > > > > > > > > here?
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > It would be good to describe how the
>> > > > command
>> > > > > > line
>> > > > > > > > > tool
>> > > > > > > > > > > > > > > > > > > > kafka.admin.FeatureCommand will
>> work.
>> > > For
>> > > > > > > example
>> > > > > > > > > the
>> > > > > > > > > > > > flags
>> > > > > > > > > > > > > > that
>> > > > > > > > > > > > > > > > it
>> > > > > > > > > > > > > > > > > > will
>> > > > > > > > > > > > > > > > > > > > take and the output that it will
>> > generate
>> > > > to
>> > > > > > > > STDOUT.
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > cheers,
>> > > > > > > > > > > > > > > > > > > > Colin
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > On Tue, Mar 24, 2020, at 17:08,
>> Kowshik
>> > > > > > Prakasam
>> > > > > > > > > wrote:
>> > > > > > > > > > > > > > > > > > > > > Hi all,
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > I've opened KIP-584
>> > > > > > > > > > > > > > > <
>> https://issues.apache.org/jira/browse/KIP-584>
>> > <
>> > > > > > > > > > > > > > > >
>> https://issues.apache.org/jira/browse/KIP-584
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > which
>> > > > > > > > > > > > > > > > > > > > > is intended to provide a
>> versioning
>> > > > scheme
>> > > > > > for
>> > > > > > > > > > > features.
>> > > > > > > > > > > > > I'd
>> > > > > > > > > > > > > > > like
>> > > > > > > > > > > > > > > > > to
>> > > > > > > > > > > > > > > > > > use
>> > > > > > > > > > > > > > > > > > > > > this thread to discuss the same.
>> I'd
>> > > > > > appreciate
>> > > > > > > > any
>> > > > > > > > > > > > > feedback
>> > > > > > > > > > > > > > on
>> > > > > > > > > > > > > > > > > this.
>> > > > > > > > > > > > > > > > > > > > > Here
>> > > > > > > > > > > > > > > > > > > > > is a link to KIP-584
>> > > > > > > > > > > > > > > <
>> https://issues.apache.org/jira/browse/KIP-584>:
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-584%3A+Versioning+scheme+for+features
>> > > > > > > > > > > > > > > > > > > > >  .
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Thank you!
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > > > Cheers,
>> > > > > > > > > > > > > > > > > > > > > Kowshik
>> > > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > > >
>> > > > > > > > > > > > > > >
>> > > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message