kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tom Bentley <t.j.bent...@gmail.com>
Subject Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
Date Thu, 05 Oct 2017 11:41:25 GMT
I'd like to raise a somewhat subtle point about how the proposed API should
behave.

The current CreateTopicPolicy gets passed either the request partition
count and replication factor, or the requested assignment. So if the
request had specified partition count and replication factor, the policy
sees a null replicaAssignments(). Likewise if the request specified a
replica assignment the policy would get back null from numPartitions() and
replicationFactor().

These semantics mean the policy can't reject an assignment that happened to
be auto-generated (or rather, it's obvious to the policy that the
assignment is auto generated, because it can't see such assignments),
though it can reject a request because the assignment was auto-generated,
or vice versa.

Retaining these semantics makes the TopicState less symmetric between it's
use in requestedState() and the current state available from the
ClusterState, and also less symmetric between its use for createTopic() and
for alterTopic(). This can make it harder to write a policy. For example,
if I want the policy "the number of partitions must be < 100", if the
requestedState().numPartitions() can be null I need to cope with that and
figure it out from inspecting the replicasAssignments(). It would be much
better for the policy writer to just be able to write:

    if (request.requestedState().numPartitions() >= 100)
        throw new PolicyViolationException("#partitions must be < 100")

An alternative would be to keep the symmetry (and thus
TopicState.replicasAssignments() would never return null, and
TopicState.numPartitions() and TopicState.replicationFactor() could each be
primitives), but expose the auto-generatedness of the replicaAssignments()
explicitly, perhaps by using a subtype of TopicState for the return type of
requestedState():

    interface RequestedTopicState extends TopicState {
        /**
         * True if the {@link TopicState#replicasAssignments()}
         * in this request we generated by the broker, false if
         * they were explicitly requested by the client.
         */
        boolean generatedReplicaAssignments();
    }

Thoughts?

On 4 October 2017 at 11:06, Tom Bentley <t.j.bentley@gmail.com> wrote:

> Good point. Then I guess I can do those items too. I would also need to do
> the same changes for DeleteRecordsRequest and Response.
>
> On 4 October 2017 at 10:37, Ismael Juma <ismael@juma.me.uk> wrote:
>
>> Those two points are related to policies in the following sense:
>>
>> 1. A policy that can't send errors to clients is much less useful
>> 2. Testing policies is much easier with `validateOnly`
>>
>> Ismael
>>
>> On Wed, Oct 4, 2017 at 9:20 AM, Tom Bentley <t.j.bentley@gmail.com>
>> wrote:
>>
>> > Thanks Edoardo,
>> >
>> > I've added that motivation to the KIP.
>> >
>> > KIP-201 doesn't address two points raised in KIP-170: Adding a
>> > validationOnly flag to
>> > DeleteTopicRequest and adding an error message to DeleteTopicResponse.
>> > Since those are not policy-related I think they're best left out of
>> > KIP-201. I suppose it is up to you and Mickael whether to narrow the
>> scope
>> > of KIP-170 to address those points.
>> >
>> > Thanks again,
>> >
>> > Tom
>> >
>> > On 4 October 2017 at 08:20, Edoardo Comar <ECOMAR@uk.ibm.com> wrote:
>> >
>> > > Thanks Tom,
>> > > looks got to me and KIP-201 could supersede KIP-170
>> > > but could you please add a missing motivation bullet that was behind
>> > > KIP-170:
>> > >
>> > > introducing ClusterState to allow validation of create/alter topic
>> > request
>> > >
>> > > not just against the request metadata but also
>> > > against the current amount of resources already used in the cluster
>> (eg
>> > > number of partitions).
>> > >
>> > > thanks
>> > > Edo
>> > > --------------------------------------------------
>> > >
>> > > Edoardo Comar
>> > >
>> > > IBM Message Hub
>> > >
>> > > IBM UK Ltd, Hursley Park, SO21 2JN
>> > >
>> > >
>> > >
>> > > From:   Tom Bentley <t.j.bentley@gmail.com>
>> > > To:     dev@kafka.apache.org
>> > > Date:   02/10/2017 15:15
>> > > Subject:        Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
>> > >
>> > >
>> > >
>> > > Hi All,
>> > >
>> > > I've updated KIP-201 again so there is now a single policy interface
>> (and
>> > > thus a single key by which to configure it) for topic creation,
>> > > modification, deletion and record deletion, which each have their own
>> > > validation method.
>> > >
>> > > There are still a few loose ends:
>> > >
>> > > 1. I currently propose validateAlterTopic(), but it would be possible
>> to
>> > > be
>> > > more fine grained about this: validateAlterConfig(),
>> validAddPartitions()
>> > > and validateReassignPartitions(), for example. Obviously this results
>> in
>> > a
>> > > policy method per operation, and makes it more clear what is being
>> > > changed.
>> > > I guess the down side is its more work for implementer, and
>> potentially
>> > > makes it harder to change the interface in the future.
>> > >
>> > > 2. A couple of TODOs about what the TopicState interface should return
>> > > when
>> > > a topic's partitions are being reassigned.
>> > >
>> > > Your thoughts on these or any other points are welcome.
>> > >
>> > > Thanks,
>> > >
>> > > Tom
>> > >
>> > > On 27 September 2017 at 11:45, Paolo Patierno <ppatierno@live.com>
>> > wrote:
>> > >
>> > > > Hi Ismael,
>> > > >
>> > > >
>> > > >   1.  I don't have a real requirement now but "deleting" is an
>> > operation
>> > > > that could be really dangerous so it's always better having a way
>> for
>> > > > having more control on that. I know that we have the authorizer used
>> > for
>> > > > that (delete on topic) but fine grained control could be better
>> (even
>> > > > already happens for topic deletion).
>> > > >   2.  I know about the problem of restarting broker due to changes
>> on
>> > > > policies but what do you mean by doing that on the clients ?
>> > > >
>> > > >
>> > > > Paolo Patierno
>> > > > Senior Software Engineer (IoT) @ Red Hat
>> > > > Microsoft MVP on Azure & IoT
>> > > > Microsoft Azure Advisor
>> > > >
>> > > > Twitter : @ppatierno<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter.
>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD-
>> > HdJD8d_F4-Bsw25-Y&e=
>> > > >
>> > > > Linkedin : paolopatierno<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it.
>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6
>> > R5Drocu0TMRLk&e=
>> > > >
>> > > > Blog : DevExperience<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__
>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_
>> > jKpedna692z4&e=
>> > > >
>> > > >
>> > > >
>> > > > ________________________________
>> > > > From: ismaelj@gmail.com <ismaelj@gmail.com> on behalf of Ismael
>> Juma <
>> > > > ismael@juma.me.uk>
>> > > > Sent: Wednesday, September 27, 2017 10:30 AM
>> > > > To: dev@kafka.apache.org
>> > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
>> > > >
>> > > > A couple of questions:
>> > > >
>> > > > 1. Is this a concrete requirement from a user or is it hypothetical?
>> > > > 2. You sure you would want to do this in the broker instead of the
>> > > clients?
>> > > > It's worth remembering that updating broker policies involves a
>> rolling
>> > > > restart of the cluster, so it's not the right place for things that
>> > > change
>> > > > frequently.
>> > > >
>> > > > Ismael
>> > > >
>> > > > On Wed, Sep 27, 2017 at 11:26 AM, Paolo Patierno <
>> ppatierno@live.com>
>> > > > wrote:
>> > > >
>> > > > > Hi Ismael,
>> > > > >
>> > > > > regarding motivations for delete records, as I said during the
>> > > discussion
>> > > > > on KIP-204, it gives the possibility to avoid deleting messages
>> for
>> > > > > specific partitions (inside the topic) and starting from a
>> specific
>> > > > offset.
>> > > > > I could think on some users solutions where they know exactly
what
>> > the
>> > > > > partitions means in a specific topic (because they are using
a
>> custom
>> > > > > partitioner on the producer side) so they know what kind of
>> messages
>> > > are
>> > > > > inside a partition allowing to delete them but not the others.
 In
>> > > such a
>> > > > > policy a user could also check the timestamp related to the offset
>> > for
>> > > > > allowing or not deletion on time base.
>> > > > >
>> > > > >
>> > > > > Paolo Patierno
>> > > > > Senior Software Engineer (IoT) @ Red Hat
>> > > > > Microsoft MVP on Azure & IoT
>> > > > > Microsoft Azure Advisor
>> > > > >
>> > > > > Twitter : @ppatierno<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter.
>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD-
>> > HdJD8d_F4-Bsw25-Y&e=
>> > > >
>> > > > > Linkedin : paolopatierno<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it.
>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6
>> > R5Drocu0TMRLk&e=
>> > > >
>> > > > > Blog : DevExperience<
>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__
>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_
>> > jKpedna692z4&e=
>> > > >
>> > > > >
>> > > > >
>> > > > > ________________________________
>> > > > > From: ismaelj@gmail.com <ismaelj@gmail.com> on behalf of
Ismael
>> > Juma <
>> > > > > ismael@juma.me.uk>
>> > > > > Sent: Wednesday, September 27, 2017 10:18 AM
>> > > > > To: dev@kafka.apache.org
>> > > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
>> > > > >
>> > > > > A couple more comments:
>> > > > >
>> > > > > 1. "If this KIP is accepted for Kafka 1.1.0 this removal could
>> happen
>> > > in
>> > > > > Kafka 1.2.0 or a later release." -> we only remove code in
major
>> > > > releases.
>> > > > > So, if it's deprecated in 1.1.0, it would be removed in 2.0.0.
>> > > > >
>> > > > > 2. Deleting all messages in a topic is not really the same as
>> > deleting
>> > > a
>> > > > > topic. The latter will cause consumers and producers to error
out
>> > > while
>> > > > the
>> > > > > former will not. It would be good to motivate the need for the
>> delete
>> > > > > records policy more.
>> > > > >
>> > > > > Ismael
>> > > > >
>> > > > > On Wed, Sep 27, 2017 at 11:12 AM, Ismael Juma <ismael@juma.me.uk>
>> > > wrote:
>> > > > >
>> > > > > > Another quick comment: the KIP states that having multiple
>> > > interfaces
>> > > > > > imply that the logic must be in 2 places. That is not true
>> because
>> > > the
>> > > > > same
>> > > > > > class can implement multiple interfaces (this aspect was
>> considered
>> > > > when
>> > > > > we
>> > > > > > decided to introduce policies incrementally).
>> > > > > >
>> > > > > > The main reason why I think the original approach doesn't
work
>> well
>> > > is
>> > > > > > that there is no direct mapping between an operation and
the
>> > policy.
>> > > > That
>> > > > > > is, we initially thought we would have create/alter/delete
>> topics,
>> > > but
>> > > > > that
>> > > > > > didn't work out as the alter case is better served by multiple
>> > > request
>> > > > > > types. Given that, it's a bit awkward to maintain the original
>> > > approach
>> > > > > and
>> > > > > > a policy for topic management seemed easier to understand.
On
>> that
>> > > > note,
>> > > > > > would `TopicManagementPolicy` be a better name?
>> > > > > >
>> > > > > > Looking at the updated KIP, I notice that we actually have
a
>> > > > > > TopicDeletionPolicy with a separate config. That seems to
be a
>> > > halfway
>> > > > > > house. Not sure about that.
>> > > > > >
>> > > > > > Ismael
>> > > > > >
>> > > > > > On Wed, Sep 27, 2017 at 10:47 AM, Tom Bentley
>> > > <t.j.bentley@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > >> I have updated the KIP to add a common policy interface
for
>> topic
>> > > and
>> > > > > >> message deletion. This included pulling ClusterState
and
>> > TopicState
>> > > > > >> interfaces up to the top level so that they can be shared
>> between
>> > > the
>> > > > > two
>> > > > > >> policies.
>> > > > > >>
>> > > > > >> Cheers,
>> > > > > >>
>> > > > > >> Tom
>> > > > > >>
>> > > > > >> On 26 September 2017 at 18:09, Edoardo Comar <
>> ECOMAR@uk.ibm.com>
>> > > > wrote:
>> > > > > >>
>> > > > > >> > Thanks Tom,
>> > > > > >> > In my original KIP-170 I mentioned that the method
>> > > > > >> >
>> > > > > >> > public Map<String, Integer> topicsPartitionCount();
>> > > > > >> >
>> > > > > >> > was just a starting point for a general purpose
ClusterState
>> > > > > >> > as it happened to be exactly the info we needed
for our
>> policy
>> > > > > >> > implementation :-)
>> > > > > >> > it definitely doesn't feel general purpose enough.
>> > > > > >> >
>> > > > > >> > what about
>> > > > > >> >
>> > > > > >> >     interface ClusterState {
>> > > > > >> >         public TopicState topicState(String topicName);
>> > > > > >> >         public Set<String> topics();
>> > > > > >> >     }
>> > > > > >> >
>> > > > > >> > I think that the implementation of ClusterState
that the
>> server
>> > > will
>> > > > > >> pass
>> > > > > >> > to the policy.validate method
>> > > > > >> > would just lazily tap into MetadataCache. No need
for big
>> > upfront
>> > > > > >> > allocations.
>> > > > > >> >
>> > > > > >> > ciao,
>> > > > > >> > Edo
>> > > > > >> > --------------------------------------------------
>> > > > > >> >
>> > > > > >> > Edoardo Comar
>> > > > > >> >
>> > > > > >> > IBM Message Hub
>> > > > > >> >
>> > > > > >> > IBM UK Ltd, Hursley Park, SO21 2JN
>> > > > > >> >
>> > > > > >> >
>> > > > > >> >
>> > > > > >> > From:   Tom Bentley <t.j.bentley@gmail.com>
>> > > > > >> > To:     dev@kafka.apache.org
>> > > > > >> > Date:   26/09/2017 17:39
>> > > > > >> > Subject:        Re: [DISCUSS] KIP-201: Rationalising
Policy
>> > > > interfaces
>> > > > > >> >
>> > > > > >> >
>> > > > > >> >
>> > > > > >> > Hi Edoardo,
>> > > > > >> >
>> > > > > >> >
>> > > > > >> > what about a single method in ClusterState
>> > > > > >> > >
>> > > > > >> > >     interface ClusterState {
>> > > > > >> > >         public Map<String,TopicState>
topicsState();
>> > > > > >> > >
>> > > > > >> > >     }
>> > > > > >> > >
>> > > > > >> > > which could return a read-only snapshot of
the cluster
>> > metadata
>> > > ?
>> > > > > >> > >
>> > > > > >> >
>> > > > > >> > Sure that would work too. A concern with that is
that we end
>> up
>> > > > > >> allocating
>> > > > > >> > a potentially rather large amount for the Map and
the
>> > collections
>> > > > > >> present
>> > > > > >> > in the TopicStates in order to provide the snapshot.
The
>> caller
>> > > > might
>> > > > > >> only
>> > > > > >> > be interested in one item from the TopicState for
one topic
>> in
>> > > the
>> > > > > map.
>> > > > > >> > Accessing this information via methods means the
caller only
>> > pays
>> > > > for
>> > > > > >> what
>> > > > > >> > they use.
>> > > > > >> >
>> > > > > >> > Cheers,
>> > > > > >> >
>> > > > > >> > Tom
>> > > > > >> >
>> > > > > >> >
>> > > > > >> >
>> > > > > >> > Unless stated otherwise above:
>> > > > > >> > IBM United Kingdom Limited - Registered in England
and Wales
>> > with
>> > > > > number
>> > > > > >> > 741598.
>> > > > > >> > Registered office: PO Box 41, North Harbour, Portsmouth,
>> > > Hampshire
>> > > > PO6
>> > > > > >> 3AU
>> > > > > >> >
>> > > > > >>
>> > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> > >
>> > >
>> > > Unless stated otherwise above:
>> > > IBM United Kingdom Limited - Registered in England and Wales with
>> number
>> > > 741598.
>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire PO6
>> > 3AU
>> > >
>> >
>>
>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message