kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tom Bentley <t.j.bent...@gmail.com>
Subject Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
Date Mon, 09 Oct 2017 08:39:45 GMT
I've added RequestedTopicState, as discussed in my last email.

I've also added a paragraph to the migration plan about old clients making
policy-violating delete topics or delete records request.

If no further comments a forthcoming in the next day or two then I will
start a vote.

Thanks,

Tom

On 5 October 2017 at 12:41, Tom Bentley <t.j.bentley@gmail.com> wrote:

> I'd like to raise a somewhat subtle point about how the proposed API
> should behave.
>
> The current CreateTopicPolicy gets passed either the request partition
> count and replication factor, or the requested assignment. So if the
> request had specified partition count and replication factor, the policy
> sees a null replicaAssignments(). Likewise if the request specified a
> replica assignment the policy would get back null from numPartitions() and
> replicationFactor().
>
> These semantics mean the policy can't reject an assignment that happened
> to be auto-generated (or rather, it's obvious to the policy that the
> assignment is auto generated, because it can't see such assignments),
> though it can reject a request because the assignment was auto-generated,
> or vice versa.
>
> Retaining these semantics makes the TopicState less symmetric between it's
> use in requestedState() and the current state available from the
> ClusterState, and also less symmetric between its use for createTopic() and
> for alterTopic(). This can make it harder to write a policy. For example,
> if I want the policy "the number of partitions must be < 100", if the
> requestedState().numPartitions() can be null I need to cope with that
> and  figure it out from inspecting the replicasAssignments(). It would be
> much better for the policy writer to just be able to write:
>
>     if (request.requestedState().numPartitions() >= 100)
>         throw new PolicyViolationException("#partitions must be < 100")
>
> An alternative would be to keep the symmetry (and thus TopicState.replicasAssignments()
> would never return null, and TopicState.numPartitions() and
> TopicState.replicationFactor() could each be primitives), but expose the
> auto-generatedness of the replicaAssignments() explicitly, perhaps by using
> a subtype of TopicState for the return type of requestedState():
>
>     interface RequestedTopicState extends TopicState {
>         /**
>          * True if the {@link TopicState#replicasAssignments()}
>          * in this request we generated by the broker, false if
>          * they were explicitly requested by the client.
>          */
>         boolean generatedReplicaAssignments();
>     }
>
> Thoughts?
>
> On 4 October 2017 at 11:06, Tom Bentley <t.j.bentley@gmail.com> wrote:
>
>> Good point. Then I guess I can do those items too. I would also need to
>> do the same changes for DeleteRecordsRequest and Response.
>>
>> On 4 October 2017 at 10:37, Ismael Juma <ismael@juma.me.uk> wrote:
>>
>>> Those two points are related to policies in the following sense:
>>>
>>> 1. A policy that can't send errors to clients is much less useful
>>> 2. Testing policies is much easier with `validateOnly`
>>>
>>> Ismael
>>>
>>> On Wed, Oct 4, 2017 at 9:20 AM, Tom Bentley <t.j.bentley@gmail.com>
>>> wrote:
>>>
>>> > Thanks Edoardo,
>>> >
>>> > I've added that motivation to the KIP.
>>> >
>>> > KIP-201 doesn't address two points raised in KIP-170: Adding a
>>> > validationOnly flag to
>>> > DeleteTopicRequest and adding an error message to DeleteTopicResponse.
>>> > Since those are not policy-related I think they're best left out of
>>> > KIP-201. I suppose it is up to you and Mickael whether to narrow the
>>> scope
>>> > of KIP-170 to address those points.
>>> >
>>> > Thanks again,
>>> >
>>> > Tom
>>> >
>>> > On 4 October 2017 at 08:20, Edoardo Comar <ECOMAR@uk.ibm.com> wrote:
>>> >
>>> > > Thanks Tom,
>>> > > looks got to me and KIP-201 could supersede KIP-170
>>> > > but could you please add a missing motivation bullet that was behind
>>> > > KIP-170:
>>> > >
>>> > > introducing ClusterState to allow validation of create/alter topic
>>> > request
>>> > >
>>> > > not just against the request metadata but also
>>> > > against the current amount of resources already used in the cluster
>>> (eg
>>> > > number of partitions).
>>> > >
>>> > > thanks
>>> > > Edo
>>> > > --------------------------------------------------
>>> > >
>>> > > Edoardo Comar
>>> > >
>>> > > IBM Message Hub
>>> > >
>>> > > IBM UK Ltd, Hursley Park, SO21 2JN
>>> > >
>>> > >
>>> > >
>>> > > From:   Tom Bentley <t.j.bentley@gmail.com>
>>> > > To:     dev@kafka.apache.org
>>> > > Date:   02/10/2017 15:15
>>> > > Subject:        Re: [DISCUSS] KIP-201: Rationalising Policy
>>> interfaces
>>> > >
>>> > >
>>> > >
>>> > > Hi All,
>>> > >
>>> > > I've updated KIP-201 again so there is now a single policy interface
>>> (and
>>> > > thus a single key by which to configure it) for topic creation,
>>> > > modification, deletion and record deletion, which each have their own
>>> > > validation method.
>>> > >
>>> > > There are still a few loose ends:
>>> > >
>>> > > 1. I currently propose validateAlterTopic(), but it would be
>>> possible to
>>> > > be
>>> > > more fine grained about this: validateAlterConfig(),
>>> validAddPartitions()
>>> > > and validateReassignPartitions(), for example. Obviously this
>>> results in
>>> > a
>>> > > policy method per operation, and makes it more clear what is being
>>> > > changed.
>>> > > I guess the down side is its more work for implementer, and
>>> potentially
>>> > > makes it harder to change the interface in the future.
>>> > >
>>> > > 2. A couple of TODOs about what the TopicState interface should
>>> return
>>> > > when
>>> > > a topic's partitions are being reassigned.
>>> > >
>>> > > Your thoughts on these or any other points are welcome.
>>> > >
>>> > > Thanks,
>>> > >
>>> > > Tom
>>> > >
>>> > > On 27 September 2017 at 11:45, Paolo Patierno <ppatierno@live.com>
>>> > wrote:
>>> > >
>>> > > > Hi Ismael,
>>> > > >
>>> > > >
>>> > > >   1.  I don't have a real requirement now but "deleting" is an
>>> > operation
>>> > > > that could be really dangerous so it's always better having a
way
>>> for
>>> > > > having more control on that. I know that we have the authorizer
>>> used
>>> > for
>>> > > > that (delete on topic) but fine grained control could be better
>>> (even
>>> > > > already happens for topic deletion).
>>> > > >   2.  I know about the problem of restarting broker due to changes
>>> on
>>> > > > policies but what do you mean by doing that on the clients ?
>>> > > >
>>> > > >
>>> > > > Paolo Patierno
>>> > > > Senior Software Engineer (IoT) @ Red Hat
>>> > > > Microsoft MVP on Azure & IoT
>>> > > > Microsoft Azure Advisor
>>> > > >
>>> > > > Twitter : @ppatierno<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter.
>>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD-
>>> > HdJD8d_F4-Bsw25-Y&e=
>>> > > >
>>> > > > Linkedin : paolopatierno<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it.
>>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6
>>> > R5Drocu0TMRLk&e=
>>> > > >
>>> > > > Blog : DevExperience<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__
>>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_
>>> > jKpedna692z4&e=
>>> > > >
>>> > > >
>>> > > >
>>> > > > ________________________________
>>> > > > From: ismaelj@gmail.com <ismaelj@gmail.com> on behalf of
Ismael
>>> Juma <
>>> > > > ismael@juma.me.uk>
>>> > > > Sent: Wednesday, September 27, 2017 10:30 AM
>>> > > > To: dev@kafka.apache.org
>>> > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
>>> > > >
>>> > > > A couple of questions:
>>> > > >
>>> > > > 1. Is this a concrete requirement from a user or is it
>>> hypothetical?
>>> > > > 2. You sure you would want to do this in the broker instead of
the
>>> > > clients?
>>> > > > It's worth remembering that updating broker policies involves
a
>>> rolling
>>> > > > restart of the cluster, so it's not the right place for things
that
>>> > > change
>>> > > > frequently.
>>> > > >
>>> > > > Ismael
>>> > > >
>>> > > > On Wed, Sep 27, 2017 at 11:26 AM, Paolo Patierno <
>>> ppatierno@live.com>
>>> > > > wrote:
>>> > > >
>>> > > > > Hi Ismael,
>>> > > > >
>>> > > > > regarding motivations for delete records, as I said during
the
>>> > > discussion
>>> > > > > on KIP-204, it gives the possibility to avoid deleting messages
>>> for
>>> > > > > specific partitions (inside the topic) and starting from
a
>>> specific
>>> > > > offset.
>>> > > > > I could think on some users solutions where they know exactly
>>> what
>>> > the
>>> > > > > partitions means in a specific topic (because they are using
a
>>> custom
>>> > > > > partitioner on the producer side) so they know what kind
of
>>> messages
>>> > > are
>>> > > > > inside a partition allowing to delete them but not the others.
>>> In
>>> > > such a
>>> > > > > policy a user could also check the timestamp related to the
>>> offset
>>> > for
>>> > > > > allowing or not deletion on time base.
>>> > > > >
>>> > > > >
>>> > > > > Paolo Patierno
>>> > > > > Senior Software Engineer (IoT) @ Red Hat
>>> > > > > Microsoft MVP on Azure & IoT
>>> > > > > Microsoft Azure Advisor
>>> > > > >
>>> > > > > Twitter : @ppatierno<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__twitter.
>>> > > com_ppatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=43hzTLEDKw2v5Vh0zwkMTaaKD-
>>> > HdJD8d_F4-Bsw25-Y&e=
>>> > > >
>>> > > > > Linkedin : paolopatierno<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__it.
>>> > > linkedin.com_in_paolopatierno&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Ig0N7Nwf9EHfTJ2pH3jRM1JIdlzXw6
>>> > R5Drocu0TMRLk&e=
>>> > > >
>>> > > > > Blog : DevExperience<
>>> > > https://urldefense.proofpoint.com/v2/url?u=http-3A__
>>> > > paolopatierno.wordpress.com_&d=DwIBaQ&c=jf_iaSHvJObTbx-siA1ZOg&r=
>>> > > EzRhmSah4IHsUZVekRUIINhltZK7U0OaeRo7hgW4_tQ&m=h-D-nA7uiy1Z-jta5y-
>>> > > yh7dKgV77XtsUnJ9Rab1gheY&s=Tc9NrTtG2GP7-zRjOHkXHfYI0rncO8_
>>> > jKpedna692z4&e=
>>> > > >
>>> > > > >
>>> > > > >
>>> > > > > ________________________________
>>> > > > > From: ismaelj@gmail.com <ismaelj@gmail.com> on behalf
of Ismael
>>> > Juma <
>>> > > > > ismael@juma.me.uk>
>>> > > > > Sent: Wednesday, September 27, 2017 10:18 AM
>>> > > > > To: dev@kafka.apache.org
>>> > > > > Subject: Re: [DISCUSS] KIP-201: Rationalising Policy interfaces
>>> > > > >
>>> > > > > A couple more comments:
>>> > > > >
>>> > > > > 1. "If this KIP is accepted for Kafka 1.1.0 this removal
could
>>> happen
>>> > > in
>>> > > > > Kafka 1.2.0 or a later release." -> we only remove code
in major
>>> > > > releases.
>>> > > > > So, if it's deprecated in 1.1.0, it would be removed in 2.0.0.
>>> > > > >
>>> > > > > 2. Deleting all messages in a topic is not really the same
as
>>> > deleting
>>> > > a
>>> > > > > topic. The latter will cause consumers and producers to error
out
>>> > > while
>>> > > > the
>>> > > > > former will not. It would be good to motivate the need for
the
>>> delete
>>> > > > > records policy more.
>>> > > > >
>>> > > > > Ismael
>>> > > > >
>>> > > > > On Wed, Sep 27, 2017 at 11:12 AM, Ismael Juma <ismael@juma.me.uk
>>> >
>>> > > wrote:
>>> > > > >
>>> > > > > > Another quick comment: the KIP states that having multiple
>>> > > interfaces
>>> > > > > > imply that the logic must be in 2 places. That is not
true
>>> because
>>> > > the
>>> > > > > same
>>> > > > > > class can implement multiple interfaces (this aspect
was
>>> considered
>>> > > > when
>>> > > > > we
>>> > > > > > decided to introduce policies incrementally).
>>> > > > > >
>>> > > > > > The main reason why I think the original approach doesn't
work
>>> well
>>> > > is
>>> > > > > > that there is no direct mapping between an operation
and the
>>> > policy.
>>> > > > That
>>> > > > > > is, we initially thought we would have create/alter/delete
>>> topics,
>>> > > but
>>> > > > > that
>>> > > > > > didn't work out as the alter case is better served by
multiple
>>> > > request
>>> > > > > > types. Given that, it's a bit awkward to maintain the
original
>>> > > approach
>>> > > > > and
>>> > > > > > a policy for topic management seemed easier to understand.
On
>>> that
>>> > > > note,
>>> > > > > > would `TopicManagementPolicy` be a better name?
>>> > > > > >
>>> > > > > > Looking at the updated KIP, I notice that we actually
have a
>>> > > > > > TopicDeletionPolicy with a separate config. That seems
to be a
>>> > > halfway
>>> > > > > > house. Not sure about that.
>>> > > > > >
>>> > > > > > Ismael
>>> > > > > >
>>> > > > > > On Wed, Sep 27, 2017 at 10:47 AM, Tom Bentley
>>> > > <t.j.bentley@gmail.com>
>>> > > > > > wrote:
>>> > > > > >
>>> > > > > >> I have updated the KIP to add a common policy interface
for
>>> topic
>>> > > and
>>> > > > > >> message deletion. This included pulling ClusterState
and
>>> > TopicState
>>> > > > > >> interfaces up to the top level so that they can
be shared
>>> between
>>> > > the
>>> > > > > two
>>> > > > > >> policies.
>>> > > > > >>
>>> > > > > >> Cheers,
>>> > > > > >>
>>> > > > > >> Tom
>>> > > > > >>
>>> > > > > >> On 26 September 2017 at 18:09, Edoardo Comar <
>>> ECOMAR@uk.ibm.com>
>>> > > > wrote:
>>> > > > > >>
>>> > > > > >> > Thanks Tom,
>>> > > > > >> > In my original KIP-170 I mentioned that the
method
>>> > > > > >> >
>>> > > > > >> > public Map<String, Integer> topicsPartitionCount();
>>> > > > > >> >
>>> > > > > >> > was just a starting point for a general purpose
ClusterState
>>> > > > > >> > as it happened to be exactly the info we needed
for our
>>> policy
>>> > > > > >> > implementation :-)
>>> > > > > >> > it definitely doesn't feel general purpose
enough.
>>> > > > > >> >
>>> > > > > >> > what about
>>> > > > > >> >
>>> > > > > >> >     interface ClusterState {
>>> > > > > >> >         public TopicState topicState(String
topicName);
>>> > > > > >> >         public Set<String> topics();
>>> > > > > >> >     }
>>> > > > > >> >
>>> > > > > >> > I think that the implementation of ClusterState
that the
>>> server
>>> > > will
>>> > > > > >> pass
>>> > > > > >> > to the policy.validate method
>>> > > > > >> > would just lazily tap into MetadataCache. No
need for big
>>> > upfront
>>> > > > > >> > allocations.
>>> > > > > >> >
>>> > > > > >> > ciao,
>>> > > > > >> > Edo
>>> > > > > >> > --------------------------------------------------
>>> > > > > >> >
>>> > > > > >> > Edoardo Comar
>>> > > > > >> >
>>> > > > > >> > IBM Message Hub
>>> > > > > >> >
>>> > > > > >> > IBM UK Ltd, Hursley Park, SO21 2JN
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > From:   Tom Bentley <t.j.bentley@gmail.com>
>>> > > > > >> > To:     dev@kafka.apache.org
>>> > > > > >> > Date:   26/09/2017 17:39
>>> > > > > >> > Subject:        Re: [DISCUSS] KIP-201: Rationalising
Policy
>>> > > > interfaces
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > Hi Edoardo,
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > what about a single method in ClusterState
>>> > > > > >> > >
>>> > > > > >> > >     interface ClusterState {
>>> > > > > >> > >         public Map<String,TopicState>
topicsState();
>>> > > > > >> > >
>>> > > > > >> > >     }
>>> > > > > >> > >
>>> > > > > >> > > which could return a read-only snapshot
of the cluster
>>> > metadata
>>> > > ?
>>> > > > > >> > >
>>> > > > > >> >
>>> > > > > >> > Sure that would work too. A concern with that
is that we
>>> end up
>>> > > > > >> allocating
>>> > > > > >> > a potentially rather large amount for the Map
and the
>>> > collections
>>> > > > > >> present
>>> > > > > >> > in the TopicStates in order to provide the
snapshot. The
>>> caller
>>> > > > might
>>> > > > > >> only
>>> > > > > >> > be interested in one item from the TopicState
for one topic
>>> in
>>> > > the
>>> > > > > map.
>>> > > > > >> > Accessing this information via methods means
the caller only
>>> > pays
>>> > > > for
>>> > > > > >> what
>>> > > > > >> > they use.
>>> > > > > >> >
>>> > > > > >> > Cheers,
>>> > > > > >> >
>>> > > > > >> > Tom
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> >
>>> > > > > >> > Unless stated otherwise above:
>>> > > > > >> > IBM United Kingdom Limited - Registered in
England and Wales
>>> > with
>>> > > > > number
>>> > > > > >> > 741598.
>>> > > > > >> > Registered office: PO Box 41, North Harbour,
Portsmouth,
>>> > > Hampshire
>>> > > > PO6
>>> > > > > >> 3AU
>>> > > > > >> >
>>> > > > > >>
>>> > > > > >
>>> > > > > >
>>> > > > >
>>> > > >
>>> > >
>>> > >
>>> > >
>>> > > Unless stated otherwise above:
>>> > > IBM United Kingdom Limited - Registered in England and Wales with
>>> number
>>> > > 741598.
>>> > > Registered office: PO Box 41, North Harbour, Portsmouth, Hampshire
>>> PO6
>>> > 3AU
>>> > >
>>> >
>>>
>>
>>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message