kafka-dev mailing list archives

From Colin McCabe <cmcc...@apache.org>
Subject Re: [DISCUSS] KIP-240: AdminClient.listReassignments AdminClient.describeReassignments
Date Tue, 09 Jan 2018 21:18:18 GMT
What if we had an internal topic which watchers could listen to for information about partition
reassignments?  The information could be in JSON, so if we want to add new fields later, we
always could.  

This avoids introducing a new AdminClient API.  For clients that want to be notified about
partition reassignments in a timely fashion, this avoids the "polling an AdminClient API in
a tight loop" antipattern.  It allows watchers to be notified in a simple and natural way
about what is going on.  Access can be controlled by the existing topic ACL mechanisms.
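
A minimal sketch of how a watcher might fold such JSON records into its own view of in-flight reassignments. Everything concrete here is an assumption for illustration: the key format, the field names (topic, partition, target_replicas), and the use of a null value as a "finished" marker (borrowed from the keyed, log-compacted topic idea in the quoted thread). The records are hard-coded rather than read from a real consumer.

```python
import json


def apply_record(state, key, value):
    """Fold one (key, value) record into the watcher's view of reassignments."""
    if value is None:
        # Assumed convention: a null value (tombstone) means the reassignment
        # for this key is no longer in flight.
        state.pop(key, None)
        return state
    event = json.loads(value)
    # Read only the fields we know about. Unknown fields are ignored, so new
    # fields can be added to the JSON later without breaking this watcher.
    state[key] = {
        "topic": event["topic"],
        "partition": event["partition"],
        "target_replicas": event.get("target_replicas", []),
    }
    return state


# Hypothetical records, in the order a watcher would read them from the topic.
records = [
    ("orders-0", '{"topic": "orders", "partition": 0, "target_replicas": [1, 2, 3]}'),
    ("orders-1", '{"topic": "orders", "partition": 1, "target_replicas": [2, 3, 4], "throttle": 1000}'),
    ("orders-0", None),
]

state = {}
for key, value in records:
    apply_record(state, key, value)

print(sorted(state))  # ['orders-1']
```

The second record carries a "throttle" field the watcher has never heard of, and it is simply skipped, which is the forward-compatibility property JSON is meant to buy here.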

best,
Colin


On Fri, Dec 22, 2017, at 06:48, Tom Bentley wrote:
> Hi Steven,
> 
> I must admit that I didn't really consider that option. I can see how
> attractive it is from your perspective. In practice it would come with lots
> of edge cases which would need to be thought through:
> 
> 1. What happens if the controller can't produce a record to this topic
> because the partition's leader is unavailable?
> 2. One solution to that is for the topic to be replicated on every broker,
> so that the controller could elect itself leader on controller failover.
> But that raises another problem: What if, upon controller failover, the
> controller is ineligible for leader election because it's not in the ISR?
> 3. The above questions suggest the controller might not always be able to
> produce to the topic, but the controller isn't able to control when other
> brokers catch up replicating moved partitions and has to deal with those
> events. The controller would have to record (in memory) that the
> reassignment was complete, but hadn't been published, and publish later,
> when it was able to.
> 4. Further to 3, we would need to recover the in-memory state of
> reassignments on controller failover. But now we have to consider what
> happens if the controller cannot *consume* from the topic.
> 
> This seems pretty complicated to me. I think each of the above points has
> alternatives (or compromises) which might make the problem more tractable,
> so I'd welcome hearing from anyone who has ideas on that. In particular
> there are parallels with consumer offsets which might be worth thinking
> about some more.
> 
> It would be useful to better define the use case we're trying to cater to
> here.
> 
> * Is it just a notification that a given reassignment has finished that
> you're interested in?
> * What are the consequences if such a notification is delayed, or dropped
> entirely?
> 
> Regards,
> 
> Tom
> 
> 
> 
> On 19 December 2017 at 20:34, Steven Aerts <steven.aerts@gmail.com> wrote:
> 
> > Hello Tom,
> >
> >
> > when you were working out KIP-236, did you consider migrating the
> > reassignment state from zookeeper to an internal kafka topic, keyed by
> > partition and log compacted?
> >
> > It would allow an admin client and controller to easily subscribe for
> > those changes, without the need to extend the network protocol as
> > discussed in KIP-240.
> >
> > This is just a theoretical idea I wanted to share, as I can't find a
> > reason why it would be a stupid idea.
> > But I assume that in practice, this will imply too much change to the
> > code base to be viable.
> >
> >
> > Regards,
> >
> >
> >    Steven
> >
> >
> > 2017-12-18 11:49 GMT+01:00 Tom Bentley <t.j.bentley@gmail.com>:
> > > Hi Steven,
> > >
> > >> I think it would be useful to be able to subscribe yourself on updates of
> > >> reassignment changes.
> > >
> > >
> > > I agree this would be really useful, but, to the extent I understand the
> > > networking underpinnings of the admin client, it might be difficult to do
> > > well in practice. Part of the problem is that you might "set a watch" (to
> > > borrow the ZK terminology) via one broker (or the controller), only for
> > > that broker to fail (or the controller be re-elected). Obviously you can
> > > detect the loss of connection and set a new watch via a different broker
> > > (or the new controller), but that couldn't be transparent to the user,
> > > because the AdminClient doesn't know what changed while it was
> > > disconnected/not watching.
> > >
> > > Another issue is that to avoid races you really need to combine fetching
> > > the current state with setting the watch (as is done in the native
> > > ZooKeeper API). I think there are lots of subtle issues of this sort which
> > > would need to be addressed to make something reliable.
> > >
> > > In the meantime, ZooKeeper already has a (proven and mature) API for
> > > watches, so there is, in principle, a good workaround. I say "in principle"
> > > because in the KIP-236 proposal right now the /admin/reassign_partitions
> > > znode is legacy and the reassignment is represented by
> > > /admin/reassigments/$topic/$partition. That naming scheme for the znode
> > > would make it harder for ZooKeeper clients like yours because such clients
> > > would need to set a child watch per topic. The original proposal for the
> > > naming scheme was /admin/reassigments/$topic-$partition, which would mean
> > > clients like yours would need only 1 child watch. The advantage of
> > > /admin/reassigments/$topic/$partition is it scales better. I don't
> > > currently know how well ZooKeeper copes with nodes with many children, so
> > > it's difficult for me to weigh those two options, but I would be happy to
> > > switch back to /admin/reassigments/$topic-$partition if we could reassure
> > > ourselves it would scale OK to the reassignment sizes people would need in
> > > practice.
> > >
> > > Overall I would prefer not to tackle something like this in *this* KIP,
> > > though it could be something for a future KIP. Of course I'm happy to hear
> > > more discussion about this too!
> > >
> > > Cheers,
> > >
> > > Tom
> > >
> > >
> > > On 15 December 2017 at 18:51, Steven Aerts <steven.aerts@gmail.com> wrote:
> > >
> > >> Tom,
> > >>
> > >>
> > >> I think it would be useful to be able to subscribe yourself on updates of
> > >> reassignment changes.
> > >> Our internal kafka supervisor and monitoring tools are currently subscribed
> > >> to these changes in zookeeper so they can babysit our clusters.
> > >>
> > >> I think it would be nice if we could receive these events through the
> > >> adminclient.
> > >> In the api proposal, you can only poll for changes.
> > >>
> > >> No clue how difficult it would be to implement, maybe you can piggyback on
> > >> some version number in the repartition messages or on zookeeper.
> > >>
> > >> This is just an idea, not a must have feature for me.  We can always poll
> > >> over the proposed api.
> > >>
> > >>
> > >> Regards,
> > >>
> > >>
> > >>    Steven
> > >>
> > >>
> > >> On Fri, 15 Dec 2017 at 19:16, Tom Bentley <t.j.bentley@gmail.com> wrote:
> > >>
> > >> > Hi,
> > >> >
> > >> > KIP-236 lays the foundations for AdminClient APIs to do with partition
> > >> > reassignment. I'd now like to start discussing KIP-240, which adds APIs to
> > >> > the AdminClient to list and describe the current reassignments.
> > >> >
> > >> >
> > >> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-240%3A+AdminClient.listReassignments+AdminClient.describeReassignments
> > >> >
> > >> > Aside: I have fairly developed ideas for the API for starting a
> > >> > reassignment, but I intend to put that in a third KIP.
> > >> >
> > >> > Cheers,
> > >> >
> > >> > Tom
> > >> >
> > >>
> >
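
For the two znode naming schemes compared in the quoted thread, a rough sketch of how many child watches a ZooKeeper client would need in order to see every partition znode. The function name and the sample reassignments are hypothetical; the paths are the ones quoted above. ZooKeeper child watches only cover direct children, which is why the nested layout needs one watch per topic znode (a client would presumably also watch the root to learn about new topics, which is not counted here).

```python
def parents_to_watch(reassignments, nested):
    """Return the znodes needing a child watch to observe every partition znode."""
    root = "/admin/reassigments"
    if nested:
        # /admin/reassigments/$topic/$partition: partition znodes are children
        # of their topic znode, so each topic znode needs its own child watch.
        return {f"{root}/{topic}" for topic, _ in reassignments}
    # /admin/reassigments/$topic-$partition: every partition znode is a direct
    # child of the root, so a single child watch covers them all.
    return {root}


# Hypothetical in-flight reassignments as (topic, partition) pairs.
reassignments = [("orders", 0), ("orders", 1), ("payments", 0)]

print(len(parents_to_watch(reassignments, nested=True)))   # 2
print(len(parents_to_watch(reassignments, nested=False)))  # 1
```

The flip side, as noted in the thread, is that the flat layout puts every reassigning partition under one znode, so the size of that single child list grows with the whole reassignment.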
