kafka-dev mailing list archives

From Ismael Juma <isma...@gmail.com>
Subject Re: [DISCUSS] KIP-435: Incremental Partition Reassignment
Date Mon, 08 Apr 2019 00:08:47 GMT
Good discussion about where we should do batching. I think if there is a
clearly good way to batch, then it makes a lot of sense to just do it once.
However, if we think there is scope for experimenting with different
approaches, then an API that tools can use makes a lot of sense. They can
experiment and innovate. Eventually, we can integrate something into Kafka
if it makes sense.

Ismael

On Sun, Apr 7, 2019, 11:03 PM Colin McCabe <cmccabe@apache.org> wrote:

> Hi George,
>
> As Jason was saying, it seems like there are two directions we could go
> here: an external system handling batching, and the controller handling
> batching.  I think the controller handling batching would be better, since
> the controller has more information about the state of the system.  If the
> controller handles batching, then the controller could also handle things
> like setting up replication quotas for individual partitions.  The
> controller could do things like throttle replication down if the cluster
> was having problems.
>
> We kind of need to figure out which way we're going to go on this one
> before we set up big new APIs, I think.  If we want an external system to
> handle batching, then we can keep the idea that there is only one
> reassignment in progress at once.  If we want the controller to handle
> batching, we will need to get away from that idea.  Instead, we should just
> have a bunch of "ideal assignments" that we tell the controller about, and
> let it decide how to do the batching.  These ideal assignments could change
> continuously over time, so from the admin's point of view, there would be
> no start/stop/cancel, but just individual partition reassignments that we
> submit, perhaps over a long period of time.  Cancellation might then
> mean cancelling just that individual partition reassignment, not all
> partition reassignments.
>
> best,
> Colin
>
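
One way to picture the "ideal assignments" model Colin describes is the small Python sketch below. It is only an illustration under stated assumptions: the class `DesiredAssignments`, its methods, and the `max_in_flight` limit are hypothetical names invented here, not Kafka controller code or any Kafka API. Targets are submitted per partition, the scheduler decides how many to work on at once, and cancellation affects a single partition.

```python
class DesiredAssignments:
    """Hypothetical per-partition store of target replica lists."""

    def __init__(self, max_in_flight):
        self.max_in_flight = max_in_flight  # assumed batching limit
        self.pending = {}    # partition -> target replicas, in submission order
        self.in_flight = {}  # partition -> target replicas being applied

    def submit(self, partition, target_replicas):
        # A newer target replaces an older one that has not started yet.
        self.pending[partition] = list(target_replicas)

    def cancel(self, partition):
        # Cancels only this partition; other reassignments are untouched.
        was_pending = self.pending.pop(partition, None) is not None
        was_in_flight = self.in_flight.pop(partition, None) is not None
        return was_pending or was_in_flight

    def schedule(self):
        # Start pending reassignments until the in-flight limit is reached.
        started = []
        for partition in list(self.pending):
            if len(self.in_flight) >= self.max_in_flight:
                break
            if partition in self.in_flight:
                continue  # wait for the current move of this partition
            self.in_flight[partition] = self.pending.pop(partition)
            started.append(partition)
        return started

    def complete(self, partition):
        # Called when a reassignment has finished.
        self.in_flight.pop(partition, None)
```

With this shape there is no global start/stop: callers keep submitting targets over time and call `schedule()` whenever capacity may have freed up.
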
> On Fri, Apr 5, 2019, at 19:34, George Li wrote:
> >  Hi Jason / Viktor,
> >
> > For the URP (under-replicated partitions) during a reassignment: if the
> > "original_replicas" are kept for the current pending reassignment, I
> > think it will be very easy to compare them with the topic/partition's
> > ISR.  If all "original_replicas" are in the ISR, then URP should be 0
> > for that topic/partition.
> >
> > It would also be nice to separate the metrics MaxLag/TotalLag for
> > Reassignments. I think that will also require "original_replicas" (the
> > topic/partition's replicas just before the reassignment sets the AR
> > (Assigned Replicas) to Set(original_replicas) +
> > Set(new_replicas_in_reassign_partitions)).
> >
> > Thanks,
> > George
> >
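
The check George proposes can be sketched in a few lines of Python. This is a minimal illustration, not how Kafka computes its metric: the function names and the `"original_replicas"` / `"isr"` dictionary keys are assumptions made for the example.

```python
def is_urp_during_reassignment(original_replicas, isr):
    """True if any pre-reassignment replica is missing from the ISR.

    Replicas that were only added by the reassignment are ignored, so a
    partition whose new replicas are still catching up is not counted.
    """
    return not set(original_replicas).issubset(isr)


def urp_count(partitions):
    """Count under-replicated partitions.

    `partitions` maps a partition name to a dict with the hypothetical
    keys "original_replicas" and "isr" (lists of broker ids).
    """
    return sum(
        1
        for state in partitions.values()
        if is_urp_during_reassignment(state["original_replicas"], state["isr"])
    )
```

For example, with original replicas [1, 2, 3] being moved to [4, 5, 6], an ISR of [1, 2, 3] is not counted as under-replicated, while an ISR of [1, 2, 4] is.
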
> >     On Friday, April 5, 2019, 6:29:55 PM PDT, Jason Gustafson
> > <jason@confluent.io> wrote:
> >
> >  Hi Viktor,
> >
> > Thanks for writing this up. As far as questions about overlap with
> > KIP-236, I agree it seems mostly orthogonal. I think KIP-236 may have had
> > a larger initial scope, but now it focuses on cancellation and batching is
> > left for future work.
> >
> > With that said, I think we may not actually need a KIP for the current
> > proposal since it doesn't change any APIs. To make it more generally
> > useful, however, it would be nice to handle batching at the partition
> > level as well, as Jun suggests. The basic question is at what level the
> > batching should be determined. You could rely on external processes (e.g.
> > Cruise Control) or it could be built into the controller. There are
> > tradeoffs either way, but I think it simplifies such tools if it is
> > handled internally. Then it would be much safer to submit a larger
> > reassignment even just using the simple tools that come with Kafka.
> >
> > By the way, since you are looking into some of the reassignment logic,
> > another problem that we might want to address is the misleading way we
> > report URPs during a reassignment. I had a naive proposal for this
> > previously, but it didn't really work:
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-352%3A+Distinguish+URPs+caused+by+reassignment
> > Potentially fixing that could fall under this work as well if you think
> > it makes sense.
> >
> > Best,
> > Jason
> >
> > On Thu, Apr 4, 2019 at 4:49 PM Jun Rao <jun@confluent.io> wrote:
> >
> > > Hi, Viktor,
> > >
> > > Thanks for the KIP. A couple of comments below.
> > >
> > > 1. Another potential way to do reassignment incrementally is to move a
> > > batch of partitions at a time, instead of all partitions. This may lead
> > > to less data replication since, by the time the first batch of
> > > partitions has been completely moved, some data of the next batch may
> > > have been deleted due to retention and doesn't need to be replicated.
> > >
> > > 2. "Update CR in Zookeeper with TR for the given partition". Which ZK
> > > path is this for?
> > >
> > > Jun
> > >
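
Jun's first point, moving a batch of partitions at a time, can be sketched as a simple generator. The function name `batches` and the `batch_size` parameter are hypothetical; how a batch is actually submitted and awaited is left out.

```python
from itertools import islice


def batches(partitions, batch_size):
    """Yield successive lists of at most `batch_size` partitions."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(partitions)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch
```

A caller would move one batch, wait for it to complete, and only then start the next, so later batches are copied after retention has already trimmed some of their data.
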
> > > On Sat, Feb 23, 2019 at 2:12 AM Viktor Somogyi-Vass <viktorsomogyi@gmail.com> wrote:
> > >
> > > > Hi Harsha,
> > > >
> > > > As far as I understand, KIP-236 is about enabling reassignment
> > > > cancellation and, as a future plan, providing a queue of replica
> > > > reassignment steps to allow manual reassignment chains. While I agree
> > > > that the reassignment chain has a specific use case that allows
> > > > fine-grained control over the reassignment process, my proposal doesn't
> > > > cover cancellation; it only provides an automatic way to incrementalize
> > > > an arbitrary reassignment. I think this fits the general use case where
> > > > users don't want that level of control but would still like a balanced
> > > > way of reassigning. Therefore I think it's still relevant as an
> > > > improvement of the current algorithm.
> > > > Nevertheless, I'm happy to add my ideas to KIP-236, as I think it would
> > > > be a great improvement to Kafka.
> > > >
> > > > Cheers,
> > > > Viktor
> > > >
> > > > On Fri, Feb 22, 2019 at 5:05 PM Harsha <kafka@harsha.io> wrote:
> > > >
> > > > > Hi Viktor,
> > > > >            There is already KIP-236 for the same feature, and George
> > > > > made a PR for this as well.
> > > > > Let's consolidate these two discussions. If you have any cases that
> > > > > are not being solved by KIP-236, can you please mention them in that
> > > > > thread? We can address them as part of KIP-236.
> > > > >
> > > > > Thanks,
> > > > > Harsha
> > > > >
> > > > > On Fri, Feb 22, 2019, at 5:44 AM, Viktor Somogyi-Vass wrote:
> > > > > > Hi Folks,
> > > > > >
> > > > > > I've created a KIP about an improvement of the reassignment
> > > > > > algorithm we have. It aims to enable partition-wise incremental
> > > > > > reassignment. The motivation is to avoid the excess load that the
> > > > > > current replication algorithm implicitly carries: there are points
> > > > > > in the algorithm where both the new and old replica sets could be
> > > > > > online and replicating, which puts double (or almost double)
> > > > > > pressure on the brokers and could cause problems.
> > > > > > Instead, my proposal would slice this up into several steps, where
> > > > > > each step is calculated based on the final target replicas and the
> > > > > > current replica assignment, taking into account scenarios where
> > > > > > brokers could be offline and where there are not enough replicas to
> > > > > > fulfil the min.insync.replicas requirement.
> > > > > >
> > > > > > The link to the KIP:
> > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-435%3A+Incremental+Partition+Reassignment
> > > > > >
> > > > > > I'd be happy to receive any feedback.
> > > > > >
> > > > > > An important note is that this KIP and another one, KIP-236, which
> > > > > > is about interruptible reassignment
> > > > > > (https://cwiki.apache.org/confluence/display/KAFKA/KIP-236%3A+Interruptible+Partition+Reassignment),
> > > > > > should be compatible.
> > > > > >
> > > > > > Thanks,
> > > > > > Viktor
> > > > > >
> > > > >
> > > >
> > >
> >
>
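
The stepwise idea in Viktor's original message at the bottom of the thread can be illustrated with a simplified Python sketch. It is not the algorithm from KIP-435: the function `incremental_steps` is a hypothetical name, and the sketch assumes all brokers are online and ignores the min.insync.replicas handling the proposal mentions. Each round adds one target replica and then drops one old replica, so the replica set never grows by more than one.

```python
def incremental_steps(current, target):
    """Return the intermediate replica lists from `current` to `target`."""
    current = list(current)
    to_add = [b for b in target if b not in current]
    to_drop = [b for b in current if b not in target]
    steps = []
    while to_add or to_drop:
        if to_add:
            # Expand by one new replica; a real system would wait here
            # until that replica has caught up before continuing.
            current.append(to_add.pop(0))
            steps.append(list(current))
        if to_drop:
            # Shrink by one replica that is not part of the target.
            current.remove(to_drop.pop(0))
            steps.append(list(current))
    if current != list(target):
        # Same brokers but different order: finish with the target order.
        steps.append(list(target))
    return steps
```

Moving [1, 2, 3] to [4, 5, 6] this way passes through replica sets of at most four brokers, whereas adding all new replicas at once would have six replicating at the same time.
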
