kafka-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Konstantine Karantasis <konstant...@confluent.io>
Subject Re: [EXTERNAL] Incremental Cooperative Rebalancing
Date Tue, 16 Oct 2018 18:50:44 GMT
Matthias, Ryanne, Rhys, Guozhang, thank you all for your comments!

Ryanne, to try to address your specific comments, let me start by saying
that a key concept behind this proposal is the concept of overlapping
'communication' with 'computation', which is known for often reducing the
overall cost (or latency if you prefer) of an operation that involves
multiple processes compared to global barrier-type synchronization. Of
course this does not rule out that there might be occasions where
stop-the-world might incur smaller overall cost. But given that we'll
always want to minimize communication and shuffling of resources and apply
more sticky heuristics in assignments, I believe that such edge cases will
be considerably fewer in practice.

Specifically here, the goal is to exchange (by revoking and reassigning)
only the necessary resources in the group and allow for the unaffected
resources to continue being used. As mentioned in the motivation section,
this is expected to have a positive effect in a number of use cases, for
which stop-the-world is too strict. Given this key distinction between
affected and unaffected resources (e.g. topic partitions, tasks in Kafka
Connect etc) I anticipate that in most cases, even for resources that need
to change hands, the overall rebalance phase will be faster (especially at
larger scale) than it is today with all the processes participating in
resource hand-off and re-assignment.

-Konstantine


On Fri, Oct 5, 2018 at 12:00 PM Guozhang Wang <wangguoz@gmail.com> wrote:

> Hello Konstantine,
>
> Thanks for the great write-up! Here are a few quick comments I have about
> the proposals:
>
> 1. For "Kubernetes process death" and "Rolling bounce" case, there is
> another parallel work on KIP-345 [1] (cc'ed contributor) that is aimed to
> mitigate these two issues, but it is relying on the fact that we can
> disable sending "leave group" request immediately on shutting down. Ideally
> if KIP-345 works well for these cases, then Simple Cooperative Rebalancing
> itself along with KIP-345 should cover most of the scenarios we've
> described in the wiki. In addition, Delayed / Incremental Imbalance
> approach can be done incrementally on top of Simple approach, so execution
> wise I'd suggest we start with the Simple approach and observe how well it
> works in practice (especially with K8s etc frameworks) before deciding if
> we should go further and implemented the more complicated ones.
>
> 2. For the "events" section, I think it may worth mentioning if there are
> any new client / coordinator failure events that need to be addressed with
> the new protocol, as we listed in the original design [2] [3]. For example,
> what if the leader received different client or resource listings during
> two consecutive rebalances?
>
> 3. It's worth mentioning what are the key ideas in the updated protocol:
>
> 3.a) In the original protocol we require every member to revoke every
> resource before joining the group, which can then be used as the
> "synchronization barrier" and hence it does not matter for clients to
> receive assignment at different point in time; in the new protocol we do
> not require members to revoke everything, but instead leveraging on the
> leader who has the "global picture" to make sure that there are no
> conflicts between those shared resources, a.k.a as the synchronization
> barrier.
> 3.b) The new fields in the Assigned / RevokedPartitions fields in the
> responses are now "deltas" instead of "overwrites" to the consumers. Any
> modules relying on it, e.g. Streams who relies on ConsumerCoordinator,
> needs to adjust their code (PartitionAssignor) correspondingly to
> incorporate this semantic changes.
>
> 4. I've added a child page under yours for illustrating the implications
> for Streams on rebalance cost reduction [4], since for Streams one key
> characteristics is that standby tasks exist to help with rebalance incurred
> unavailability, and hence need to be considered upfront how Streams should
> leverage on the new protocol along with standby tasks to achieve the better
> operational goals during rebalances.
>
>
> [1]
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-345%3A+Reduce+multiple+consumer+rebalances+by+specifying+member+id
> [2]
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Client-side+Assignment+Proposal#KafkaClient-sideAssignmentProposal-CoordinatorStateMachine
> [3]
>
> https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design#Kafka0.9ConsumerRewriteDesign-Interestingscenariostoconsider
> [4]
>
> https://cwiki.apache.org/confluence/display/KAFKA/Incremental+Cooperative+Rebalancing+for+Streams
>
>
> On Thu, Oct 4, 2018 at 12:16 PM, McCaig, Rhys <Rhys_McCaig@comcast.com>
> wrote:
>
> > This is fantastic. Im really excited to see the work on this.
> >
> > > On Oct 2, 2018, at 4:22 PM, Konstantine Karantasis <
> > konstantine@confluent.io> wrote:
> > >
> > > Hey everyone,
> > >
> > > I'd like to bring to your attention a general design document that was
> > just
> > > published in Apache Kafka's wiki space:
> > >
> > > https://cwiki.apache.org/confluence/display/KAFKA/Incrementa
> > l+Cooperative+Rebalancing%3A+Support+and+Policies
> > >
> > > It deals with the subject of Rebalancing of groups in Kafka and
> proposes
> > > basic infrastructure to support improvements on the current rebalancing
> > > protocol as well as a set of policies that can be implemented to
> optimize
> > > rebalancing under a number of real-world scenarios.
> > >
> > > Currently, this wiki page is meant to serve as a reference to the
> > > proposition of Incremental Cooperative Rebalancing overall. Specific
> KIPs
> > > will follow in order to describe in more detail - using the standard
> KIP
> > > format - the basic infrastructure and the first policies that will be
> > > proposed for implementation in components such as Connect, the Kafka
> > > Consumer and Streams.
> > >
> > > Stay tuned!
> > > Konstantine
> >
> >
>
>
> --
> -- Guozhang
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message