kafka-dev mailing list archives

From Gwen Shapira <g...@confluent.io>
Subject Re: [Discuss] KIP-389: Enforce group.max.size to cap member metadata growth
Date Wed, 02 Jan 2019 18:59:49 GMT
Sorry for joining the fun late, but I think the problem we are solving
evolved a bit in the thread, and I'd like to have better understanding
of the problem before voting :)

Both KIP and discussion assert that large groups are a problem, but
they are kinda inconsistent regarding why they are a problem and whose
problem they are...
1. The KIP itself states that the main issue with large groups is
long rebalance times. Per my understanding, this is mostly a problem
for the application that consumes data, but not really a problem for
the brokers themselves, so broker admins probably don't and shouldn't
care about it. Also, my understanding is that this is a problem for
consumer groups, but not necessarily a problem for other group types.
2. The discussion highlights the issue of "runaway" groups that
essentially create tons of members needlessly and use up lots of
broker memory. This is something the broker admins will care about a
lot. And is also a problem for every group that uses coordinators and
not just consumers. And since the memory in question is the metadata
cache, it probably has the largest impact on Kafka Streams
applications, since they have lots of metadata.

The solution proposed makes the most sense in the context of #2, so
perhaps we should update the motivation section of the KIP to reflect
that.
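
To make #2 concrete, here is a minimal sketch of what the cap amounts to
on the coordinator side. It is illustrative Python, not broker code:
`GroupCoordinatorSketch`, `join`, and the `GROUP_FULL` result are
hypothetical names. The only things taken from the thread are a per-group
size limit and the convention that -1 disables it.

```python
from dataclasses import dataclass, field

DISABLED = -1  # per the thread, -1 means no cap is enforced


@dataclass
class GroupCoordinatorSketch:
    """Hypothetical stand-in for a coordinator tracking one group's members."""
    max_size: int = DISABLED
    members: set = field(default_factory=set)

    def join(self, member_id: str) -> str:
        # Existing members may always rejoin; only new members grow the group.
        if member_id in self.members:
            return "OK"
        # Reject new members once the cap is reached (if a cap is set).
        if self.max_size != DISABLED and len(self.members) >= self.max_size:
            return "GROUP_FULL"  # hypothetical error name
        self.members.add(member_id)
        return "OK"
```

With `max_size=2`, a third distinct member would be rejected while the
first two can keep rejoining, which bounds the member metadata the
coordinator has to hold for that group.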

The reason I'm probing here is that in my opinion we have to give our
users some guidelines on what a reasonable limit is (otherwise, how
will they know?). Calculating the impact of group-size on rebalance
time in order to make good recommendations will take a significant
effort. On the other hand, informing users regarding the memory
footprint of a consumer in a group and using that to make a reasonable
suggestion isn't hard.
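
As a rough illustration of that second calculation, the sketch below
turns a memory budget into a suggested group size. Every name and number
in it is an assumption chosen for the example (`budget_bytes`,
`bytes_per_member`, `group_count`); real per-member footprints would have
to be measured, since they depend on the metadata each member carries.

```python
def suggested_max_group_size(budget_bytes: int,
                             bytes_per_member: int,
                             group_count: int) -> int:
    """Largest per-group member count that keeps all groups within budget.

    Assumes every group on the coordinator grows to the cap and that each
    member costs roughly the same amount of metadata memory.
    """
    if budget_bytes <= 0 or bytes_per_member <= 0 or group_count <= 0:
        raise ValueError("all inputs must be positive")
    return budget_bytes // (bytes_per_member * group_count)


# Hypothetical inputs: 256 MiB budget, 64 KiB per member, 100 groups.
example_limit = suggested_max_group_size(256 * 1024 * 1024, 64 * 1024, 100)
```

The integer division rounds down on purpose, so the suggested limit never
exceeds the budget under the stated assumptions.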

Gwen


On Sun, Dec 30, 2018 at 12:51 PM Stanislav Kozlovski
<stanislav@confluent.io> wrote:
>
> Thanks Boyang,
>
> If there aren't any more thoughts on the KIP I'll start a vote thread in
> the new year
>
> On Sat, Dec 29, 2018 at 12:58 AM Boyang Chen <bchen11@outlook.com> wrote:
>
> > Yep Stanislav, that's what I'm proposing, and your explanation makes sense.
> >
> > Boyang
> >
> > ________________________________
> > From: Stanislav Kozlovski <stanislav@confluent.io>
> > Sent: Friday, December 28, 2018 7:59 PM
> > To: dev@kafka.apache.org
> > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> > metadata growth
> >
> > Hey there everybody, let's work on wrapping this discussion up.
> >
> > @Boyang, could you clarify what you mean by
> > > One more question is whether you feel we should enforce group size cap
> > statically or on runtime?
> > Is that related to the option of enabling this config via the dynamic
> > broker config feature?
> >
> > Regarding that - I feel it's useful to have and I also think it might not
> > introduce additional complexity. As long as we handle the config being
> > changed midway through a rebalance (via using the old value) we should be
> > good to go.
> >
> > On Wed, Dec 12, 2018 at 4:12 PM Stanislav Kozlovski <
> > stanislav@confluent.io>
> > wrote:
> >
> > > Hey Jason,
> > >
> > > Yes, that is what I meant by
> > > > Given those constraints, I think that we can simply mark the group as
> > > `PreparingRebalance` with a rebalanceTimeout of the server setting `
> > > group.max.session.timeout.ms`. That's a bit long by default (5 minutes)
> > > but I can't seem to come up with a better alternative
> > > So either the timeout or all members calling joinGroup, yes
> > >
> > >
> > > On Tue, Dec 11, 2018 at 8:14 PM Boyang Chen <bchen11@outlook.com> wrote:
> > >
> > >> Hey Jason,
> > >>
> > >> I think this is the correct understanding. One more question is whether
> > >> you feel
> > >> we should enforce group size cap statically or on runtime?
> > >>
> > >> Boyang
> > >> ________________________________
> > >> From: Jason Gustafson <jason@confluent.io>
> > >> Sent: Tuesday, December 11, 2018 3:24 AM
> > >> To: dev
> > >> Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> > >> metadata growth
> > >>
> > >> Hey Stanislav,
> > >>
> > >> Just to clarify, I think what you're suggesting is something like this
> > in
> > >> order to gracefully shrink the group:
> > >>
> > >> 1. Transition the group to PREPARING_REBALANCE. No members are kicked
> > out.
> > >> 2. Continue to allow offset commits and heartbeats for all current
> > >> members.
> > >> 3. Allow the first n members that send JoinGroup to stay in the group,
> > but
> > >> wait for the JoinGroup (or session timeout) from all active members
> > before
> > >> finishing the rebalance.
> > >>
> > >> So basically we try to give the current members an opportunity to finish
> > >> work, but we prevent some of them from rejoining after the rebalance
> > >> completes. It sounds reasonable if I've understood correctly.
> > >>
> > >> Thanks,
> > >> Jason
> > >>
> > >>
> > >>
> > >> On Fri, Dec 7, 2018 at 6:47 AM Boyang Chen <bchen11@outlook.com> wrote:
> > >>
> > >> > Yep, LGTM on my side. Thanks Stanislav!
> > >> > ________________________________
> > >> > From: Stanislav Kozlovski <stanislav@confluent.io>
> > >> > Sent: Friday, December 7, 2018 8:51 PM
> > >> > To: dev@kafka.apache.org
> > >> > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> > >> > metadata growth
> > >> >
> > >> > Hi,
> > >> >
> > >> > We discussed this offline with Boyang and figured that it's best to
> > not
> > >> > wait on the Cooperative Rebalancing proposal. Our thinking is that we
> > >> can
> > >> > just force a rebalance from the broker, allowing consumers to commit
> > >> > offsets if their rebalanceListener is configured correctly.
> > >> > When rebalancing improvements are implemented, we assume that they
> > would
> > >> > improve KIP-389's behavior as well as the normal rebalance scenarios
> > >> >
> > >> > On Wed, Dec 5, 2018 at 12:09 PM Boyang Chen <bchen11@outlook.com>
> > >> wrote:
> > >> >
> > >> > > Hey Stanislav,
> > >> > >
> > >> > > thanks for the question! `Trivial rebalance` means "we don't start
> > >> > > reassignment right now, but you need to know it's coming soon
> > >> > > and you should start preparation".
> > >> > >
> > >> > > An example KStream use case is that before actually starting to
> > shrink
> > >> > the
> > >> > > consumer group, we need to
> > >> > > 1. partition the consumer group into two subgroups, where one will
> > be
> > >> > > offline soon and the other will keep serving;
> > >> > > 2. make sure the states associated with near-future offline
> > consumers
> > >> are
> > >> > > successfully replicated on the serving ones.
> > >> > >
> > >> > > As I have mentioned shrinking the consumer group is pretty much
> > >> > equivalent
> > >> > > to group scaling down, so we could think of this
> > >> > > as an add-on use case for cluster scaling. So my understanding is
> > that
> > >> > the
> > >> > > KIP-389 could be sequenced within our cooperative rebalancing<
> > >> > >
> > >> >
> > >>
> > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FIncremental%2BCooperative%2BRebalancing%253A%2BSupport%2Band%2BPolicies&amp;data=02%7C01%7C%7Cb603e099d6c744d8fac708d65ed51d03%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636800666735874264&amp;sdata=BX4DHEX1OMgfVuBOREwSjiITu5aV83Q7NAz77w4avVc%3D&amp;reserved=0
> > >> > > >
> > >> > > proposal.
> > >> > >
> > >> > > Let me know if this makes sense.
> > >> > >
> > >> > > Best,
> > >> > > Boyang
> > >> > > ________________________________
> > >> > > From: Stanislav Kozlovski <stanislav@confluent.io>
> > >> > > Sent: Wednesday, December 5, 2018 5:52 PM
> > >> > > To: dev@kafka.apache.org
> > >> > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap member
> > >> > > metadata growth
> > >> > >
> > >> > > Hey Boyang,
> > >> > >
> > >> > > I think we still need to take care of group shrinkage because even
> > if
> > >> > users
> > >> > > change the config value we cannot guarantee that all consumer groups
> > >> > would
> > >> > > have been manually shrunk.
> > >> > >
> > >> > > Regarding 2., I agree that forcefully triggering a rebalance might
> > be
> > >> the
> > >> > > most intuitive way to handle the situation.
> > >> > > What does a "trivial rebalance" mean? Sorry, I'm not familiar with
> > the
> > >> > > term.
> > >> > > I was thinking that maybe we could force a rebalance, which would
> > >> cause
> > >> > > consumers to commit their offsets (given their rebalanceListener is
> > >> > > configured correctly) and subsequently reject some of the incoming
> > >> > > `joinGroup` requests. Does that sound like it would work?
> > >> > >
> > >> > > On Wed, Dec 5, 2018 at 1:13 AM Boyang Chen <bchen11@outlook.com>
> > >> wrote:
> > >> > >
> > >> > > > Hey Stanislav,
> > >> > > >
> > >> > > > I read the latest KIP and saw that we already changed the default
> > >> value
> > >> > > to
> > >> > > > -1. Do
> > >> > > > we still need to take care of the consumer group shrinking when
> > >> doing
> > >> > the
> > >> > > > upgrade?
> > >> > > >
> > >> > > > However, this is an interesting topic that is worth discussing.
> > Although
> > >> > > > rolling
> > >> > > > upgrade is fine, `consumer.group.max.size` could always have
> > >> conflict
> > >> > > with
> > >> > > > the current
> > >> > > > consumer group size which means we need to adhere to one source of
> > >> > truth.
> > >> > > >
> > >> > > > 1.Choose the current group size, which means we never interrupt
> > the
> > >> > > > consumer group until
> > >> > > > it transits to PREPARE_REBALANCE. And we keep track of how many
> > join
> > >> > > group
> > >> > > > requests
> > >> > > > we have seen so far during PREPARE_REBALANCE. After reaching the
> > >> > consumer
> > >> > > > cap,
> > >> > > > we start to inform over provisioned consumers that you should send
> > >> > > > LeaveGroupRequest and
> > >> > > > fail yourself. Or with what Mayuresh proposed in KIP-345, we could
> > >> mark
> > >> > > > extra members
> > >> > > > as hot backup and rebalance without them.
> > >> > > >
> > >> > > > 2.Choose the `consumer.group.max.size`. I feel incremental
> > >> rebalancing
> > >> > > > (you proposed) could be of help here.
> > >> > > > When a new cap is enforced, leader should be notified. If the
> > >> current
> > >> > > > group size is already over limit, leader
> > >> > > > shall trigger a trivial rebalance to shuffle some topic partitions
> > >> and
> > >> > > let
> > >> > > > a subset of consumers prepare the ownership
> > >> > > > transition. Until they are ready, we trigger a real rebalance to
> > >> remove
> > >> > > > over-provisioned consumers. It is pretty much
> > >> > > > equivalent to `how do we scale down the consumer group without
> > >> > > > interrupting the current processing`.
> > >> > > >
> > >> > > > I personally feel inclined to 2 because we could kill two birds
> > with
> > >> > one
> > >> > > > stone in a generic way. What do you think?
> > >> > > >
> > >> > > > Boyang
> > >> > > > ________________________________
> > >> > > > From: Stanislav Kozlovski <stanislav@confluent.io>
> > >> > > > Sent: Monday, December 3, 2018 8:35 PM
> > >> > > > To: dev@kafka.apache.org
> > >> > > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size to cap
> > member
> > >> > > > metadata growth
> > >> > > >
> > >> > > > Hi Jason,
> > >> > > >
> > >> > > > > 2. Do you think we should make this a dynamic config?
> > >> > > > I'm not sure. Looking at the config from the perspective of a
> > >> > > prescriptive
> > >> > > > config, we may get away with not updating it dynamically.
> > >> > > > But in my opinion, it always makes sense to have a config be
> > >> > dynamically
> > >> > > > configurable. As long as we limit it to being a cluster-wide
> > >> config, we
> > >> > > > should be fine.
> > >> > > >
> > >> > > > > 1. I think it would be helpful to clarify the details on how the
> > >> > > > coordinator will shrink the group. It will need to choose which
> > >> members
> > >> > > to
> > >> > > > remove. Are we going to give current members an opportunity to
> > >> commit
> > >> > > > offsets before kicking them from the group?
> > >> > > >
> > >> > > > This turns out to be somewhat tricky. I think that we may not be
> > >> able
> > >> > to
> > >> > > > guarantee that consumers don't process a message twice.
> > >> > > > My initial approach was to do as much as we could to let consumers
> > >> > commit
> > >> > > > offsets.
> > >> > > >
> > >> > > > I was thinking that we mark a group to be shrunk, we could keep a
> > >> map
> > >> > of
> > >> > > > consumer_id->boolean indicating whether they have committed
> > >> offsets. I
> > >> > > then
> > >> > > > thought we could delay the rebalance until every consumer commits
> > >> (or
> > >> > > some
> > >> > > > time passes).
> > >> > > > In the meantime, we would block all incoming fetch calls (by
> > either
> > >> > > > returning empty records or a retriable error) and we would
> > continue
> > >> to
> > >> > > > accept offset commits (even twice for a single consumer)
> > >> > > >
> > >> > > > I see two problems with this approach:
> > >> > > > * We have async offset commits, which implies that we can receive
> > >> fetch
> > >> > > > requests before the offset commit req has been handled. i.e
> > consumer
> > >> > sends
> > >> > > > fetchReq A, offsetCommit B, fetchReq C - we may receive A,C,B in
> > the
> > >> > > > broker. Meaning we could have saved the offsets for B but
> > rebalance
> > >> > > before
> > >> > > > the offsetCommit for the offsets processed in C come in.
> > >> > > > * KIP-392 Allow consumers to fetch from closest replica
> > >> > > > <
> > >> > > >
> > >> > >
> > >> >
> > >>
> > https://nam04.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcwiki.apache.org%2Fconfluence%2Fdisplay%2FKAFKA%2FKIP-392%253A%2BAllow%2Bconsumers%2Bto%2Bfetch%2Bfrom%2Bclosest%2Breplica&amp;data=02%7C01%7C%7Cb603e099d6c744d8fac708d65ed51d03%7C84df9e7fe9f640afb435aaaaaaaaaaaa%7C1%7C0%7C636800666735874264&amp;sdata=bekXj%2FVdA6flZWQ70%2BSEyHm31%2F2WyWO1EpbvqyjWFJw%3D&amp;reserved=0
> > >> > > > >
> > >> > > > would
> > >> > > > make it significantly harder to block poll() calls on consumers
> > >> whose
> > >> > > > groups are being shrunk. Even if we implemented a solution, the
> > same
> > >> > race
> > >> > > > condition noted above seems to apply and probably others
> > >> > > >
> > >> > > >
> > >> > > > Given those constraints, I think that we can simply mark the group
> > >> as
> > >> > > > `PreparingRebalance` with a rebalanceTimeout of the server
> > setting `
> > >> > > > group.max.session.timeout.ms`. That's a bit long by default (5
> > >> > minutes)
> > >> > > > but
> > >> > > > I can't seem to come up with a better alternative
> > >> > > >
> > >> > > > I'm interested in hearing your thoughts.
> > >> > > >
> > >> > > > Thanks,
> > >> > > > Stanislav
> > >> > > >
> > >> > > > On Fri, Nov 30, 2018 at 8:38 AM Jason Gustafson <
> > jason@confluent.io
> > >> >
> > >> > > > wrote:
> > >> > > >
> > >> > > > > Hey Stanislav,
> > >> > > > >
> > >> > > > > What do you think about the use case I mentioned in my previous
> > >> reply
> > >> > > > about
> > >> > > > > > a more resilient self-service Kafka? I believe the benefit
> > >> there is
> > >> > > > > bigger.
> > >> > > > >
> > >> > > > >
> > >> > > > > I see this config as analogous to the open file limit. Probably
> > >> this
> > >> > > > limit
> > >> > > > > was intended to be prescriptive at some point about what was
> > >> deemed a
> > >> > > > > reasonable number of open files for an application. But mostly
> > >> people
> > >> > > > treat
> > >> > > > > it as an annoyance which they have to work around. If it happens
> > >> to
> > >> > be
> > >> > > > hit,
> > >> > > > > usually you just increase it because it is not tied to an actual
> > >> > > resource
> > >> > > > > constraint. However, occasionally hitting the limit does
> > indicate
> > >> an
> > >> > > > > application bug such as a leak, so I wouldn't say it is useless.
> > >> > > > Similarly,
> > >> > > > > the issue in KAFKA-7610 was a consumer leak and having this
> > limit
> > >> > would
> > >> > > > > have allowed the problem to be detected before it impacted the
> > >> > cluster.
> > >> > > > To
> > >> > > > > me, that's the main benefit. It's possible that it could be used
> > >> > > > > prescriptively to prevent poor usage of groups, but like the
> > open
> > >> > file
> > >> > > > > limit, I suspect administrators will just set it large enough
> > that
> > >> > > users
> > >> > > > > are unlikely to complain.
> > >> > > > >
> > >> > > > > Anyway, just a couple additional questions:
> > >> > > > >
> > >> > > > > 1. I think it would be helpful to clarify the details on how the
> > >> > > > > coordinator will shrink the group. It will need to choose which
> > >> > members
> > >> > > > to
> > >> > > > > remove. Are we going to give current members an opportunity to
> > >> commit
> > >> > > > > offsets before kicking them from the group?
> > >> > > > >
> > >> > > > > 2. Do you think we should make this a dynamic config?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Jason
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Wed, Nov 28, 2018 at 2:42 AM Stanislav Kozlovski <
> > >> > > > > stanislav@confluent.io>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > Hi Jason,
> > >> > > > > >
> > >> > > > > > You raise some very valid points.
> > >> > > > > >
> > >> > > > > > > The benefit of this KIP is probably limited to preventing
> > >> > "runaway"
> > >> > > > > > consumer groups due to leaks or some other application bug
> > >> > > > > > What do you think about the use case I mentioned in my
> > previous
> > >> > reply
> > >> > > > > about
> > >> > > > > > a more resilient self-service Kafka? I believe the benefit
> > >> there is
> > >> > > > > bigger
> > >> > > > > >
> > >> > > > > > * Default value
> > >> > > > > > You're right, we probably do need to be conservative. Big
> > >> consumer
> > >> > > > groups
> > >> > > > > > are considered an anti-pattern and my goal was to also hint at
> > >> this
> > >> > > > > through
> > >> > > > > > the config's default. Regardless, it is better to not have the
> > >> > > > potential
> > >> > > > > to
> > >> > > > > > break applications with an upgrade.
> > >> > > > > > Choosing between the default of something big like 5000 or an
> > >> > opt-in
> > >> > > > > > option, I think we should go with the *disabled default
> > option*
> > >> > > (-1).
> > >> > > > > > The only benefit we would get from a big default of 5000 is
> > >> default
> > >> > > > > > protection against buggy/malicious applications that hit the
> > >> > > KAFKA-7610
> > >> > > > > > issue.
> > >> > > > > > While this KIP was spawned from that issue, I believe its
> > value
> > >> is
> > >> > > > > enabling
> > >> > > > > > the possibility of protection and helping move towards a more
> > >> > > > > self-service
> > >> > > > > > Kafka. I also think that a default value of 5000 might be
> > >> > misleading
> > >> > > to
> > >> > > > > > users and lead them to think that big consumer groups (> 250)
> > >> are a
> > >> > > > good
> > >> > > > > > thing.
> > >> > > > > >
> > >> > > > > > The good news is that KAFKA-7610 should be fully resolved and
> > >> the
> > >> > > > > rebalance
> > >> > > > > > protocol should, in general, be more solid after the planned
> > >> > > > improvements
> > >> > > > > > in KIP-345 and KIP-394.
> > >> > > > > >
> > >> > > > > > * Handling bigger groups during upgrade
> > >> > > > > > I now see that we store the state of consumer groups in the
> > log
> > >> and
> > >> > > > why a
> > >> > > > > > rebalance isn't expected during a rolling upgrade.
> > >> > > > > > Since we're going with the default value of the max.size being
> > >> > > > disabled,
> > >> > > > > I
> > >> > > > > > believe we can afford to be more strict here.
> > >> > > > > > During state reloading of a new Coordinator with a defined
> > >> > > > max.group.size
> > >> > > > > > config, I believe we should *force* rebalances for groups that
> > >> > exceed
> > >> > > > the
> > >> > > > > > configured size. Then, only some consumers will be able to
> > join
> > >> and
> > >> > > the
> > >> > > > > max
> > >> > > > > > size invariant will be satisfied.
> > >> > > > > >
> > >> > > > > > I updated the KIP with a migration plan, rejected alternatives
> > >> and
> > >> > > the
> > >> > > > > new
> > >> > > > > > default value.
> > >> > > > > >
> > >> > > > > > Thanks,
> > >> > > > > > Stanislav
> > >> > > > > >
> > >> > > > > > On Tue, Nov 27, 2018 at 5:25 PM Jason Gustafson <
> > >> > jason@confluent.io>
> > >> > > > > > wrote:
> > >> > > > > >
> > >> > > > > > > Hey Stanislav,
> > >> > > > > > >
> > >> > > > > > > Clients will then find that coordinator
> > >> > > > > > > > and send `joinGroup` on it, effectively rebuilding the
> > >> group,
> > >> > > since
> > >> > > > > the
> > >> > > > > > > > cache of active consumers is not stored outside the
> > >> > Coordinator's
> > >> > > > > > memory.
> > >> > > > > > > > (please do say if that is incorrect)
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > Groups do not typically rebalance after a coordinator
> > change.
> > >> You
> > >> > > > could
> > >> > > > > > > potentially force a rebalance if the group is too big and
> > kick
> > >> > out
> > >> > > > the
> > >> > > > > > > slowest members or something. A more graceful solution is
> > >> > probably
> > >> > > to
> > >> > > > > > just
> > >> > > > > > > accept the current size and prevent it from getting bigger.
> > We
> > >> > > could
> > >> > > > > log
> > >> > > > > > a
> > >> > > > > > > warning potentially.
> > >> > > > > > >
> > >> > > > > > > My thinking is that we should abstract away from conserving
> > >> > > resources
> > >> > > > > and
> > >> > > > > > > > focus on giving control to the broker. The issue that
> > >> spawned
> > >> > > this
> > >> > > > > KIP
> > >> > > > > > > was
> > >> > > > > > > > a memory problem but I feel this change is useful in a
> > more
> > >> > > general
> > >> > > > > > way.
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > So you probably already know why I'm asking about this. For
> > >> > > consumer
> > >> > > > > > groups
> > >> > > > > > > anyway, resource usage would typically be proportional to
> > the
> > >> > > number
> > >> > > > of
> > >> > > > > > > partitions that a group is reading from and not the number
> > of
> > >> > > > members.
> > >> > > > > > For
> > >> > > > > > > example, consider the memory use in the offsets cache. The
> > >> > benefit
> > >> > > of
> > >> > > > > > this
> > >> > > > > > > KIP is probably limited to preventing "runaway" consumer
> > >> groups
> > >> > due
> > >> > > > to
> > >> > > > > > > leaks or some other application bug. That still seems useful
> > >> > > though.
> > >> > > > > > >
> > >> > > > > > > I completely agree with this and I *ask everybody to chime
> > in
> > >> > with
> > >> > > > > > opinions
> > >> > > > > > > > on a sensible default value*.
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > I think we would have to be very conservative. The group
> > >> protocol
> > >> > > is
> > >> > > > > > > generic in some sense, so there may be use cases we don't
> > >> know of
> > >> > > > where
> > >> > > > > > > larger groups are reasonable. Probably we should make this
> > an
> > >> > > opt-in
> > >> > > > > > > feature so that we do not risk breaking anyone's application
> > >> > after
> > >> > > an
> > >> > > > > > > upgrade. Either that, or use a very high default like 5,000.
> > >> > > > > > >
> > >> > > > > > > Thanks,
> > >> > > > > > > Jason
> > >> > > > > > >
> > >> > > > > > > On Tue, Nov 27, 2018 at 3:27 AM Stanislav Kozlovski <
> > >> > > > > > > stanislav@confluent.io>
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Hey Jason and Boyang, those were important comments
> > >> > > > > > > >
> > >> > > > > > > > > One suggestion I have is that it would be helpful to put
> > >> your
> > >> > > > > > reasoning
> > >> > > > > > > > on deciding the current default value. For example, in
> > >> certain
> > >> > > use
> > >> > > > > > cases
> > >> > > > > > > at
> > >> > > > > > > > Pinterest we are very likely to have more consumers than
> > 250
> > >> > when
> > >> > > > we
> > >> > > > > > > > configure 8 stream instances with 32 threads.
> > >> > > > > > > > > For the effectiveness of this KIP, we should encourage
> > >> people
> > >> > > to
> > >> > > > > > > discuss
> > >> > > > > > > > their opinions on the default setting and ideally reach a
> > >> > > > consensus.
> > >> > > > > > > >
> > >> > > > > > > > I completely agree with this and I *ask everybody to chime
> > >> in
> > >> > > with
> > >> > > > > > > opinions
> > >> > > > > > > > on a sensible default value*.
> > >> > > > > > > > My thought process was that in the current model
> > rebalances
> > >> in
> > >> > > > large
> > >> > > > > > > groups
> > >> > > > > > > > are more costly. I imagine most use cases in most Kafka
> > >> users
> > >> > do
> > >> > > > not
> > >> > > > > > > > require more than 250 consumers.
> > >> > > > > > > > Boyang, you say that you are "likely to have... when
> > we..."
> > >> -
> > >> > do
> > >> > > > you
> > >> > > > > > have
> > >> > > > > > > > systems running with so many consumers in a group or are
> > you
> > >> > > > planning
> > >> > > > > > > to? I
> > >> > > > > > > > guess what I'm asking is whether this has been tested in
> > >> > > production
> > >> > > > > > with
> > >> > > > > > > > the current rebalance model (ignoring KIP-345)
> > >> > > > > > > >
> > >> > > > > > > > >  Can you clarify the compatibility impact here? What
> > >> > > > > > > > > will happen to groups that are already larger than the
> > max
> > >> > > size?
> > >> > > > > > > > This is a very important question.
> > >> > > > > > > > From my current understanding, when a coordinator broker
> > >> gets
> > >> > > shut
> > >> > > > > > > > down during a cluster rolling upgrade, a replica will take
> > >> > > > leadership
> > >> > > > > > of
> > >> > > > > > > > the `__consumer_offsets` partition. Clients will then find
> > >> that
> > >> > > > > > coordinator
> > >> > > > > > > > and send `joinGroup` on it, effectively rebuilding the
> > >> group,
> > >> > > since
> > >> > > > > the
> > >> > > > > > > > cache of active consumers is not stored outside the
> > >> > Coordinator's
> > >> > > > > > memory.
> > >> > > > > > > > (please do say if that is incorrect)
> > >> > > > > > > > Then, I believe that working as if this is a new group is
> > a
> > >> > > > > reasonable
> > >> > > > > > > > approach. Namely, fail joinGroups when the max.size is
> > >> > exceeded.
> > >> > > > > > > > What do you guys think about this? (I'll update the KIP
> > >> after
> > >> > we
> > >> > > > > settle
> > >> > > > > > > on
> > >> > > > > > > > a solution)
> > >> > > > > > > >
> > >> > > > > > > > >  Also, just to be clear, the resource we are trying to
> > >> > conserve
> > >> > > > > here
> > >> > > > > > is
> > >> > > > > > > > what? Memory?
> > >> > > > > > > > My thinking is that we should abstract away from
> > conserving
> > >> > > > resources
> > >> > > > > > and
> > >> > > > > > > > focus on giving control to the broker. The issue that
> > >> spawned
> > >> > > this
> > >> > > > > KIP
> > >> > > > > > > was
> > >> > > > > > > > a memory problem but I feel this change is useful in a
> > more
> > >> > > general
> > >> > > > > > way.
> > >> > > > > > > It
> > >> > > > > > > > limits the control clients have on the cluster and helps
> > >> Kafka
> > >> > > > > become a
> > >> > > > > > > > more self-serving system. Admin/Ops teams can better
> > control
> > >> > the
> > >> > > > > impact
> > >> > > > > > > > application developers can have on a Kafka cluster with
> > this
> > >> > > change
> > >> > > > > > > >
> > >> > > > > > > > Best,
> > >> > > > > > > > Stanislav
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > On Mon, Nov 26, 2018 at 8:00 PM Jason Gustafson <
> > >> > > > jason@confluent.io>
> > >> > > > > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Hi Stanislav,
> > >> > > > > > > > >
> > >> > > > > > > > > Thanks for the KIP. Can you clarify the compatibility
> > >> impact
> > >> > > > here?
> > >> > > > > > What
> > >> > > > > > > > > will happen to groups that are already larger than the
> > max
> > >> > > size?
> > >> > > > > > Also,
> > >> > > > > > > > just
> > >> > > > > > > > > to be clear, the resource we are trying to conserve here
> > >> is
> > >> > > what?
> > >> > > > > > > Memory?
> > >> > > > > > > > >
> > >> > > > > > > > > -Jason
> > >> > > > > > > > >
> > >> > > > > > > > > On Mon, Nov 26, 2018 at 2:44 AM Boyang Chen <
> > >> > > bchen11@outlook.com
> > >> > > > >
> > >> > > > > > > wrote:
> > >> > > > > > > > >
> > >> > > > > > > > > > Thanks Stanislav for the update! One suggestion I have
> > >> is
> > >> > > that
> > >> > > > it
> > >> > > > > > > would
> > >> > > > > > > > > be
> > >> > > > > > > > > > helpful to put your
> > >> > > > > > > > > >
> > >> > > > > > > > > > reasoning on deciding the current default value. For
> > >> > example,
> > >> > > > in
> > >> > > > > > > > certain
> > >> > > > > > > > > > use cases at Pinterest we are very likely
> > >> > > > > > > > > >
> > >> > > > > > > > > > to have more consumers than 250 when we configure 8
> > >> stream
> > >> > > > > > instances
> > >> > > > > > > > with
> > >> > > > > > > > > > 32 threads.
> > >> > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > > > For the effectiveness of this KIP, we should encourage
> > >> > people
> > >> > > > to
> > >> > > > > > > > discuss
> > >> > > > > > > > > > their opinions on the default setting and ideally
> > reach
> > >> a
> > >> > > > > > consensus.
> > >> > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > > > Best,
> > >> > > > > > > > > >
> > >> > > > > > > > > > Boyang
> > >> > > > > > > > > >
> > >> > > > > > > > > > ________________________________
> > >> > > > > > > > > > From: Stanislav Kozlovski <stanislav@confluent.io>
> > >> > > > > > > > > > Sent: Monday, November 26, 2018 6:14 PM
> > >> > > > > > > > > > To: dev@kafka.apache.org
> > >> > > > > > > > > > Subject: Re: [Discuss] KIP-389: Enforce group.max.size
> > >> to
> > >> > cap
> > >> > > > > > member
> > >> > > > > > > > > > metadata growth
> > >> > > > > > > > > >
> > >> > > > > > > > > > Hey everybody,
> > >> > > > > > > > > >
> > >> > > > > > > > > > It's been a week since this KIP and not much
> > discussion
> > >> has
> > >> > > > been
> > >> > > > > > > made.
> > >> > > > > > > > > > I assume that this is a straightforward change and I
> > >> will
> > >> > > > open a
> > >> > > > > > > > voting
> > >> > > > > > > > > > thread in the next couple of days if nobody has
> > >> anything to
> > >> > > > > > suggest.
> > >> > > > > > > > > >
> > >> > > > > > > > > > Best,
> > >> > > > > > > > > > Stanislav
> > >> > > > > > > > > >
> > >> > > > > > > > > > On Thu, Nov 22, 2018 at 12:56 PM Stanislav Kozlovski <
> > >> > > > > > > > > > stanislav@confluent.io>
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > >
> > >> > > > > > > > > > > Greetings everybody,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > I have enriched the KIP a bit with a bigger
> > Motivation
> > >> > > > section
> > >> > > > > > and
> > >> > > > > > > > also
> > >> > > > > > > > > > > renamed it.
> > >> > > > > > > > > > > KIP:
> > >> > > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-389%3A+Introduce+a+configurable+consumer+group+size+limit
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > I'm looking forward to discussions around it.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Best,
> > >> > > > > > > > > > > Stanislav
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > On Tue, Nov 20, 2018 at 1:47 PM Stanislav Kozlovski
> > <
> > >> > > > > > > > > > > stanislav@confluent.io> wrote:
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >> Hey there everybody,
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> Thanks for the introduction Boyang. I appreciate
> > the
> > >> > > effort
> > >> > > > > you
> > >> > > > > > > are
> > >> > > > > > > > > > >> putting into improving consumer behavior in Kafka.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> @Matt
> > >> > > > > > > > > > >> I also believe the default value is high. In my
> > >> opinion,
> > >> > > we
> > >> > > > > > should
> > >> > > > > > > > aim
> > >> > > > > > > > > > for
> > >> > > > > > > > > > >> a default cap around 250. This is because in the
> > >> current
> > >> > > > model
> > >> > > > > > any
> > >> > > > > > > > > > consumer
> > >> > > > > > > > > > >> rebalance is disruptive to every consumer. The
> > bigger
> > >> > the
> > >> > > > > group,
> > >> > > > > > > the
> > >> > > > > > > > > > longer
> > >> > > > > > > > > > >> this period of disruption.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> If you have such a large consumer group, chances
> > are
> > >> > that
> > >> > > > your
> > >> > > > > > > > > > >> client-side logic could be structured better and
> > that
> > >> > you
> > >> > > > are
> > >> > > > > > not
> > >> > > > > > > > > using
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> high number of consumers to achieve high
> > throughput.
> > >> > > > > > > > > > >> 250 can still be considered a high upper bound,
> > I
> > >> > > believe
> > >> > > > > in
> > >> > > > > > > > > practice
> > >> > > > > > > > > > >> users should aim to not go over 100 consumers per
> > >> > consumer
> > >> > > > > > group.
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> In regards to the cap being global/per-broker, I
> > >> think
> > >> > > that
> > >> > > > we
> > >> > > > > > > > should
> > >> > > > > > > > > > >> consider whether we want it to be global or
> > >> *per-topic*.
> > >> > > For
> > >> > > > > the
> > >> > > > > > > > time
> > >> > > > > > > > > > >> being, I believe that having it per-topic with a
> > >> global
> > >> > > > > default
> > >> > > > > > > > might
> > >> > > > > > > > > be
> > >> > > > > > > > > > >> the best situation. Having it global only seems a
> > bit
> > >> > > > > > restricting
> > >> > > > > > > to
> > >> > > > > > > > > me
> > >> > > > > > > > > > and
> > >> > > > > > > > > > >> it never hurts to support more fine-grained
> > >> > > configurability
> > >> > > > > > (given
> > >> > > > > > > > > it's
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >> same config, not a new one being introduced).
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> On Tue, Nov 20, 2018 at 11:32 AM Boyang Chen <
> > >> > > > > > bchen11@outlook.com
> > >> > > > > > > >
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >>> Thanks Matt for the suggestion! I'm still open to
> > >> any
> > >> > > > > > suggestion
> > >> > > > > > > to
> > >> > > > > > > > > > >>> change the default value. Meanwhile I just want to
> > >> > point
> > >> > > > out
> > >> > > > > > that
> > >> > > > > > > > > this
> > >> > > > > > > > > > >>> value is just a last line of defense, not a real
> > >> > scenario
> > >> > > > we
> > >> > > > > > > would
> > >> > > > > > > > > > expect.
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> In the meanwhile, I discussed with Stanislav and
> > he
> > >> > would
> > >> > > > be
> > >> > > > > > > > driving
> > >> > > > > > > > > > the
> > >> > > > > > > > > > >>> 389 effort from now on. Stanislav proposed the
> > idea
> > >> in
> > >> > > the
> > >> > > > > > first
> > >> > > > > > > > > place
> > >> > > > > > > > > > and
> > >> > > > > > > > > > >>> had already come up with a draft design, while I will
> > >> keep
> > >> > > > > focusing
> > >> > > > > > on
> > >> > > > > > > > > > KIP-345
> > >> > > > > > > > > > >>> effort to ensure solving the edge case described
> > in
> > >> the
> > >> > > > JIRA<
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > https://issues.apache.org/jira/browse/KAFKA-7610
> > >> > > > > > > > > > >.
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Thank you Stanislav for making this happen!
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Boyang
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> ________________________________
> > >> > > > > > > > > > >>> From: Matt Farmer <matt@frmr.me>
> > >> > > > > > > > > > >>> Sent: Tuesday, November 20, 2018 10:24 AM
> > >> > > > > > > > > > >>> To: dev@kafka.apache.org
> > >> > > > > > > > > > >>> Subject: Re: [Discuss] KIP-389: Enforce
> > >> group.max.size
> > >> > to
> > >> > > > cap
> > >> > > > > > > > member
> > >> > > > > > > > > > >>> metadata growth
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Thanks for the KIP.
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Will this cap be a global cap across the entire
> > >> cluster
> > >> > > or
> > >> > > > > per
> > >> > > > > > > > > broker?
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Either way the default value seems a bit high to
> > me,
> > >> > but
> > >> > > > that
> > >> > > > > > > could
> > >> > > > > > > > > > just
> > >> > > > > > > > > > >>> be
> > >> > > > > > > > > > >>> from my own usage patterns. I'd have probably
> > >> started
> > >> > > with
> > >> > > > > 500
> > >> > > > > > or
> > >> > > > > > > > 1k
> > >> > > > > > > > > > but
> > >> > > > > > > > > > >>> could be easily convinced that's wrong.
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> Thanks,
> > >> > > > > > > > > > >>> Matt
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> On Mon, Nov 19, 2018 at 8:51 PM Boyang Chen <
> > >> > > > > > bchen11@outlook.com
> > >> > > > > > > >
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>> > Hey folks,
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> > I would like to start a discussion on KIP-389:
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-389%3A+Enforce+group.max.size+to+cap+member+metadata+growth
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> > This is a pretty simple change to cap the
> > consumer
> > >> > > group
> > >> > > > > size
> > >> > > > > > > for
> > >> > > > > > > > > > >>> broker
> > >> > > > > > > > > > >>> > stability. Give me your valuable feedback when
> > you
> > >> > get
> > >> > > > > time.
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>> > Thank you!
> > >> > > > > > > > > > >>> >
> > >> > > > > > > > > > >>>
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >> --
> > >> > > > > > > > > > >> Best,
> > >> > > > > > > > > > >> Stanislav
> > >> > > > > > > > > > >>
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > --
> > >> > > > > > > > > > > Best,
> > >> > > > > > > > > > > Stanislav
> > >> > > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > > >
> > >> > > > > > > > > > --
> > >> > > > > > > > > > Best,
> > >> > > > > > > > > > Stanislav
> > >> > > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > --
> > >> > > > > > > > Best,
> > >> > > > > > > > Stanislav
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Best,
> > >> > > > > > Stanislav
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Best,
> > >> > > > Stanislav
> > >> > > >
> > >> > >
> > >> > >
> > >> > > --
> > >> > > Best,
> > >> > > Stanislav
> > >> > >
> > >> >
> > >> >
> > >> > --
> > >> > Best,
> > >> > Stanislav
> > >> >
> > >>
> > >
> > >
> > > --
> > > Best,
> > > Stanislav
> > >
> >
> >
> > --
> > Best,
> > Stanislav
> >
>
>
> --
> Best,
> Stanislav



-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

