kafka-dev mailing list archives

From Colin McCabe <cmcc...@apache.org>
Subject Re: [DISCUSS] KIP-144: Exponential backoff for broker reconnect attempts
Date Mon, 01 May 2017 17:18:14 GMT
Thanks for the KIP, Ismael & Dana!  This could be pretty important for
avoiding congestion collapse when there are a lot of clients.

It seems like a good idea to keep the "ms" suffix, like we have with
"reconnect.backoff.ms".  So maybe we should use
"reconnect.backoff.max.ms"?  In general unitless timeouts can be the
source of a lot of confusion (is it seconds, milliseconds, etc.?)

It's good that the KIP injects random delays (jitter) into the timeout.
As per Gwen's point, does it make sense to put an upper bound on the
jitter, though?  If someone sets reconnect.backoff.max to 5 minutes,
they would probably be a little surprised to find it doing three retries
in a row after 100 ms each (as it could under the current scheme).  Maybe
a maximum jitter configuration would help address that, and make the
behavior a little more intuitive.
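
To make the jitter concern concrete, here is a rough sketch in Python. It
is not the formula from the KIP. The function name reconnect_backoff_ms,
its parameter names, and the 20% jitter fraction are all assumptions made
up for illustration. The delay doubles with each consecutive failure, is
capped at the maximum, and the jitter is limited to a fixed fraction of
that capped value.

```python
import random


def reconnect_backoff_ms(failures, base_ms=100, max_ms=1000,
                         jitter_fraction=0.2, rng=random):
    """Hypothetical helper: delay in ms before the next reconnect attempt.

    failures is the number of consecutive failed attempts so far.
    jitter_fraction is an assumed bound on the jitter, not a KIP setting.
    """
    # Exponential growth from the base backoff, capped at the maximum.
    capped = min(max_ms, base_ms * (2 ** failures))
    # Bounded jitter: scale by a factor in [1 - fraction, 1 + fraction],
    # so the delay can never drop far below the capped exponential value.
    factor = rng.uniform(1.0 - jitter_fraction, 1.0 + jitter_fraction)
    return capped * factor


if __name__ == "__main__":
    # Base of 100 ms and a maximum of 5 minutes, as in the example above.
    for attempt in range(14):
        delay = reconnect_backoff_ms(attempt, base_ms=100, max_ms=300_000)
        print(attempt, round(delay))
```

With the jitter bounded this way, a client that has already failed many
times stays within 20% of the capped delay, so it cannot fall back to
retrying at roughly 100 ms several times in a row.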


On Thu, Apr 27, 2017, at 09:39, Gwen Shapira wrote:
> This is a great suggestion. I like how we just do it by default instead of
> making it a choice users need to figure out.
> Avoiding connection storms is great.
> One concern. If I understand the formula for effective maximum backoff
> correctly, then with default maximum of 1000ms and default backoff of
> 100ms, the effective maximum backoff will be 450ms rather than 1000ms. This
> isn't exactly intuitive.
> I'm wondering if it makes more sense to allow "one last doubling" which may
> bring us slightly over the maximum, but much closer to it. I.e. have the
> effective maximum be in [max.backoff - backoff, max.backoff + backoff]
> range rather than half that. Does that make sense?
> Gwen
> On Thu, Apr 27, 2017 at 9:06 AM, Ismael Juma <ismael@juma.me.uk> wrote:
> > Hi all,
> >
> > Dana Powers posted a PR a while back for exponential backoff for broker
> > reconnect attempts. Because it adds a config, a KIP is required and Dana
> > seems to be busy so I posted it:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-144%3A+Exponential+backoff+for+broker+reconnect+attempts
> >
> > Please take a look. Your feedback is appreciated.
> >
> > Thanks,
> > Ismael
> >
> -- 
> *Gwen Shapira*
> Product Manager | Confluent
> 650.450.2760 | @gwenshap
> Follow us: Twitter <https://twitter.com/ConfluentInc> | blog
> <http://www.confluent.io/blog>
