kafka-jira mailing list archives

From "Sean Rohead (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-6101) Reconnecting to broker does not exponentially backoff
Date Sat, 21 Oct 2017 00:01:29 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-6101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16213531#comment-16213531 ]

Sean Rohead commented on KAFKA-6101:

Looking at your patch strictly from a code review perspective (I have not tested it yet),
I see that the current implementation of disconnected() calls updateReconnectBackoff(nodeState),
which modifies nodeState.reconnectBackoffMs. That value is not preserved when you create the
new nodeState in connecting(), so it is reset to reconnectBackoffInitMs. I am not 100% certain,
but I think the new nodeState should preserve that value across connection attempts. My other
question is whether it is possible to leave the existing nodeState instance in the map (if there
is one) instead of creating a new one, and just update its state and lastConnectAttemptMs. This
is what the disconnected() method does.
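
To make the second suggestion concrete, here is a minimal sketch of what reusing the existing nodeState could look like. It is not the actual Kafka code: the class name BackoffSketch, the State enum, the constructor, and the reconnectBackoffMs(id) accessor are all hypothetical, and the backoff update is simplified to plain doubling with a cap (no jitter), which is an assumption rather than what updateReconnectBackoff() really does.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for the client's per-node connection
// tracking. Names not mentioned in the comment above are illustrative only.
class BackoffSketch {
    enum State { CONNECTING, DISCONNECTED }

    static final class NodeState {
        State state;
        long lastConnectAttemptMs;
        long failedAttempts;
        long reconnectBackoffMs;

        NodeState(State state, long lastConnectAttemptMs, long reconnectBackoffMs) {
            this.state = state;
            this.lastConnectAttemptMs = lastConnectAttemptMs;
            this.reconnectBackoffMs = reconnectBackoffMs;
        }
    }

    private final long reconnectBackoffInitMs;
    private final long reconnectBackoffMaxMs;
    private final Map<String, NodeState> nodeStates = new HashMap<>();

    BackoffSketch(long reconnectBackoffInitMs, long reconnectBackoffMaxMs) {
        this.reconnectBackoffInitMs = reconnectBackoffInitMs;
        this.reconnectBackoffMaxMs = reconnectBackoffMaxMs;
    }

    // Reuse the existing instance if there is one, so failedAttempts and
    // reconnectBackoffMs survive across connection attempts.
    void connecting(String id, long now) {
        NodeState existing = nodeStates.get(id);
        if (existing != null) {
            existing.state = State.CONNECTING;
            existing.lastConnectAttemptMs = now;
        } else {
            nodeStates.put(id, new NodeState(State.CONNECTING, now, reconnectBackoffInitMs));
        }
    }

    // Record the failure and grow the backoff on the same instance.
    void disconnected(String id, long now) {
        NodeState nodeState = nodeStates.get(id);
        if (nodeState == null) {
            return;
        }
        nodeState.state = State.DISCONNECTED;
        nodeState.lastConnectAttemptMs = now;
        nodeState.failedAttempts++;
        // Assumed simplification: double the preserved value, capped at the max.
        nodeState.reconnectBackoffMs =
            Math.min(reconnectBackoffMaxMs, nodeState.reconnectBackoffMs * 2);
    }

    // Current backoff for a node, or the initial value if the node is unknown.
    long reconnectBackoffMs(String id) {
        NodeState nodeState = nodeStates.get(id);
        return nodeState == null ? reconnectBackoffInitMs : nodeState.reconnectBackoffMs;
    }
}
```

If connecting() instead replaced the map entry with a fresh NodeState on every attempt, both failedAttempts and reconnectBackoffMs would start over each time, which matches the constant retry rate described in the issue.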

> Reconnecting to broker does not exponentially backoff
> -----------------------------------------------------
>                 Key: KAFKA-6101
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6101
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions:
>            Reporter: Sean Rohead
>         Attachments: 6101.v2.txt, text.html
> I am using com.typesafe.akka:akka-stream-kafka:0.17 which relies on kafka-clients:
> I have set the reconnect.backoff.max.ms property to 60000.
> When I start the application without kafka running, I see a flood of the following log
> [warn] o.a.k.c.NetworkClient - Connection to node -1 could not be established. Broker may not be available.
> The log messages occur several times a second and the frequency of these messages does
> not decrease over time as would be expected if exponential backoff was working properly.
> I set a breakpoint in the debugger in ClusterConnectionStates:188 and noticed that every
> time this breakpoint is hit, nodeState.failedAttempts is always 0. This is why the delay does
> not increase exponentially. It also appears that every time the breakpoint is hit, it is on
> a different instance, so even though the number of failedAttempts is incremented, we never
> get the breakpoint for the same instance more than one time.

