[ https://issues.apache.org/jira/browse/KAFKA-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Maytee Chinavanichkit updated KAFKA-6051:
-----------------------------------------
Fix Version/s: (was: 1.0.0)
1.1.0
> ReplicaFetcherThread should close the ReplicaFetcherBlockingSend earlier on shutdown
> ------------------------------------------------------------------------------------
>
> Key: KAFKA-6051
> URL: https://issues.apache.org/jira/browse/KAFKA-6051
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0, 0.10.2.1, 0.11.0.0
> Reporter: Maytee Chinavanichkit
> Fix For: 1.1.0
>
>
> The ReplicaFetcherBlockingSend works as designed and blocks until it is able to
get data. This becomes a problem when we are gracefully shutting down a broker. The controller
will attempt to shut down the fetchers and elect new leaders. When the last partition of a
fetcher is removed, the {{replicaManager.becomeLeaderOrFollower}} call proceeds to
shut down any idle ReplicaFetcherThread. The shutdown process here can block until the
last fetch request completes. This blocking delay is a big problem because the {{replicaStateChangeLock}}
and the {{mapLock}} in {{AbstractFetcherManager}} are still held, causing latency spikes on multiple
brokers.
> At this point in time, we do not need the last response as the fetcher is shutting down.
We should close the leaderEndpoint early during {{initiateShutdown()}} instead of after {{super.shutdown()}}.
> For example, the log below shows a shutdown that blocked the broker from processing further
replica changes for ~435 ms:
> {code}
> [2017-09-01 18:11:42,879] INFO [ReplicaFetcherThread-0-2], Shutting down (kafka.server.ReplicaFetcherThread)
> [2017-09-01 18:11:43,314] INFO [ReplicaFetcherThread-0-2], Stopped (kafka.server.ReplicaFetcherThread)
> [2017-09-01 18:11:43,314] INFO [ReplicaFetcherThread-0-2], Shutdown completed (kafka.server.ReplicaFetcherThread)
> {code}
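The proposal above (close the endpoint during {{initiateShutdown()}} rather than after the thread has been joined) can be illustrated with a minimal, self-contained sketch. This is not Kafka's code: `EarlyCloseSketch`, `BlockingEndpoint`, `Fetcher`, and `shutdownLatencyMillis` are hypothetical names, and the blocking fetch is simulated with a latch and a timeout rather than a real network call.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-ins for illustration only; these are not Kafka's classes.
final class EarlyCloseSketch {

    // Simulated endpoint: sendFetch() blocks until the response delay elapses
    // or close() is called, whichever happens first.
    static final class BlockingEndpoint {
        private final CountDownLatch closed = new CountDownLatch(1);
        private final long responseDelayMs;

        BlockingEndpoint(long responseDelayMs) {
            this.responseDelayMs = responseDelayMs;
        }

        // Returns true if the simulated response arrived, false if closed first.
        boolean sendFetch() throws InterruptedException {
            return !closed.await(responseDelayMs, TimeUnit.MILLISECONDS);
        }

        // Safe to call more than once.
        void close() {
            closed.countDown();
        }
    }

    static final class Fetcher extends Thread {
        private final BlockingEndpoint endpoint;
        private final boolean closeEarly;
        private final CountDownLatch fetchStarted = new CountDownLatch(1);
        private volatile boolean running = true;

        Fetcher(BlockingEndpoint endpoint, boolean closeEarly) {
            this.endpoint = endpoint;
            this.closeEarly = closeEarly;
        }

        @Override
        public void run() {
            try {
                while (running) {
                    fetchStarted.countDown();
                    endpoint.sendFetch(); // blocks, like the blocking send in the report
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }

        void initiateShutdown() {
            running = false;
            if (closeEarly) {
                endpoint.close(); // proposed behaviour: unblock the in-flight fetch now
            }
        }

        void shutdown() {
            initiateShutdown();
            try {
                join(); // without the early close, this waits for the last fetch
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            endpoint.close(); // reported behaviour: close only after the thread stopped
        }
    }

    // Measures how long shutdown() takes while a fetch is in flight.
    static long shutdownLatencyMillis(boolean closeEarly, long responseDelayMs) {
        Fetcher fetcher = new Fetcher(new BlockingEndpoint(responseDelayMs), closeEarly);
        fetcher.start();
        try {
            fetcher.fetchStarted.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        long start = System.nanoTime();
        fetcher.shutdown();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        System.out.println("close after join:          " + shutdownLatencyMillis(false, 500) + " ms");
        System.out.println("close in initiateShutdown: " + shutdownLatencyMillis(true, 500) + " ms");
    }
}
```

In this sketch, any lock held by the caller of `shutdown()` would be held for roughly the full response delay in the first case and released almost immediately in the second, which is the effect the issue is asking for.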
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)