kafka-jira mailing list archives

From "Raoufeh Hashemian (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-5780) Long shutdown time when updated to 0.11.0
Date Thu, 24 Aug 2017 14:31:00 GMT
Raoufeh Hashemian created KAFKA-5780:
----------------------------------------

             Summary: Long shutdown time when updated to 0.11.0
                 Key: KAFKA-5780
                 URL: https://issues.apache.org/jira/browse/KAFKA-5780
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.11.0.0
         Environment: CentOS Linux release 7.3.1611, Kernel 3.10
            Reporter: Raoufeh Hashemian
         Attachments: broker_shutdown.png

When we switched from Kafka 0.10.2 to Kafka 0.11.0, we ran into a problem stopping the
Kafka service on a broker node.

Our cluster consists of 6 broker nodes. We had an existing topic when we switched to Kafka
0.11.0. Since then, gracefully stopping the service on a Kafka broker node results in the
following warning message being repeated every 100 ms in the broker log, and the shutdown takes
approximately 45 minutes to complete.

{code:java}
@40000000599714da1e582e4c [2017-08-18 16:24:48,509] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@40000000599714da245483a4 [2017-08-18 16:24:48,609] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@40000000599714da2a51177c [2017-08-18 16:24:48,709] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
{code}
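
The 100 ms spacing can be checked directly from the timestamps in the excerpt above. The following is only a minimal Python sketch, assuming the bracketed timestamp format shown in these log lines; the function name `warning_intervals_ms` and the `sample` variable are made up for illustration and are not part of Kafka.

```python
import re
from datetime import datetime

# Matches the bracketed broker timestamp, e.g. [2017-08-18 16:24:48,509].
# Assumption: every line of interest carries a timestamp in this format.
TIMESTAMP = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\]")


def warning_intervals_ms(log_text):
    """Return the gaps, in milliseconds, between consecutive timestamps."""
    stamps = [
        datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S,%f")
        for match in TIMESTAMP.finditer(log_text)
    ]
    return [
        round((later - earlier).total_seconds() * 1000)
        for earlier, later in zip(stamps, stamps[1:])
    ]


# The three warning lines quoted in the excerpt above.
sample = """\
@40000000599714da1e582e4c [2017-08-18 16:24:48,509] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@40000000599714da245483a4 [2017-08-18 16:24:48,609] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@40000000599714da2a51177c [2017-08-18 16:24:48,709] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
"""

print(warning_intervals_ms(sample))
```

Running the same function over the full broker log would show whether the retry interval stays constant for the whole shutdown period.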

Below are the last log lines, written when the shutdown completes:

{code:java}
@4000000059971afd31113dbc [2017-08-18 16:50:59,823] WARN Connection to node 1002 could not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@4000000059971afd361200bc [2017-08-18 16:50:59,907] INFO Shutdown complete. (kafka.log.LogManager)
@4000000059971afd36afa04c [2017-08-18 16:50:59,917] INFO Terminate ZkClient event thread. (org.I0Itec.zkclient.ZkEventThread)
@4000000059971afd36dd6edc [2017-08-18 16:50:59,920] INFO Session: 0x35d68c9e76702a4 closed (org.apache.zookeeper.ZooKeeper)
@4000000059971afd36deca84 [2017-08-18 16:50:59,920] INFO EventThread shut down for session: 0x35d68c9e76702a4 (org.apache.zookeeper.ClientCnxn)
@4000000059971afd36f6afb4 [2017-08-18 16:50:59,922] INFO [Kafka Server 1002], shut down completed (kafka.server.KafkaServer)
{code}

I should note that I stopped the producers before shutting down the broker.
If I repeat the process after bringing the service back up, the shutdown takes less than a minute.
However, if I start the producers, even for a short time, and repeat the process, a graceful
shutdown again takes around 45 minutes.

The attached file shows the brokers' CPU usage during the shutdown period (the light blue curve
is the node on which the broker service is shutting down).
The size of the topic is 2.3 TB per broker.

I was wondering whether this is the expected new normal in Kafka 0.11.0, or whether it is a bug
or a misconfiguration.




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
