kafka-jira mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Raoufeh Hashemian (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (KAFKA-5780) Long shutdown time when updated to 0.11.0
Date Thu, 24 Aug 2017 15:30:00 GMT

    [ https://issues.apache.org/jira/browse/KAFKA-5780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16140173#comment-16140173
] 

Raoufeh Hashemian commented on KAFKA-5780:
------------------------------------------

broker 1002 is the same broker that is being shutdown and the log line appears every 100 ms
in the log file of the same broker. I am not sure why the message comes from ''org.apache.kafka.clients.NetworkClient"
and why should the broker that is being shutdown try to connect to itself. I'll try to upload
the full trace but it is basically just 45 * 60 * 10 of those warning lines and then: 
@4000000059971afd31113dbc [2017-08-18 16:50:59,823] WARN Connection to node 1002 could not
be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
@4000000059971afd361200bc [2017-08-18 16:50:59,907] INFO Shutdown complete. (kafka.log.LogManager)
@4000000059971afd36afa04c [2017-08-18 16:50:59,917] INFO Terminate ZkClient event thread.
(org.I0Itec.zkclient.ZkEventThread)
@4000000059971afd36dd6edc [2017-08-18 16:50:59,920] INFO Session: 0x35d68c9e76702a4 closed
(org.apache.zookeeper.ZooKeeper)
@4000000059971afd36deca84 [2017-08-18 16:50:59,920] INFO EventThread shut down for session:
0x35d68c9e76702a4 (org.apache.zookeeper.ClientCnxn)
@4000000059971afd36f6afb4 [2017-08-18 16:50:59,922] INFO [Kafka Server 1002], shut down completed
(kafka.server.KafkaServer)

> Long shutdown time when updated to 0.11.0
> -----------------------------------------
>
>                 Key: KAFKA-5780
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5780
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.11.0.0
>         Environment: CentOS Linux release 7.3.1611 , Kernel 3.10
>            Reporter: Raoufeh Hashemian
>             Fix For: 0.11.0.1
>
>         Attachments: broker_shutdown.png
>
>
> When we switched from Kafka 0.10.2 to Kafka 0.11.0 , We faced a problem with stopping
the kafka service on a broker node.
> Our cluster consists of 6 broker nodes. We had an existing topic when switched to Kafka
0.11.0 . Since then, gracefully stoping the service on a Kafka broker node results in the
following warning message being repeated every 100 ms in the broker log, and the shutdown
takes approximately 45 minutes to complete.
> {code:java}
> @40000000599714da1e582e4c [2017-08-18 16:24:48,509] WARN Connection to node 1002 could
not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> @40000000599714da245483a4 [2017-08-18 16:24:48,609] WARN Connection to node 1002 could
not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> @40000000599714da2a51177c [2017-08-18 16:24:48,709] WARN Connection to node 1002 could
not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> {code}
> Below is the last log lines when the shutdown is complete :
> {code:java}
> @4000000059971afd31113dbc [2017-08-18 16:50:59,823] WARN Connection to node 1002 could
not be established. Broker may not be available. (org.apache.kafka.clients.NetworkClient)
> @4000000059971afd361200bc [2017-08-18 16:50:59,907] INFO Shutdown complete. (kafka.log.LogManager)
> @4000000059971afd36afa04c [2017-08-18 16:50:59,917] INFO Terminate ZkClient event thread.
(org.I0Itec.zkclient.ZkEventThread)
> @4000000059971afd36dd6edc [2017-08-18 16:50:59,920] INFO Session: 0x35d68c9e76702a4 closed
(org.apache.zookeeper.ZooKeeper)
> @4000000059971afd36deca84 [2017-08-18 16:50:59,920] INFO EventThread shut down for session:
0x35d68c9e76702a4 (org.apache.zookeeper.ClientCnxn)
> @4000000059971afd36f6afb4 [2017-08-18 16:50:59,922] INFO [Kafka Server 1002], shut down
completed (kafka.server.KafkaServer)
> {code}
> I should note that I stopped the producers before shutting down the broker.
> If I repeat the process after brining up the service, the shutdown takes less than a
minute. However, if I start the producers even for a short time and repeat the process, it
will again take around 45 minutes to do a graceful shutdown.
> Attached files shows the brokers CPU usage during the shutdown period (light blue curve
is the node in which the broker service is shutting down).
> The size of the topic is 2.3 TB per broker.
> I was wondering if this is an expected behaviour or It is a bug or a misconfiguration?




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message