kafka-dev mailing list archives

From "Candice Wan (JIRA)" <j...@apache.org>
Subject [jira] [Created] (KAFKA-8188) Zookeeper Connection Issue Take Down the Whole kafka cluster
Date Thu, 04 Apr 2019 02:36:00 GMT
Candice Wan created KAFKA-8188:
----------------------------------

             Summary: Zookeeper Connection Issue Take Down the Whole kafka cluster
                 Key: KAFKA-8188
                 URL: https://issues.apache.org/jira/browse/KAFKA-8188
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 2.1.1
            Reporter: Candice Wan
         Attachments: thread_dump.log

We recently upgraded to 2.1.1 and saw the ZooKeeper connection issues shown below, which took
down the whole cluster. The cluster has 3 nodes, 2 of which were affected.

2019-04-03 08:25:19.603 [main-SendThread(iaase00003184.svr.emea.jpmchase.net:36100)] WARN
org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x10071ff9baf0001
has expired
2019-04-03 08:25:19.603 [main-SendThread(iaase00003184.svr.emea.jpmchase.net:36100)] INFO
org.apache.zookeeper.ClientCnxn - Unable to reconnect to ZooKeeper service, session 0x10071ff9baf0001
has expired, closing socket connection
2019-04-03 08:25:19.605 [main-EventThread] INFO org.apache.zookeeper.ClientCnxn - EventThread
shut down for session: 0x10071ff9baf0001
2019-04-03 08:25:19.605 [zk-session-expiry-handler0] INFO kafka.zookeeper.ZooKeeperClient
- [ZooKeeperClient] Session expired.
2019-04-03 08:25:19.609 [zk-session-expiry-handler0] INFO kafka.zookeeper.ZooKeeperClient
- [ZooKeeperClient] Initializing a new session to vsie5p0551.svr.emea.jpmchase.net:36100,iaase00003184.svr.emea.jpmchase.net:36100,iaase00003360.svr.emea.jpmchase.net:36100.
2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO org.apache.zookeeper.ZooKeeper -
Initiating client connection, connectString=vsie5p0551.svr.emea.jpmchase.net:36100,iaase00003184.svr.emea.jpmchase.net:36100,iaase00003360.svr.emea.jpmchase.net:36100
sessionTimeout=6000 watcher=kafka.zookeeper.ZooKeeperClient$ZooKeeperClientWatcher$@12f8b1d8
2019-04-03 08:25:19.610 [zk-session-expiry-handler0] INFO o.apache.zookeeper.ClientCnxnSocket
- jute.maxbuffer value is 4194304 Bytes
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(vsie5p0551.svr.emea.jpmchase.net:36100)]
WARN org.apache.zookeeper.ClientCnxn - SASL configuration failed: javax.security.auth.login.LoginException:
No JAAS configuration section named 'Client' was found in specified JAAS configuration file:
'file:/app0/common/config/ldap-auth.config'. Will continue connection to Zookeeper server
without SASL authentication, if Zookeeper server allows it.
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(vsie5p0551.svr.emea.jpmchase.net:36100)]
INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server vsie5p0551.svr.emea.jpmchase.net/169.30.47.206:36100
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-EventThread] ERROR kafka.zookeeper.ZooKeeperClient
- [ZooKeeperClient] Auth failed.
2019-04-03 08:25:19.611 [zk-session-expiry-handler0-SendThread(vsie5p0551.svr.emea.jpmchase.net:36100)]
INFO org.apache.zookeeper.ClientCnxn - Socket connection established, initiating session,
client: /169.20.222.18:56876, server: vsie5p0551.svr.emea.jpmchase.net/169.30.47.206:36100
2019-04-03 08:25:19.612 [controller-event-thread] INFO k.controller.PartitionStateMachine
- [PartitionStateMachine controllerId=3] Stopped partition state machine
2019-04-03 08:25:19.613 [controller-event-thread] INFO kafka.controller.ReplicaStateMachine
- [ReplicaStateMachine controllerId=3] Stopped replica state machine
2019-04-03 08:25:19.614 [controller-event-thread] INFO kafka.controller.KafkaController -
[Controller id=3] Resigned
2019-04-03 08:25:19.615 [controller-event-thread] INFO kafka.zk.KafkaZkClient - Creating /brokers/ids/3
(is it secure? false)
2019-04-03 08:25:19.628 [zk-session-expiry-handler0-SendThread(vsie5p0551.svr.emea.jpmchase.net:36100)]
INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server vsie5p0551.svr.emea.jpmchase.net/169.30.47.206:36100,
sessionid = 0x1007f4d2b810000, negotiated timeout = 6000
2019-04-03 08:25:19.631 [/config/changes-event-process-thread] INFO k.c.ZkNodeChangeNotificationListener
- Processing notification(s) to /config/changes
2019-04-03 08:25:19.637 [controller-event-thread] ERROR k.zk.KafkaZkClient$CheckedEphemeral
- Error while creating ephemeral at /brokers/ids/3, node already exists and owner '72182936680464385'
does not match current session '72197563457011712'
2019-04-03 08:25:19.637 [controller-event-thread] INFO kafka.zk.KafkaZkClient - Result of
znode creation at /brokers/ids/3 is: NODEEXISTS
2019-04-03 08:25:19.644 [controller-event-thread] ERROR k.c.ControllerEventManager$ControllerEventThread
- [ControllerEventThread controllerId=3] Error processing event RegisterBrokerAndReelect
org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists
 at org.apache.zookeeper.KeeperException.create(KeeperException.java:126)
 at kafka.zk.KafkaZkClient.checkedEphemeralCreate(KafkaZkClient.scala:1631)
 at kafka.zk.KafkaZkClient.registerBroker(KafkaZkClient.scala:87)
 at kafka.controller.KafkaController$RegisterBrokerAndReelect$.process(KafkaController.scala:1516)
 at kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:89)
 at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)
 at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
 at kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:89)
 at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:82)
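
The CheckedEphemeral error above prints session IDs in decimal, while ClientCnxn logs them in hexadecimal, so the two are not directly comparable by eye. The following is a minimal sketch in Python that converts between them; the variable names are illustrative and all values are copied from the log lines above. It only cross-references what is already in the log and does not explain why the old node was still present.

```python
# Session IDs as logged by org.apache.zookeeper.ClientCnxn (hexadecimal).
expired_session = 0x10071ff9baf0001  # "session 0x10071ff9baf0001 has expired"
new_session = 0x1007f4d2b810000      # "sessionid = 0x1007f4d2b810000"

# IDs as logged by KafkaZkClient$CheckedEphemeral (decimal).
znode_owner = 72182936680464385      # owner of the existing /brokers/ids/3 node
current_session = 72197563457011712  # session attempting the create

# Map each known session ID to a human-readable label (labels are illustrative).
known_sessions = {
    expired_session: "expired session",
    new_session: "new session",
}

owner_label = known_sessions.get(znode_owner, "unknown session")
current_label = known_sessions.get(current_session, "unknown session")

print(f"znode owner     {znode_owner} = {hex(znode_owner)} ({owner_label})")
print(f"current session {current_session} = {hex(current_session)} ({current_label})")
```

Read this way, the existing /brokers/ids/3 node appears to be owned by the session that had just expired, and the create attempt comes from the newly established session.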

 

A thread dump is attached (thread_dump.log).

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
