zookeeper-dev mailing list archives

From "Aishwarya Soni (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ZOOKEEPER-3036) Unexpected exception in zookeeper
Date Thu, 11 Oct 2018 03:35:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16645913#comment-16645913 ]

Aishwarya Soni commented on ZOOKEEPER-3036:
-------------------------------------------

We hit the same issue a couple of days ago. We run ZooKeeper in a containerized AWS
environment, and we had to restart the affected container to resolve it. In our case the
problem appears when the port binding fails: when the container becomes unhealthy it does
not release the port, so when it later tries to bind to that port to rejoin the quorum, the
port is still in use and the server logs *Unexpected exception causing shutdown while sock
still open*.
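
One way to test the "port never released" hypothesis before restarting a container is to probe whether the port can still be bound. The following is only a minimal sketch, not part of ZooKeeper: the helper name `port_is_free` is hypothetical, and the ports to check depend on the `server.N=` entries in your own zoo.cfg (2888 and 3888 below are just the values commonly used in examples).

```python
import errno
import socket


def port_is_free(host: str, port: int) -> bool:
    """Return True if a TCP socket can bind to (host, port) right now.

    Hypothetical diagnostic helper; it only probes the port and releases
    it again immediately when the `with` block closes the socket.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                # Another socket still holds the port.
                return False
            raise  # Permission problems etc. are not "port in use".
        return True


if __name__ == "__main__":
    # Assumed quorum and election ports; adjust to your configuration.
    for port in (2888, 3888):
        state = "free" if port_is_free("0.0.0.0", port) else "still in use"
        print(f"port {port}: {state}")
```

If the probe reports the port as still in use while the ZooKeeper process is supposedly down, that would be consistent with the behaviour described above; it does not by itself prove that this is the cause of the exception.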

> Unexpected exception in zookeeper
> ---------------------------------
>
>                 Key: ZOOKEEPER-3036
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-3036
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum, server
>    Affects Versions: 3.4.10
>         Environment: 3 Zookeepers, 5 kafka servers
>            Reporter: Oded
>            Priority: Critical
>
> We got an issue with one of the zookeepers (the Leader), causing the entire Kafka cluster to fail:
> 2018-05-09 02:29:01,730 [myid:3] - ERROR [LearnerHandler-/192.168.0.91:42490:LearnerHandler@648] - Unexpected exception causing shutdown while sock still open
> java.net.SocketTimeoutException: Read timed out
>         at java.net.SocketInputStream.socketRead0(Native Method)
>         at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
>         at java.net.SocketInputStream.read(SocketInputStream.java:171)
>         at java.net.SocketInputStream.read(SocketInputStream.java:141)
>         at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
>         at java.io.BufferedInputStream.read(BufferedInputStream.java:265)
>         at java.io.DataInputStream.readInt(DataInputStream.java:387)
>         at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
>         at org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:83)
>         at org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:99)
>         at org.apache.zookeeper.server.quorum.LearnerHandler.run(LearnerHandler.java:559)
> 2018-05-09 02:29:01,730 [myid:3] - WARN  [LearnerHandler-/192.168.0.91:42490:LearnerHandler@661] - ******* GOODBYE /192.168.0.91:42490 ********
>  
> We expected ZooKeeper to elect another Leader and the Kafka cluster to continue working, but that was not the case.
>  
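
For readers unfamiliar with the trace above: the frames show a blocking `readInt` on the learner's socket ending in `SocketTimeoutException: Read timed out`, which is what a socket read with a timeout does when the peer sends nothing in time. The sketch below is a Python analogue of that mechanism only, not ZooKeeper code; the function name `read_int_or_timeout` and the 0.2 s timeout are made up for illustration.

```python
import socket
import struct


def read_int_or_timeout(sock: socket.socket, timeout_s: float):
    """Read one big-endian 32-bit int, or return None if nothing arrives in time.

    Illustrative only: a short read (fewer than 4 bytes) is also treated
    as a failure to keep the sketch small.
    """
    sock.settimeout(timeout_s)
    try:
        data = sock.recv(4)
    except socket.timeout:
        # Python's counterpart of Java's "Read timed out".
        return None
    if len(data) < 4:
        return None
    return struct.unpack(">i", data)[0]


if __name__ == "__main__":
    reader, peer = socket.socketpair()
    # The peer is silent, so the read gives up after the timeout.
    print(read_int_or_timeout(reader, 0.2))
    # Once the peer sends four bytes, the same read succeeds.
    peer.sendall(struct.pack(">i", 42))
    print(read_int_or_timeout(reader, 0.2))
    reader.close()
    peer.close()
```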



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
