zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andor Molnar <an...@apache.org>
Subject Re: client session expired after timeout after errors and warning logs in zk server logs
Date Mon, 01 Jul 2019 13:08:29 GMT
Hi Prashant,

It’s quite difficult to track what could have happened from your log file snippet. Look’s
like there’re multiple things going on at the same time: dynamic reconfig, leader election,
network issues?

However you’re using an early alpha version of ZooKeeper which is not recommended for production
use. I suggest to upgrade to a recent stable release (3.5.5) and see if the problem still
occurs.

Regards,
Andor




> On 2019. Jun 20., at 13:44, prashantkumar dhotre <prashantkumardhotre@gmail.com>
wrote:
> 
> Hi
> We use 3.5.1-alpha version.
> We are seeing session expiry issue in VM set up.
> This is running in replicated more (two servers + node mastership as one
> vote for quorum).
> we see client session expired after session timeout (of 10 sec).
> This connection was to local zk server.
> session timeout is 10 sec.
> 
> This session got established at 17:40:18 and ZK server expired this at
> 17:40:57, after 39 seconds of establishment.
> 
> in between this time, i see few errors and warnings in zookeeper server
> logs (as shown below).
> 
> I see below errors/warning in between this time before session expiry.
> 
> This issue is not very easy to replicate , so far we have seen only twice.
> 
> Could you please help me identify root cause and let me know if this is
> fixed in later release ?
> 
> Thanks,
> 
> Prashant
> 
> 
> Logs:
> 
> 
> 
> ZK server errors and warnings in ZK server logs:
> 
> 
> 
> 2711:2019-06-05 17:40:38,766 [myid:2147483652] - INFO  [
> 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection
> request /128.0.0.5:38020
> 
> 2724:2019-06-05 17:40:38,931 [myid:2147483652] - WARN
> [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@917] - Connection broken
> for id 2147483653, my id = 2147483652, error =
> 
> 2725:java.io.EOFException
> 
> 2726:   at java.io.DataInputStream.readInt(Unknown Source)
> 
> 2727:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:902)
> 
> 2728:2019-06-05 17:40:38,943 [myid:2147483652] - WARN
> [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting
> SendWorker
> 
> 2730:2019-06-05 17:40:38,950 [myid:2147483652] - WARN
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@834] - Interrupted while
> waiting for message on queue
> 
> 2731:java.lang.InterruptedException
> 
> 2732:   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown
> Source)
> 
> 2733:   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown
> Source)
> 
> 2734:   at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
> 
> 2735:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:986)
> 
> 2736:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:63)
> 
> 2737:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:822)
> 
> 2738:2019-06-05 17:40:38,965 [myid:2147483652] - WARN
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker
> leaving thread  id 2147483653 my id = 2147483652
> 
> 2740:2019-06-05 17:40:38,978 [myid:2147483652] - INFO  [
> 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection
> request /128.0.0.5:38022
> 
> 2741:2019-06-05 17:40:38,986 [myid:2147483652] - INFO  [
> 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@637] - Received connection
> request /128.0.0.5:38024
> 
> 2742:2019-06-05 17:40:38,987 [myid:2147483652] - WARN
> [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting
> SendWorker
> 
> 2743:2019-06-05 17:40:39,017 [myid:2147483652] - ERROR
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@810] - Failed to send
> last message. Shutting down thread.
> 
> 2744:java.net.SocketException: Socket closed
> 
> 2745:   at java.net.SocketOutputStream.socketWrite(Unknown Source)
> 
> 2746:   at java.net.SocketOutputStream.write(Unknown Source)
> 
> 2747:   at java.io.DataOutputStream.writeInt(Unknown Source)
> 
> 2748:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.send(QuorumCnxManager.java:779)
> 
> 2749:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:806)
> 
> 2750:2019-06-05 17:40:39,018 [myid:2147483652] - WARN
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker
> leaving thread  id 2147483653 my id = 2147483652
> 
> 2758:2019-06-05 17:40:39,125 [myid:2147483652] - INFO
> [LearnerHandler-/128.0.0.5:49888:LearnerHandler@385] - Follower sid:
> 2147483653 not in the current config 100000000
> 
> 2759:2019-06-05 17:40:39,130 [myid:2147483652] - INFO
> [LearnerHandler-/128.0.0.5:49888:LearnerHandler@683] - Synchronizing with
> Follower sid: 2147483653 maxCommittedLog=0x100000e04 minCommittedLog=0x1000
> 
> 00c10 lastProcessedZxid=0x100000e04 peerLastZxid=0x0
> 
> 2760:2019-06-05 17:40:39,132 [myid:2147483652] - WARN
> [LearnerHandler-/128.0.0.5:49888:ZKDatabase@310] - Unable to find proposals
> from txnlog for zxid: 0
> 
> 2762:2019-06-05 17:40:39,134 [myid:2147483652] - INFO
> [LearnerHandler-/128.0.0.5:49888:LearnerHandler@455] - Sending snapshot
> last zxid of peer is 0x0, zxid of leader is 0x100000e0b, send zxid of db as
> 0
> 
> x100000e05, 1 concurrent snapshots, snapshot was exempt from throttle
> 
> 2814:2019-06-05 17:40:39,438 [myid:2147483652] - INFO
> [ProcessThread(sid:2147483652 cport:-1)::PrepRequestProcessor@512] -
> Incremental reconfig
> 
> 2815:2019-06-05 17:40:39,454 [myid:2147483652] - WARN
> [LearnerHandler-/128.0.0.5:49888:QuorumPeer@1370] - Restarting Leader
> Election
> 
> 2816:2019-06-05 17:40:39,474 [myid:2147483652] - INFO  [
> 0.0.0.0/0.0.0.0:61864:QuorumCnxManager$Listener@659] - Leaving listener
> 
> 2818:2019-06-05 17:40:39,494 [myid:2147483652] - WARN
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@834] - Interrupted while
> waiting for message on queue
> 
> 2819:java.lang.InterruptedException
> 
> 2820:   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(Unknown
> Source)
> 
> 2821:   at
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown
> Source)
> 
> 2822:   at java.util.concurrent.ArrayBlockingQueue.poll(Unknown Source)
> 
> 2823:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:986)
> 
> 2824:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager.access$500(QuorumCnxManager.java:63)
> 
> 2825:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:822)
> 
> 2826:2019-06-05 17:40:39,496 [myid:2147483652] - WARN
> [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@917] - Connection broken
> for id 2147483653, my id = 2147483652, error =
> 
> 2827:java.net.SocketException: Socket closed
> 
> 2828:   at java.net.SocketInputStream.socketRead0(Native Method)
> 
> 2829:   at java.net.SocketInputStream.socketRead(Unknown Source)
> 
> 2830:   at java.net.SocketInputStream.read(Unknown Source)
> 
> 2831:   at java.net.SocketInputStream.read(Unknown Source)
> 
> 2832:   at java.net.SocketInputStream.read(Unknown Source)
> 
> 2833:   at java.io.DataInputStream.readInt(Unknown Source)
> 
> 2834:   at
> org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:902)
> 
> 2835:2019-06-05 17:40:39,496 [myid:2147483652] - WARN
> [RecvWorker:2147483653:QuorumCnxManager$RecvWorker@920] - Interrupting
> SendWorker
> 
> 2836:2019-06-05 17:40:39,499 [myid:2147483652] - WARN
> [SendWorker:2147483653:QuorumCnxManager$SendWorker@843] - Send worker
> leaving thread  id 2147483653 my id = 2147483652
> 
> 2837:2019-06-05 17:40:39,499 [myid:2147483652] - INFO  [
> QuorumPeerListener:QuorumCnxManager$Listener@631] - My election bind port:
> 0.0.0.0/0.0.0.0:61864
> 
> 3072:2019-06-05 17:40:42,014 [myid:2147483652] - INFO
> [WorkerSender[myid=2147483652]:FastLeaderElection$Messenger$WorkerSender@470]
> - WorkerSender is down
> 
> 3073:2019-06-05 17:40:42,014 [myid:2147483652] - INFO
> [WorkerReceiver[myid=2147483652]:FastLeaderElection$Messenger$WorkerReceiver@440]
> - WorkerReceiver is down
> 
> 3178:2019-06-05 17:40:42,580 [myid:2147483652] - INFO
> [NIOServerCxnFactory.AcceptThread:/0.0.0.0:61862
> :NIOServerCnxnFactory$AcceptThread@296] - Accepted socket connection from /
> 128.0.0.5:35418
> 
> 3179:2019-06-05 17:40:42,583 [myid:2147483652] - INFO
> [NIOWorkerThread-6:ZooKeeperServer@964] - Client attempting to establish
> new session at /128.0.0.5:35418


Mime
View raw message