zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jim Wong <...@airtime.com>
Subject Intermittent Disconnects from C Clients
Date Fri, 28 Aug 2015 00:34:29 GMT
Folks,

We are building an application that talks to ZooKeeper using the C bindings to
enable service discovery.  Our integration is fairly simple and is generally working
well from a functional standpoint, but we are encountering a problem with this new
application that we have not experienced in the past when we used the Java bindings.

What’s happening is that our servers are occasionally (every hour or two) being
disconnected from ZooKeeper and subsequently need to re-connect.  There doesn’t
seem to be any pattern in terms of the timing or frequency of these occurrences.
When the problem occurs, the following is logged:

Aug 27 22:47:55 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:47:55.326399] [info] [zkLogBridge]
[0x00007f7992bff700] 2015-08-27 22:47:55,326:7312(0x7f7990fff700):ZOO_DEBUG@zookeeper_process@2264:
Got ping response in 74 ms
Aug 27 22:48:01 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:48:01.993356] [error] [zkLogBridge]
[0x00007f7992bff700] 2015-08-27 22:48:01,993:7312(0x7f7990fff700):ZOO_ERROR@handle_socket_error_msg@1643:
Socket [10.0.19.22:2181] zk retcode=-7, errno=110(Connection timed out): connection to 10.0.19.22:2181
timed out (exceeded timeout by 0ms)
Aug 27 22:48:01 bixby-i-65e767a3-us-west-2 [2015-Aug-27 22:48:01.993409] [info] [zkClient]{0x7f799e446a40}
[0x00007f7992bff700] ZK session watcher with type=Event:Session state=State:Connecting

The last of these is our own log indicating that we received the notification that
the connection had been dropped.

On the ZooKeeper side of things, the logs look like this:

Aug 27 18:36:54 bixby-i-65e767a3-us-west-2 [2015-Aug-27 18:36:54.880795] [info] [zkLogBridge]
[0x00007f7992bff700] 2015-08-27 18:36:54,880:7312(0x7f7990fff700):ZOO_INFO@check_events@1752:
session establishment complete on server [52.5.153.158:2181], sessionId=0x34f6b4380d90006,
negotiated timeout=10000
...
2015-08-27 22:47:55,283 [myid:2] - DEBUG [CommitProcessor:2:FinalRequestProcessor@160] - sessionid:0x34f6b4380d90006
type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
2015-08-27 22:48:03,095 [myid:2] - DEBUG [ProcessThread(sid:2 cport:-1)::CommitProcessor@171]
- Processing request:: sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
txntype:unknown reqpath:n/a
2015-08-27 22:48:03,095 [myid:2] - DEBUG [CommitProcessor:2:FinalRequestProcessor@88] - Processing
request:: sessionid:0x34f6b4380d90006 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe
txntype:unknown reqpath:n/a
2015-08-27 22:48:03,095 [myid:2] - DEBUG [CommitProcessor:2:FinalRequestProcessor@160] - sessionid:0x34f6b4380d90006
type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a2015-08-27
22:48:03,095 [myid:2] - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@357]
- caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x34f6b4380d90006,
likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:228)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
	at java.lang.Thread.run(Thread.java:745
2015-08-27 22:48:03,095 [myid:2] - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1007]
- Closed socket connection for client /54.200.140.196:39721 which had sessionid 0x34f6b4380d90006

So, it looks like the client got annoyed that it didn’t receive a response in time from
the server, but it also looks like the server was generally responding pretty quickly.

It’s likely that we’re doing something wrong here, but I’m at a loss as to what that
might be.  Does anyone have any insights that might explain this behavior?



Mime
View raw message