zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Hunt <ph...@apache.org>
Subject Re: zk keeps disconnecting and reconnecting
Date Thu, 25 Aug 2011 16:34:39 GMT
The client seeing the problem in this case is 3.3.0, I see this based
on the line number in the stack trace not matching up with 3.3.3, with
3.3.0 it's this line:

           event.setPath(serverPath.substring(chrootPath.length()));

so for some reason your chroot path is negative in length? That's just
not possible (string.length() should never return negative).

What JVM are you using? What's your client connect string look like?

Patrick

On Tue, Aug 23, 2011 at 2:58 PM, Jun Rao <junrao@gmail.com> wrote:
> I have a ZK server cluster running on 4 nodes (version 3.3.3) and a few ZK
> clients (version 3.3.0). After the clients have been running for a while,
> each of them starts to constantly disconnect and reconnect to the ZK server.
> On the client, I saw lots of entries like the following:
> 2011/08/23 14:42:06.579 INFO [ClientCnxn]
> [main-SendThread(esv4-app27.stg:12913)] [kafka] Opening socket connection to
> server esv4-app28.stg/172.18.98.101:12913
> 2011/08/23 14:42:06.579 INFO [ClientCnxn]
> [main-SendThread(esv4-app28.stg:12913)] [kafka] Socket connection
> established to esv4-app28.stg/172.18.98.101:12913, initiating session
> 2011/08/23 14:42:06.581 INFO [ClientCnxn]
> [main-SendThread(esv4-app28.stg:12913)] [kafka] Session establishment
> complete on server esv4-app28.stg/172.18.98.101:12913, sessionid =
> 0x331f77a1ed80004, negotiated timeout = 6000
> 2011/08/23 14:42:06.581 INFO [ZkClient] [main-EventThread] [kafka] zookeeper
> state changed (SyncConnected)
> 2011/08/23 14:42:06.583 WARN [ClientCnxn]
> [main-SendThread(esv4-app28.stg:12913)] [kafka] Session 0x331f77a1ed80004
> for server esv4-app28.stg/172.18.98.101:12913, unexpected error, closing
> socket connection and attempting reconnect
> java.lang.StringIndexOutOfBoundsException: String index out of range:
> -3        at java.lang.String.substring(String.java:1937)
>        at java.lang.String.substring(String.java:1904)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:753)
> at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:840)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1089)
> 2011/08/23 14:42:06.683 INFO [ZkClient] [main-EventThread] [kafka] zookeeper
> state changed (Disconnected)
> 2011/08/23 14:42:07.510 INFO [ClientCnxn]
> [main-SendThread(esv4-app28.stg:12913)] [kafka] Opening socket connection to
> server esv4-app29.stg/172.18.98.89:12913
> 2011/08/23 14:42:07.511 INFO [ClientCnxn]
> [main-SendThread(esv4-app29.stg:12913)] [kafka] Socket connection
> established to esv4-app29.stg/172.18.98.89:12913, initiating session
> 2011/08/23 14:42:07.512 INFO [ClientCnxn]
> [main-SendThread(esv4-app29.stg:12913)] [kafka] Session establishment
> complete on server esv4-app29.stg/172.18.98.89:12913, sessionid = 0x331f
> 77a1ed80004, negotiated timeout = 6000
> 2011/08/23 14:42:07.513 INFO [ZkClient] [main-EventThread] [kafka] zookeeper
> state changed (SyncConnected)
> 2011/08/23 14:42:07.552 WARN [ClientCnxn]
> [main-SendThread(esv4-app29.stg:12913)] [kafka] Session 0x331f77a1ed80004
> for server esv4-app29.stg/172.18.98.89:12913, unexpected error, clos
> ing socket connection and attempting reconnect
> java.lang.StringIndexOutOfBoundsException: String index out of range:
> -3        at java.lang.String.substring(String.java:1937)
>        at java.lang.String.substring(String.java:1904)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:753)
> at org.apache.zookeeper.ClientCnxn$SendThread.doIO(ClientCnxn.java:840)
>        at
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1089)
> 2011/08/23 14:42:07.653 INFO [ZkClient] [main-EventThread] [kafka] zookeeper
> state changed (Disconnected)
>
> On the ZK server, I saw lots of entries like these:
> 2011-08-23 14:34:28,802 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn$Factory@251] - Accepted socket
> connection from /172.18.98.95:345072011-08-23 14:34:28,803 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client
> attempting to renew session 0x231db9b5f3f010e at /172.18.98.95:34507
> 2011-08-23 14:34:28,803 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:Learner@103] - Revalidating client:
> 1581489222990563982011-08-23 14:34:28,803 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:12913:NIOServerCnxn@1580] - Established session
> 0x231db9b5f3f010e with negotiated timeout 6000 for client /172.18.98.95:345
> 072011-08-23 14:34:28,805 - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn@634] - EndOfStreamException: Unable to
> read additional data from client sessionid 0x231db9b5f3
> f010e, likely client has closed socket
> 2011-08-23 14:34:28,805 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed socket connection for
> client /172.18.98.95:34507 which had sessionid 0x231db9b5
> f3f010e
> 2011-08-23 14:34:29,512 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn$Factory@251] - Accepted socket
> connection from /172.18.98.94:588812011-08-23 14:34:29,512 - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:12913:NIOServerCnxn@770] - Client
> attempting to renew session 0x231db9b5f3f010c at /172.18.98.94:58881
> 2011-08-23 14:34:29,512 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:Learner@103] - Revalidating client:
> 1581489222990563962011-08-23 14:34:29,513 - INFO
> [QuorumPeer:/0:0:0:0:0:0:0:0:12913:NIOServerCnxn@1580] - Established session
> 0x231db9b5f3f010c with negotiated timeout 6000 for client /172.18.98.94:588
> 812011-08-23 14:34:29,514 - WARN  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn@634] - EndOfStreamException: Unable to
> read additional data from client sessionid 0x231db9b5f3
> f010c, likely client has closed socket
> 2011-08-23 14:34:29,514 - INFO  [NIOServerCxn.Factory:
> 0.0.0.0/0.0.0.0:12913:NIOServerCnxn@1435] - Closed socket connection for
> client /172.18.98.94:58881 which had sessionid 0x231db9b5
> f3f010c
>
> Anybody knows what could be the problem? A few other things that may help
> understand the problem: each ZK client is also connecting to another cluster
> of ZK servers in the same JVM and all the ZK connections use the ZK
> namespace (chroot).
>
> Thanks,
>
> Jun
>

Mime
View raw message