curator-dev mailing list archives

From "Eric Tschetter (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CURATOR-36) Bad session, infinite connection loop from Curator
Date Tue, 18 Jun 2013 20:10:20 GMT
Eric Tschetter created CURATOR-36:
-------------------------------------

             Summary: Bad session, infinite connection loop from Curator
                 Key: CURATOR-36
                 URL: https://issues.apache.org/jira/browse/CURATOR-36
             Project: Apache Curator
          Issue Type: Bug
          Components: Framework
    Affects Versions: 2.0.1-incubating
            Reporter: Eric Tschetter


On the ZK clients that run Curator, we sometimes see reconnect loops like the
following.  They never terminate on their own; the loop continues until the process is restarted.

2013-06-18 19:57:28,660 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager
- State change: RECONNECTED
2013-06-18 19:57:28,660 WARN [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager
- ConnectionStateManager queue full - dropping events to make room
2013-06-18 19:57:28,786 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager
- State change: SUSPENDED
2013-06-18 19:57:28,786 WARN [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager
- ConnectionStateManager queue full - dropping events to make room
2013-06-18 19:57:29,048 INFO [main-SendThread(ip-10:2181)] org.apache.zookeeper.ClientCnxn
- Opening socket connection to server ip-10/10.:2181. Will not attempt to authenticate using
SASL (Unable to locate a login configuration)
2013-06-18 19:57:29,049 INFO [main-SendThread(ip-10:2181)] org.apache.zookeeper.ClientCnxn
- Socket connection established to ip-10/10.:2181, initiating session
2013-06-18 19:57:29,160 WARN [main-SendThread(ip-10:2181)] org.apache.zookeeper.ClientCnxnSocket
- Connected to an old server; r-o mode will be unavailable
2013-06-18 19:57:29,160 INFO [main-SendThread(ip-10:2181)] org.apache.zookeeper.ClientCnxn
- Session establishment complete on server ip-10/10.:2181, sessionid = 0x63f5865925e0010,
negotiated timeout = 30000
2013-06-18 19:57:29,177 INFO [main-EventThread] org.apache.curator.framework.state.ConnectionStateManager
- State change: RECONNECTED


On the ZK server side, the log shows:

2013-06-18 20:07:31,215 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1580]
- Established session 0x63f5865925e0010 with negotiated timeout 30000 for client /10.:56263
2013-06-18 20:07:31,324 - WARN  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@639]
- Exception causing close of session 0x63f5865925e0010 due to java.io.IOException: Len error
6736057
2013-06-18 20:07:31,325 - INFO  [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1435]
- Closed socket connection for client /10.:56263 which had sessionid 0x63f5865925e0010

So there appears to be a problem recovering the session.  I don't know exactly
what is causing it, but it would be great if Curator could notice that it is
failing to get its session back and instead make a brand new connection.

This might be doable in reaction to the ConnectionStateManager queue filling
up?
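
As a rough illustration of the kind of detection being asked for, here is a minimal
sketch in plain Java. It is not Curator code: the class name FlapDetector, its method
recordSuspended, and the threshold and window parameters are all hypothetical. It counts
SUSPENDED events in a sliding time window and reports when the count exceeds a threshold,
which is one possible signal that the client is stuck in a reconnect loop.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical helper (not part of Curator): flags a likely reconnect loop
 * when more than maxSuspends SUSPENDED events occur within windowMillis.
 */
class FlapDetector {
    private final int maxSuspends;
    private final long windowMillis;
    // Timestamps of recent SUSPENDED events, oldest first.
    private final Deque<Long> suspendTimes = new ArrayDeque<>();

    FlapDetector(int maxSuspends, long windowMillis) {
        this.maxSuspends = maxSuspends;
        this.windowMillis = windowMillis;
    }

    /**
     * Records one SUSPENDED event observed at nowMillis.
     * Returns true when the threshold is exceeded; the history is then
     * cleared so the caller can act once and start counting afresh.
     */
    synchronized boolean recordSuspended(long nowMillis) {
        suspendTimes.addLast(nowMillis);
        // Drop events that have fallen out of the sliding window.
        while (!suspendTimes.isEmpty()
                && nowMillis - suspendTimes.peekFirst() > windowMillis) {
            suspendTimes.removeFirst();
        }
        if (suspendTimes.size() > maxSuspends) {
            suspendTimes.clear();
            return true;
        }
        return false;
    }
}
```

An application could call recordSuspended(System.currentTimeMillis()) from a
ConnectionStateListener whenever the new state is SUSPENDED and, on a true result, close
the existing client and build a fresh one. Whether that clears the underlying session
problem is an assumption, not something the logs above confirm.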
