manifoldcf-dev mailing list archives

From "Graeme Seaton (JIRA)" <j...@apache.org>
Subject [jira] [Created] (CONNECTORS-898) Agents fail to start if ZK ensemble member missing
Date Thu, 20 Feb 2014 18:02:19 GMT
Graeme Seaton created CONNECTORS-898:
----------------------------------------

             Summary: Agents fail to start if ZK ensemble member missing
                 Key: CONNECTORS-898
                 URL: https://issues.apache.org/jira/browse/CONNECTORS-898
             Project: ManifoldCF
          Issue Type: Bug
          Components: Framework agents process
    Affects Versions: ManifoldCF 1.5
         Environment: 4 agents; 3-member ZK ensemble (2 live, 1 dead)
            Reporter: Graeme Seaton


If one member of the ZK ensemble is down but a majority of members is still active, so
that ZK is 'live', then when the agents start up, any agent that tries to connect to the
missing member aborts with:

Opening socket connection to server overlorddev03/10.250.0.36:2181. Will not attempt to authenticate using SASL (unknown error)
71 [main-SendThread(overlorddev03:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)

followed by:

org.apache.manifoldcf.core.interfaces.ManifoldCFException: Initialization failed: KeeperErrorCode = ConnectionLoss for /org.apache.manifoldcf.configuration
        at org.apache.manifoldcf.core.system.ManifoldCF.initializeEnvironment(ManifoldCF.java:269)
        at org.apache.manifoldcf.agents.system.ManifoldCF.initializeEnvironment(ManifoldCF.java:43)
        at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:36)
        at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)

This has a knock-on effect on the other agents, which eventually fail with 'agents process
could not start - shutting down'.

Besides exceptions of this type:

5401 [main-SendThread(overlorddev03:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server overlorddev03/10.250.0.36:2181. Will not attempt to authenticate using SASL (unknown error)
5403 [main-SendThread(overlorddev03:2181)] WARN org.apache.zookeeper.ClientCnxn - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect
java.net.ConnectException: Connection refused
        at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:735)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:350)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1068)
5506 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Opening socket connection to server overlorddev04/10.250.0.46:2181. Will not attempt to authenticate using SASL (unknown error)
5507 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Socket connection established to overlorddev04/10.250.0.46:2181, initiating session

the only other notable exception is:

5509 [main-SendThread(overlorddev04:2181)] INFO org.apache.zookeeper.ClientCnxn - Session establishment complete on server overlorddev04/10.250.0.46:2181, sessionid = 0x4444f2cb0590087, negotiated timeout = 8000
org.apache.manifoldcf.core.interfaces.ManifoldCFException: KeeperErrorCode = ConnectionLoss for /org.apache.manifoldcf.flags-_AGENTRUN_
        at org.apache.manifoldcf.core.lockmanager.ZooKeeperConnection.checkGlobalFlag(ZooKeeperConnection.java:499)
        at org.apache.manifoldcf.core.lockmanager.ZooKeeperLockManager.checkGlobalFlag(ZooKeeperLockManager.java:787)
        at org.apache.manifoldcf.agents.system.AgentsDaemon.runAgents(AgentsDaemon.java:110)
        at org.apache.manifoldcf.agents.AgentRun.doExecute(AgentRun.java:64)
        at org.apache.manifoldcf.agents.BaseAgentsInitializationCommand.execute(BaseAgentsInitializationCommand.java:37)
        at org.apache.manifoldcf.agents.AgentRun.main(AgentRun.java:93)
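
For illustration only, the behaviour one might expect instead is that a transient ConnectionLoss during startup is retried rather than treated as fatal, since the client does go on to reach a live ensemble member (as the overlorddev04 lines above show). The sketch below is a minimal, generic retry wrapper under that assumption. It is not ManifoldCF or ZooKeeper code: RetryOnConnectionLoss, TransientConnectionException, and callWithRetry are hypothetical names, and the exception is a stand-in for a KeeperErrorCode = ConnectionLoss failure.

```java
import java.util.concurrent.Callable;

public class RetryOnConnectionLoss {

    /** Hypothetical stand-in for a transient failure such as ConnectionLoss. */
    public static class TransientConnectionException extends Exception {
        public TransientConnectionException(String message) {
            super(message);
        }
    }

    /**
     * Runs op, retrying when it fails with TransientConnectionException.
     * Gives up after maxAttempts tries and rethrows the last failure.
     * Any other exception propagates immediately without a retry.
     */
    public static <T> T callWithRetry(Callable<T> op, int maxAttempts, long delayMs)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        TransientConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TransientConnectionException e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Pause so the client has time to fail over to a live member.
                    Thread.sleep(delayMs);
                }
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated startup step: fails twice with a connection loss, then succeeds.
        String result = callWithRetry(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new TransientConnectionException("ConnectionLoss (simulated)");
            }
            return "initialized";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

Whether a wrapper of this kind belongs around initializeEnvironment and checkGlobalFlag, or inside the ZooKeeper connection handling itself, is an open question; the sketch only shows the retry idea.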