Hi Sampath,

When a server starts, it tries to contact the other servers immediately, and it backs off if it gets no response.
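
To make the back-off concrete, here is a minimal sketch in Java. It is not ZooKeeper's actual code: the class BackoffSketch, its method names, and the delay values are all made up for illustration. It only shows the general pattern of retrying a refused connection with a growing wait between attempts.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class BackoffSketch {
    // Hypothetical values, chosen for illustration only.
    static final int INITIAL_DELAY_MS = 200;
    static final int MAX_DELAY_MS = 60_000;

    /** Doubles the delay, capped at maxMs. */
    static int nextDelayMs(int currentMs, int maxMs) {
        return (int) Math.min((long) currentMs * 2, maxMs);
    }

    /** Returns true if a TCP connection to host:port opens within timeoutMs. */
    static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // "Connection refused" ends up here while the peer is still down.
            return false;
        }
    }

    /** Tries up to maxAttempts times, sleeping longer after each failure. */
    static boolean connectWithBackoff(String host, int port, int maxAttempts)
            throws InterruptedException {
        int delayMs = INITIAL_DELAY_MS;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (canConnect(host, port, 1000)) {
                return true;
            }
            if (attempt < maxAttempts) {
                Thread.sleep(delayMs);
                delayMs = nextDelayMs(delayMs, MAX_DELAY_MS);
            }
        }
        return false;
    }

    public static void main(String[] args) throws InterruptedException {
        // 3888 is the election port of server 2 in the quoted log.
        boolean up = connectWithBackoff("localhost", 3888, 5);
        System.out.println(up ? "peer reachable" : "peer still down");
    }
}
```

With a pattern like this, the first attempt against a peer that has not started yet always fails, so a warning on that first attempt is expected whatever the later delays are.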

It is true that the servers are unlikely to start at exactly the same time, so you will get such warnings. However, I don't really see the point of adding such a configuration parameter. It is really difficult to estimate how much delay is sufficient: with an aggressive estimate you will most likely get the warning anyway, and with a conservative estimate you will wait longer than necessary.

-Flavio

On Aug 18, 2011, at 5:40 AM, Sampath Perera wrote:

Hi,

We have a deployment of a 3-node ZooKeeper quorum. When we start the 3
ZooKeeper nodes, the first node to start prints the following connection
refused exceptions, which is expected as nodes 2 and 3 are yet to be
started. This seems to be because FastLeaderElection tries to connect to
the other nodes specified in the quorum.

So my question is whether it is possible to configure an initial delay for
the FastLeaderElection to be kicked off?

The rationale is that it is highly unlikely that all 3 nodes start at
the same time, even when we try to issue the startup commands at the same
time. With a delay we could get rid of this stack trace from the logs; it
triggers warnings in the tools that monitor the logs, yet it is not
really a WARN but an expected error.

2011-08-18 08:53:15,530 [-] [WorkerSender Thread]  WARN QuorumCnxManager Cannot open channel to 2 at election address localhost/127.0.0.1:3888
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
   at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)
   at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:340)
   at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:360)
   at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:333)
   at java.lang.Thread.run(Thread.java:662)
2011-08-18 08:53:15,532 [-] [WorkerSender Thread]  WARN QuorumCnxManager Cannot open channel to 3 at election address localhost/127.0.0.1:3889
java.net.ConnectException: Connection refused
   at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
   at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
   at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
   at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:371)
   at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:340)
   at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:360)
   at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:333)
   at java.lang.Thread.run(Thread.java:662)

--
Thanks,
Sampath
http://adroitlogic.org
