Message-ID: <4AC54191.6060400@apache.org>
Date: Thu, 01 Oct 2009 16:56:01 -0700
From: Patrick Hunt
To: zookeeper-user@hadoop.apache.org, hector@nimblestorage.com
Subject: Re: problem starting ensemble mode

Hi Hector,

looks like a connectivity issue to me:
NoRouteToHostException.

3888 is the election port
2888 is the quorum port

Basically, the ensemble uses the election port for leader election. Once a leader is elected, it then uses the quorum port for subsequent communication, which is why nothing is listening on 2888 yet.

Could it be a firewall issue? Your configs/logs look OK to me otherwise.

Try using something like telnet to verify connectivity on the 3888 & 2888 ports between the two servers.

Patrick

Hector Yuen wrote:
> Hi all,
>
> I am trying to start ZooKeeper on two nodes. The configuration file I have
> is
>
> tickTime=2000
> initLimit=10
> syncLimit=5
> dataDir=/var/zookeeper
> clientPort=2181
> server.1=hec-bp1:2888:3888
> server.2=hec-bp2:2888:3888
>
>
> I also have two files /var/zookeeper/myid on each of the machines, the
> files contain 1 and 2 on each of the servers
>
>
> When I start, I get the following
>
> Starting zookeeper ...
> STARTED
> hector@hec-bp2:/zookeeper$ 2009-10-01 15:48:15,786 - INFO [main:QuorumPeerConfig@80] - Reading configuration from: /zookeeper/bin/../conf/zoo.cfg
> 2009-10-01 15:48:15,882 - INFO [main:QuorumPeerConfig@232] - Defaulting to majority quorums
> 2009-10-01 15:48:15,899 - INFO [main:QuorumPeerMain@118] - Starting quorum peer
> 2009-10-01 15:48:15,943 - INFO [Thread-1:QuorumCnxManager$Listener@409] - My election bind port: 3888
> 2009-10-01 15:48:15,961 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumPeer@487] - LOOKING
> 2009-10-01 15:48:15,963 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@579] - New election: -1
> 2009-10-01 15:48:15,978 - WARN [WorkerSender Thread:QuorumCnxManager@336] - Cannot open channel to 1 at election address hec-bp1.admin.nimblestorage.com/10.12.6.192:3888
> java.net.NoRouteToHostException: No route to host
>     at sun.nio.ch.Net.connect(Native Method)
>     at sun.nio.ch.SocketChannelImpl.connect(Unknown Source)
>     at java.nio.channels.SocketChannel.open(Unknown Source)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:323)
>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:302)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:323)
>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:296)
>     at java.lang.Thread.run(Unknown Source)
> 2009-10-01 15:48:15,981 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@618] - Notification: 2, -1, 1, 2, LOOKING, LOOKING, 2
> 2009-10-01 15:48:15,981 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:FastLeaderElection@642] - Adding vote
> 2009-10-01 15:48:16,184 - WARN [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumCnxManager@336] - Cannot open channel to 1 at election address hec-bp1.admin.nimblestorage.com/10.12.6.192:3888
>
>
> I can expect these kinds of messages when the other server hasn't been
> started, but even after a while it keeps logging these messages.
>
> I can ping and ssh between the machines.
> I noticed that just port 3888 is listening when I do netstat -an, why is
> port 2888 not being used?
>
> Any ideas?
>
> Thanks
> -h
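
P.S. If telnet isn't handy, the same port check can be scripted. Here is a minimal sketch in Python, assuming Python 3 is available on both servers. The function names can_connect and check_peer are made up for illustration and are not part of ZooKeeper; the port numbers come from the server.N lines in the quoted zoo.cfg.

```python
import socket

# Ports taken from the server.N lines in the quoted zoo.cfg.
QUORUM_PORT = 2888
ELECTION_PORT = 3888


def can_connect(host, port, timeout=3.0):
    """Attempt one TCP connection to host:port and return (ok, detail)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "connected"
    except OSError as exc:
        # OSError covers refused connections, timeouts, DNS failures and
        # "No route to host", the error that appears in the quoted log.
        return False, f"{type(exc).__name__}: {exc}"


def check_peer(host, ports=(ELECTION_PORT, QUORUM_PORT), timeout=3.0):
    """Try each port on one peer and return {port: (ok, detail)}."""
    return {port: can_connect(host, port, timeout) for port in ports}


# Hypothetical usage from hec-bp2, checking the other server:
#   for port, (ok, detail) in check_peer("hec-bp1").items():
#       print(port, "OK" if ok else "FAILED", detail)
```

When reading the result, a "connection refused" on 2888 before a leader has been elected is not necessarily a problem, since nothing is listening there yet. A "No route to host" or a timeout on 3888, while ping and ssh work, would point more toward a firewall rule between the two machines.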