From: Check Peck
Date: Wed, 11 Feb 2015 09:56:10 -0800
Subject: Re: Cannot open channel to 2 at election address ERROR while starting zookeeper
To: user
Reply-To: user@zookeeper.apache.org

Can anyone help me with this? Has anyone seen this kind of issue?

On Tue, Feb 10, 2015 at 4:26 PM, Check Peck wrote:

> I have also verified that there is no firewall issue. Does anyone know
> what this error is about and how we can resolve it?
>
> On Tue, Feb 10, 2015 at 9:20 AM, Check Peck wrote:
>
>> I am trying to set up a 5-node ZooKeeper ensemble managed through
>> Exhibitor. I have 5 machines, and on each machine I will be running
>> Exhibitor and ZooKeeper. Below is my zoo.cfg file, which is generated
>> by Exhibitor.
>>
>> #Auto-generated by Exhibitor - Mon Feb 09 10:18:35 GMT-07:00 2015
>> #Mon Feb 09 10:18:35 GMT-07:00 2015
>> server.3=machineC.host.com\:2888\:3888
>> server.2=machineB.host.com\:2888\:3888
>> server.1=machineA.host.com\:2888\:3888
>> initLimit=10
>> syncLimit=5
>> maxClientCnxns=21000
>> clientPort=2181
>> tickTime=2000
>> dataDir=/opt/zookeeper/data
>> dataLogDir=/opt/zookeeper/data
>> server.5=machineD.host.com\:2888\:3888
>> server.4=machineE.host.com\:2888\:3888
>>
>> As soon as I start ZooKeeper through the Exhibitor config panel, I can
>> see all five machines in my control panel, but they are all yellow,
>> which means "ZooKeeper is running, but can’t communicate with the rest
>> of the ensemble". In my Exhibitor logs, I see the following, which
>> includes some ERROR entries.
>>
>> dev
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Exhibitor started [main]
>> INFO org.mortbay.log Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog [main]
>> INFO org.mortbay.log jetty-6.1.x [main]
>> INFO org.mortbay.log Started SocketConnector@0.0.0.0:8080 [main]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog State: not serving [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default
>> [pool-2-thread-1]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-1]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30005 of 40000 ms before restarting [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-1]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-1]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60014 ms pause [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-3]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-2]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-3]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30005 of 40000 ms before restarting [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Restarting down/not-serving ZooKeeper after 60008 ms pause [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to stop instance [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Attempting to start/restart ZooKeeper [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Kill attempted result: 0 [ActivityQueue-0]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog Process started via: /opt/zookeeper/zookeeper-3.4.6/bin/zkServer.sh [ActivityQueue-0]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: JMX enabled by default [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: -Xmx2048m -Djava.net.preferIPv4Stack=true [pool-2-thread-3]
>> ERROR com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Using config: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg [pool-2-thread-2]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper Server: Starting zookeeper ... STARTED [pool-2-thread-3]
>> INFO com.netflix.exhibitor.core.activity.ActivityLog ZooKeeper down/not-serving waiting 30004 of 40000 ms before restarting [ActivityQueue-0]
>>
>> And in my ZooKeeper logs, I am seeing these:
>>
>> 2015-02-09 00:11:19,355 [myid:] - INFO [main:QuorumPeerConfig@103] - Reading configuration from: /opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg
>> 2015-02-09 00:11:19,365 [myid:] - INFO [main:QuorumPeerConfig@340] - Defaulting to majority quorums
>> 2015-02-09 00:11:19,368 [myid:1] - INFO [main:DatadirCleanupManager@78] - autopurge.snapRetainCount set to 3
>> 2015-02-09 00:11:19,368 [myid:1] - INFO [main:DatadirCleanupManager@79] - autopurge.purgeInterval set to 0
>> 2015-02-09 00:11:19,369 [myid:1] - INFO [main:DatadirCleanupManager@101] - Purge task is not scheduled.
>> 2015-02-09 00:11:19,379 [myid:1] - INFO [main:QuorumPeerMain@127] - Starting quorum peer
>> 2015-02-09 00:11:19,397 [myid:1] - INFO [main:NIOServerCnxnFactory@94] - binding to port 0.0.0.0/0.0.0.0:2181
>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@959] - tickTime set to 2000
>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@979] - minSessionTimeout set to -1
>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@990] - maxSessionTimeout set to -1
>> 2015-02-09 00:11:19,414 [myid:1] - INFO [main:QuorumPeer@1005] - initLimit set to 10
>> 2015-02-09 00:11:19,431 [myid:1] - INFO [Thread-1:QuorumCnxManager$Listener@504] - My election bind port: machineA.host.com/127.0.1.1:3888
>> 2015-02-09 00:11:19,440 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:QuorumPeer@714] - LOOKING
>> 2015-02-09 00:11:19,441 [myid:1] - INFO [QuorumPeer[myid=1]/0.0.0.0:2181:FastLeaderElection@815] - New election. My id = 1, proposed zxid=0x0
>> 2015-02-09 00:11:19,443 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@597] - Notification: 1 (message format version), 1 (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x0 (n.peerEpoch) LOOKING (my state)
>> 2015-02-09 00:11:19,445 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 2 at election address machineB.host.com/10.52.81.211:3888
>> java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
>>     at java.net.Socket.connect(Socket.java:546)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
>>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
>>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
>>     at java.lang.Thread.run(Thread.java:679)
>> 2015-02-09 00:11:19,449 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 3 at election address machineC.host.com/10.57.78.941:3888
>> java.net.ConnectException: Connection refused
>>     at java.net.PlainSocketImpl.socketConnect(Native Method)
>>     at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:327)
>>     at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:193)
>>     at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:180)
>>     at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:384)
>>     at java.net.Socket.connect(Socket.java:546)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.connectOne(QuorumCnxManager.java:368)
>>     at org.apache.zookeeper.server.quorum.QuorumCnxManager.toSend(QuorumCnxManager.java:341)
>>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.process(FastLeaderElection.java:449)
>>     at org.apache.zookeeper.server.quorum.FastLeaderElection$Messenger$WorkerSender.run(FastLeaderElection.java:430)
>>     at java.lang.Thread.run(Thread.java:679)
>> 2015-02-09 00:11:19,450 [myid:1] - WARN [WorkerSender[myid=1]:QuorumCnxManager@382] - Cannot open channel to 4 at election address machineD.host.com/10.59.576.12:3888
>>
>> I am running Exhibitor 1.5.3 and ZooKeeper 3.4.6. Is there anything I
>> am doing wrong? I have searched for this ERROR and was not able to
>> find anything concrete. I have also verified that myid is generated
>> successfully on each machine.
>>
>> Is this a known issue? After searching on Google, I have seen other
>> people having the same issue.
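
The ZooKeeper log quoted above reports the election listener bound to machineA.host.com/127.0.1.1:3888, a loopback address, while every connection to a peer's port 3888 is refused. If each node resolves its own hostname to loopback in the same way, its election port would only be reachable locally, which might explain the refusals; this is a hypothesis, not something the logs confirm. The sketch below is one way to check it. It assumes Python 3 and a zoo.cfg in the Exhibitor-generated format shown above; parse_servers, is_loopback, can_connect, and check_ensemble are hypothetical helper names, not part of ZooKeeper or Exhibitor.

```python
import ipaddress
import re
import socket

# Matches Exhibitor-style lines such as: server.1=machineA.host.com\:2888\:3888
SERVER_LINE = re.compile(r"^server\.(\d+)=(.+)$")


def parse_servers(cfg_text):
    """Return {server_id: (host, quorum_port, election_port)} from zoo.cfg text."""
    servers = {}
    for line in cfg_text.splitlines():
        match = SERVER_LINE.match(line.strip())
        if not match:
            continue
        # Java properties files escape ':' as '\:'; undo that before splitting.
        parts = match.group(2).replace("\\:", ":").split(":")
        host, quorum_port, election_port = parts[:3]
        servers[int(match.group(1))] = (host, int(quorum_port), int(election_port))
    return servers


def is_loopback(address):
    """True for 127.0.0.0/8 and ::1, which covers the 127.0.1.1 seen in the log."""
    return ipaddress.ip_address(address).is_loopback


def can_connect(host, port, timeout=2.0):
    """Attempt a plain TCP connect; any failure (refused, timeout) returns False."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ensemble(cfg_path):
    """Print name resolution and election-port reachability for each server.N entry."""
    with open(cfg_path) as cfg:
        servers = parse_servers(cfg.read())
    for sid, (host, _quorum_port, election_port) in sorted(servers.items()):
        try:
            address = socket.gethostbyname(host)
        except OSError:
            print(f"server.{sid} {host}: does not resolve")
            continue
        kind = "loopback" if is_loopback(address) else "non-loopback"
        state = "open" if can_connect(host, election_port) else "refused or unreachable"
        print(f"server.{sid} {host} -> {address} ({kind}), port {election_port}: {state}")
```

Calling check_ensemble("/opt/zookeeper/zookeeper-3.4.6/bin/../conf/zoo.cfg") on each machine prints one line per server.N entry. An entry that reports loopback for the machine's own hostname would be worth comparing against that host's name resolution settings.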