zookeeper-user mailing list archives

From Preeti Bhat <preeti.b...@shoregrp.com>
Subject RE: Zookeeper fails to connect in cluster while using DNS
Date Fri, 28 Oct 2016 08:13:07 GMT
Hi Michael,

The client-side log is below. For Route 53, we have associated a single IP address with a single DNS name.

java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxxx.com:80,
xxx.xxx.xxxx.com:80 within 30000 ms
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:181)
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:115)
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:105)
        at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:188)
Caused by: java.util.concurrent.TimeoutException: Could not connect to ZooKeeper xxx.xxx.xxxx.com:80,
xxx.xxx.xxxx.com:80 within 30000 ms
        at org.apache.solr.common.cloud.ConnectionManager.waitForConnected(ConnectionManager.java:235)
        at org.apache.solr.common.cloud.SolrZkClient.<init>(SolrZkClient.java:173)
        ... 3 more
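
The timeout above only says that no server in the connect string answered within 30000 ms. One way to narrow it down, sketched here in Python rather than taken from Solr or ZooKeeper, is to split the connect string into host:port pairs and send each one ZooKeeper's `ruok` four-letter command, which a running server answers with `imok`. The function names (`parse_connect_string`, `ruok`, `check_ensemble`) and the hostnames in the usage note are hypothetical. The sketch assumes four-letter commands are enabled on the servers and does not handle IPv6 literals.

```python
import socket


def parse_connect_string(connect_string):
    """Split 'host1:port1,host2:port2/chroot' into (host, port) pairs."""
    servers = connect_string.split("/", 1)[0]  # drop an optional chroot suffix
    pairs = []
    for entry in servers.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.rpartition(":")
        if not host:
            # No port given; 2181 is ZooKeeper's conventional client port.
            host, port = entry, "2181"
        pairs.append((host, int(port)))
    return pairs


def ruok(host, port, timeout=5.0):
    """Send the 'ruok' four-letter command and return the raw reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"ruok")
        return sock.recv(16).decode("ascii", "replace")


def check_ensemble(connect_string, probe=ruok):
    """Probe every server in the connect string; never raises on network errors."""
    results = []
    for host, port in parse_connect_string(connect_string):
        try:
            results.append((host, port, probe(host, port)))
        except OSError as exc:  # covers DNS failures, refusals, and timeouts
            results.append((host, port, "unreachable: %s" % exc))
    return results
```

Calling `check_ensemble("zk1.example.com:80,zk2.example.com:80")` from the client machine would return one `(host, port, result)` tuple per server, which shows whether the failure is name resolution, a refused connection, or a timeout.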

Thanks and Regards,
Preeti Bhat

-----Original Message-----
From: Michael Han [mailto:hanm@cloudera.com]
Sent: Thursday, October 27, 2016 9:51 AM
To: UserZooKeeper
Subject: Re: Zookeeper fails to connect in cluster while using DNS

These look like server logs. Since the problem is that the ZK client fails to connect to
the server, could you also post the client logs?

For Route 53, if you associate multiple IP addresses with a single DNS name configured in
the ZK ensemble, and for some reason one of those IP addresses does not have a ZK server
process running, the client could fail to connect. I am not sure whether that is your case, though.
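
A quick way to test the multiple-address scenario is to list every address a name resolves to and check each one. The Python sketch below is only an illustration under assumptions: the function names (`resolve_all`, `classify`, `port_open`) and the hostname in the usage note are hypothetical. Because the quoted server log shows the election port bound as xxx.com/127.0.0.1:3888, the sketch also flags loopback addresses, which may come from an /etc/hosts entry on the server itself.

```python
import ipaddress
import socket


def resolve_all(hostname, port, resolver=socket.getaddrinfo):
    """Return the distinct IP addresses that hostname resolves to, in order."""
    infos = resolver(hostname, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        addr = sockaddr[0]
        if addr not in seen:
            seen.append(addr)
    return seen


def classify(addresses):
    """Label each address as 'loopback' or 'non-loopback'."""
    report = []
    for addr in addresses:
        # Strip an IPv6 scope id such as '%eth0' before parsing.
        ip = ipaddress.ip_address(addr.split("%")[0])
        report.append((addr, "loopback" if ip.is_loopback else "non-loopback"))
    return report


def port_open(addr, port, timeout=3.0):
    """Return True if a TCP connection to addr:port succeeds within timeout."""
    try:
        with socket.create_connection((addr, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Running `classify(resolve_all("zk1.example.com", 3888))` on each server and on the client, then `port_open` for every returned address, would show whether any address lacks a listening ZK process or whether a server resolves its own name to 127.0.0.1.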

On Wed, Oct 26, 2016 at 6:24 AM, Preeti Bhat <preeti.bhat@shoregrp.com>
wrote:

> Hi All,
>
> I am getting the messages below in the zookeeper.out file while trying
> to form the ZooKeeper cluster. ZooKeeper is set up on AWS EC2 RHEL
> Linux servers. The configuration works when we use the AWS public DNS
> names, but when we use the specific DNS names created for these
> instances in Route 53, we get the error below.
> I have tried stopping the servers, clearing out the version-2 folder,
> and restarting, with no result.
> The DNS name for the specific server is added to the /etc/hosts file as well.
> Could someone please advise on this?
>
>
> 2016-10-26 09:03:09,991 [myid:] - INFO  [main:QuorumPeerConfig@103] -
> Reading configuration from: /root/zookeeper-3.4.8/bin/../conf/zoo.cfg
> 2016-10-26 09:03:10,054 [myid:] - INFO  [main:QuorumPeerConfig@331] -
> Defaulting to majority quorums
> 2016-10-26 09:03:10,057 [myid:2] - INFO
> [main:DatadirCleanupManager@78]
> - autopurge.snapRetainCount set to 3
> 2016-10-26 09:03:10,057 [myid:2] - INFO
> [main:DatadirCleanupManager@79]
> - autopurge.purgeInterval set to 0
> 2016-10-26 09:03:10,057 [myid:2] - INFO
> [main:DatadirCleanupManager@101]
> - Purge task is not scheduled.
> 2016-10-26 09:03:10,067 [myid:2] - INFO  [main:QuorumPeerMain@127] -
> Starting quorum peer
> 2016-10-26 09:03:10,078 [myid:2] - INFO
> [main:NIOServerCnxnFactory@89] - binding to port 0.0.0.0/0.0.0.0:80
> 2016-10-26 09:03:10,085 [myid:2] - INFO  [main:QuorumPeer@1019] -
> tickTime set to 40000
> 2016-10-26 09:03:10,085 [myid:2] - INFO  [main:QuorumPeer@1039] -
> minSessionTimeout set to 120000
> 2016-10-26 09:03:10,085 [myid:2] - INFO  [main:QuorumPeer@1050] -
> maxSessionTimeout set to 240000
> 2016-10-26 09:03:10,085 [myid:2] - INFO  [main:QuorumPeer@1065] -
> initLimit set to 10
> 2016-10-26 09:03:10,099 [myid:2] - INFO  [ListenerThread:
> QuorumCnxManager$Listener@534] - My election bind port:
> xxx.com/127.0.0.1:3888
> 2016-10-26 09:03:10,108 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:
> 0:0:0:80:QuorumPeer@774] - LOOKING
> 2016-10-26 09:03:10,109 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:
> 0:0:0:80:FastLeaderElection@818] - New election. My id =  2, proposed
> zxid=0x0
> 2016-10-26 09:03:10,116 [myid:2] - INFO  [WorkerReceiver[myid=2]:
> FastLeaderElection@600] - Notification: 1 (message format version), 2
> (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid),
> 0x0
> (n.peerEpoch) LOOKING (my state)
> 2016-10-26 09:03:10,116 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (3, 2)
> 2016-10-26 09:03:10,117 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@810] - Connection broken for id 1, my id =
> 2, error = java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> RecvWorker.run(QuorumCnxManager.java:795)
> 2016-10-26 09:03:10,117 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (4, 2)
> 2016-10-26 09:03:10,118 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
> 2016-10-26 09:03:10,119 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@727] - Interrupted while waiting for
> message on queue java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.
> java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(
> ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> pollSendQueue(QuorumCnxManager.java:879)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> access$500(QuorumCnxManager.java:65)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> SendWorker.run(QuorumCnxManager.java:715)
> 2016-10-26 09:03:10,119 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@736] - Send worker leaving thread
> 2016-10-26 09:03:10,120 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (5, 2)
> 2016-10-26 09:03:10,318 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:
> 0:0:0:80:FastLeaderElection@852] - Notification time out: 400
> 2016-10-26 09:03:10,320 [myid:2] - INFO  [WorkerReceiver[myid=2]:
> FastLeaderElection@600] - Notification: 1 (message format version), 2
> (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid),
> 0x0
> (n.peerEpoch) LOOKING (my state)
> 2016-10-26 09:03:10,321 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (3, 2)
> 2016-10-26 09:03:10,321 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@810] - Connection broken for id 1, my id =
> 2, error = java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> RecvWorker.run(QuorumCnxManager.java:795)
> 2016-10-26 09:03:10,321 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
> 2016-10-26 09:03:10,321 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@727] - Interrupted while waiting for
> message on queue java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.
> java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(
> ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> pollSendQueue(QuorumCnxManager.java:879)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> access$500(QuorumCnxManager.java:65)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> SendWorker.run(QuorumCnxManager.java:715)
> 2016-10-26 09:03:10,321 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@736] - Send worker leaving thread
> 2016-10-26 09:03:10,322 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (4, 2)
> 2016-10-26 09:03:10,322 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (5, 2)
> 2016-10-26 09:03:10,720 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:
> 0:0:0:80:FastLeaderElection@852] - Notification time out: 800
> 2016-10-26 09:03:10,722 [myid:2] - INFO  [WorkerReceiver[myid=2]:
> FastLeaderElection@600] - Notification: 1 (message format version), 2
> (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid),
> 0x0
> (n.peerEpoch) LOOKING (my state)
> 2016-10-26 09:03:10,722 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@810] - Connection broken for id 1, my id =
> 2, error = java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> RecvWorker.run(QuorumCnxManager.java:795)
> 2016-10-26 09:03:10,723 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
> 2016-10-26 09:03:10,722 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (3, 2)
> 2016-10-26 09:03:10,723 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@727] - Interrupted while waiting for
> message on queue java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.
> java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(
> ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> pollSendQueue(QuorumCnxManager.java:879)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> access$500(QuorumCnxManager.java:65)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> SendWorker.run(QuorumCnxManager.java:715)
> 2016-10-26 09:03:10,723 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@736] - Send worker leaving thread
> 2016-10-26 09:03:10,724 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (4, 2)
> 2016-10-26 09:03:10,724 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (5, 2)
> 2016-10-26 09:03:11,522 [myid:2] - INFO  [QuorumPeer[myid=2]/0:0:0:0:0:
> 0:0:0:80:FastLeaderElection@852] - Notification time out: 1600
> 2016-10-26 09:03:11,524 [myid:2] - INFO  [WorkerReceiver[myid=2]:
> FastLeaderElection@600] - Notification: 1 (message format version), 2
> (n.leader), 0x0 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid),
> 0x0
> (n.peerEpoch) LOOKING (my state)
> 2016-10-26 09:03:11,524 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@810] - Connection broken for id 1, my id =
> 2, error = java.io.EOFException
>         at java.io.DataInputStream.readInt(DataInputStream.java:392)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> RecvWorker.run(QuorumCnxManager.java:795)
> 2016-10-26 09:03:11,525 [myid:2] - WARN  [RecvWorker:1:
> QuorumCnxManager$RecvWorker@813] - Interrupting SendWorker
> 2016-10-26 09:03:11,525 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (3, 2)
> 2016-10-26 09:03:11,525 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@727] - Interrupted while waiting for
> message on queue java.lang.InterruptedException
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.
> java:2014)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$
> ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2088)
>         at java.util.concurrent.ArrayBlockingQueue.poll(
> ArrayBlockingQueue.java:418)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> pollSendQueue(QuorumCnxManager.java:879)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.
> access$500(QuorumCnxManager.java:65)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$
> SendWorker.run(QuorumCnxManager.java:715)
> 2016-10-26 09:03:11,525 [myid:2] - WARN  [SendWorker:1:
> QuorumCnxManager$SendWorker@736] - Send worker leaving thread
> 2016-10-26 09:03:11,526 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (4, 2)
> 2016-10-26 09:03:11,527 [myid:2] - INFO  [WorkerSender[myid=2]:
> QuorumCnxManager@199] - Have smaller server identifier, so dropping
> the
> connection: (5, 2)
> 2016-10-26 09:03:11,555 [myid:2] - INFO
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:80:NIOServerCnxnFactory@192] -
> Accepted socket connection from /172.31.18.151:36082
> 2016-10-26 09:03:11,609 [myid:2] - WARN
> [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:80:NIOServerCnxn@357] - caught
> end of stream exception
> EndOfStreamException: Unable to read additional data from client
> sessionid 0x0, likely client has closed socket
>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(
> NIOServerCnxn.java:230)
>         at org.apache.zookeeper.server.NIOServerCnxnFactory.run(
> NIOServerCnxnFactory.java:203)
>         at java.lang.Thread.run(Thread.java:745)
>
>
> Thanks and Regards,
> Preeti Bhat
>
>
>
> NOTICE TO RECIPIENTS: This communication may contain confidential
> and/or privileged information. If you are not the intended recipient
> (or have received this communication in error) please notify the
> sender and it-support@shoregrp.com immediately, and destroy this
> communication. Any unauthorized copying, disclosure or distribution of
> the material in this communication is strictly forbidden. Any views or
> opinions presented in this email are solely those of the author and do
> not necessarily represent those of the company. Finally, the recipient
> should check this email and any attachments for the presence of
> viruses. The company accepts no liability for any damage caused by
> any virus transmitted by this email.
>
>
>


--
Cheers
Michael.


