Thanks Chris. But it happens even if I just take down the ZK process, and
the IP addresses seem to remain the same.
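
In case it helps anyone check the same thing, here is a minimal sketch of how one might confirm that the ensemble hostnames still resolve to the addresses shown in the logs. The function names (resolve_ipv4, check_ensemble) are made up for illustration and are not part of ZooKeeper. Note that this only reports what the OS resolver returns right now; it says nothing about what a running ZK JVM has already resolved and kept.

```python
import socket
from typing import Callable, Dict, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the set of IPv4 addresses the local resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def check_ensemble(
    expected: Dict[str, str],
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> Dict[str, bool]:
    """Map each hostname to True if it still resolves to its expected IP.

    `expected` is a hypothetical mapping of hostname -> IP recorded earlier
    (for example, copied from the server logs). A hostname that fails to
    resolve at all is reported as False.
    """
    results: Dict[str, bool] = {}
    for host, ip in expected.items():
        try:
            results[host] = ip in resolver(host)
        except socket.gaierror:
            results[host] = False
    return results
```

For example, calling check_ensemble({"hvs1.dwa.local": "192.168.8.10", "hvs2.dwa.local": "192.168.8.11"}) on each node before and after the restart would show whether either name has moved to a different address.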
On Wed, Sep 9, 2015 at 11:23 PM, Chris Nauroth <cnauroth@hortonworks.com>
wrote:
> Hello Bill,
>
> When the VMs restart, is it possible that they are assigned different IP
> addresses, despite retaining their original hostnames?
>
> The reason I ask is that we currently have a known issue in that a running
> ZooKeeper server will not redo DNS resolution for previously encountered
> hostnames in the ensemble. This is documented in issue ZOOKEEPER-1506,
> where a proposed patch is undergoing review and testing.
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-1506
>
>
> If IP addresses are changing after VM restarts in your environment, then
> it seems plausible that you're seeing the symptoms of ZOOKEEPER-1506.
>
> --Chris Nauroth
>
>
>
>
> On 9/9/15, 11:09 PM, "Bill Hastings" <bllhastings@gmail.com> wrote:
>
> >On the node that is not the leader, I get the following messages in the
> >log:
> >
> >04:26:43,076 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >04:26:43,089 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >06:35:25,844 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.11:51367
> >06:38:00,399 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.11:51539
> >07:18:27,940 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.11:52720
> >07:33:58,042 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:33:59,449 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:00,854 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:02,257 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:03,660 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:05,063 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:06,266 INFO LeaderElection:187 - Server address: hvs2.dwa.local/
> >192.168.8.11:3000
> >07:34:06,585 WARN Learner:234 - Unexpected exception, tries=0, connecting
> >to hvs2.dwa.local/192.168.8.11:3000
> >07:55:28,865 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:29,066 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:29,471 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:30,275 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:31,878 WARN QuorumCnxManager:382 - Cannot open channel to 2 at
> >election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:34,106 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.11:55863
> >07:58:01,872 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.11:56662
> >
> >On the leader I get the following:
> >
> >4:19:50,815 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >04:20:46,903 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.10:46459
> >06:36:04,561 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:04,771 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:05,175 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:05,980 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:07,585 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:10,789 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:17,194 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:29,999 WARN QuorumCnxManager:382 - Cannot open channel to 1 at
> >election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:53,578 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.10:50285
> >07:16:53,244 WARN LearnerHandler:646 - ******* GOODBYE
> >/192.168.8.10:42097
> >********
> >07:17:21,117 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.10:51044
> >07:32:57,213 INFO LeaderElection:187 - Server address: hvs1.dwa.local/
> >192.168.8.10:3000
> >07:32:58,427 INFO LeaderElection:187 - Server address: hvs1.dwa.local/
> >192.168.8.10:3000
> >07:32:59,631 INFO LeaderElection:187 - Server address: hvs1.dwa.local/
> >192.168.8.10:3000
> >07:34:00,575 WARN LearnerHandler:646 - ******* GOODBYE
> >/192.168.8.10:43186
> >********
> >07:56:11,493 WARN LearnerHandler:646 - ******* GOODBYE
> >/192.168.8.10:43536
> >********
> >07:56:55,045 INFO QuorumCnxManager:511 - Received connection request /
> >192.168.8.10:51949
> >
> >On Wed, Sep 9, 2015 at 10:42 PM, Bill Hastings <bllhastings@gmail.com>
> >wrote:
> >
> >> Hi All
> >>
> >> I am running ZK as a 3 node cluster. Each ZK instance is a VMware VM on
> >> a distinct ESX host. Let's assume the three VMs are A, B and C, where A
> >> is the leader. If I take down VMs B and C and then bring one of them
> >> back up, the ZK cluster is never formed unless I bounce VM A. How can I
> >> troubleshoot this? This does not happen in a physical environment.
> >>
> >> --
> >> Cheers
> >> Bill
> >>
> >
> >
> >
> >--
> >Cheers
> >Bill
>
>
--
Cheers
Bill