helix-user mailing list archives

From Zhen Zhang <nehzgn...@gmail.com>
Subject Re: ZooKeeper disconnects on controller
Date Sat, 02 May 2015 20:39:31 GMT
You may also check the zookeeper log to see if there are any error/exception
messages.

On Sat, May 2, 2015 at 1:08 PM, kishore g <g.kishore@gmail.com> wrote:

> Is the zookeeper quorum working fine? Can you run echo stat | nc zkhost zkPort
> for each zk server and paste the output?
>  On May 2, 2015 1:02 PM, "Varun Sharma" <varun@pinterest.com> wrote:
>
>> We are also seeing that all our machines (participants and controller)
>> are connecting to the same zookeeper machine which is rather weird - it
>> also makes it hard to scale up traffic via observers. Is the following the
>> right way to pass the zookeeper string (with comma separation):
>>
>> zk001:2181, zk002:2181,zk003:2181
>>
>> Thanks
>> Varun
>>
>> On Fri, May 1, 2015 at 3:32 PM, Varun Sharma <varun@pinterest.com> wrote:
>>
>>> Hi,
>>>
>>> We are seeing zookeeper disconnects on the controller, and the controller
>>> gets into a state from which it cannot reconnect. We see messages like
>>> the ones below over and over again. It keeps trying to re-establish
>>> connections against the same session ID and keeps failing. On the other
>>> hand, the participants see one hiccup in their zookeeper connection
>>> but gracefully reconnect. What would cause the controller to keep
>>> retrying but failing to connect even after zookeeper comes back to a
>>> healthy state?
>>>
>>> 2015-05-01 20:47:02,865 [main-SendThread(terrapinzk001a:2181)]
>>> (ClientCnxn.java:1061) INFO  Opening socket connection to server
>>> terrapinzk001a/10.115.59.31:2181
>>>
>>> 2015-05-01 20:47:02,866 [main-SendThread(terrapinzk001a:2181)]
>>> (ClientCnxn.java:950) INFO  Socket connection established to
>>> terrapinzk001a/10.115.59.31:2181, initiating session
>>>
>>> 2015-05-01 20:47:02,880 [main-SendThread(terrapinzk001a:2181)]
>>> (ClientCnxn.java:739) INFO  Session establishment complete on server
>>> terrapinzk001a/10.115.59.31:2181, sessionid = 0x14d111892390023,
>>> negotiated timeout = 30000
>>>
>>> 2015-05-01 20:47:02,884 [main-EventThread] (ZkClient.java:449) INFO
>>> zookeeper state changed (SyncConnected)
>>>
>>> 2015-05-01 20:47:02,884 [main-SendThread(terrapinzk001a:2181)]
>>> (ClientCnxn.java:1186) INFO  Unable to read additional data from server
>>> sessionid 0x14d111892390023, likely server has closed socket, closing
>>> socket connection and attempting reconnect
>>>
>>> 2015-05-01 20:47:02,988 [main-EventThread] (ZkClient.java:449) INFO
>>> zookeeper state changed (Disconnected)
>>>
>>
>>
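
The per-server stat check suggested above could be scripted roughly as follows. This is only a sketch, assuming Python 3 and servers that answer the "stat" four-letter command (newer ZooKeeper releases may need it whitelisted). All function names here are hypothetical helpers, not part of Helix or ZooKeeper. Stripping spaces around each entry and defaulting to port 2181 are choices made by this script, not a statement about how the ZooKeeper client parses its connection string.

```python
import socket


def parse_connect_string(connect_string):
    """Split 'host:port,host:port' into (host, port) pairs.

    Assumption of this sketch: stray spaces around entries are stripped
    and a missing port falls back to 2181.
    """
    servers = []
    for entry in connect_string.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, _, port = entry.partition(":")
        servers.append((host, int(port) if port else 2181))
    return servers


def four_letter_word(host, port, command="stat", timeout=5.0):
    """Send a four-letter command and return the server's full text reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the socket after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Return the value of the 'Mode:' line (e.g. leader, follower), or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None


def check_quorum(connect_string):
    """Map each (host, port) to its reported mode or an 'unreachable' note."""
    results = {}
    for host, port in parse_connect_string(connect_string):
        try:
            results[(host, port)] = parse_mode(four_letter_word(host, port))
        except OSError as exc:  # covers timeouts, refused connections, DNS errors
            results[(host, port)] = "unreachable: %s" % exc
    return results
```

Calling check_quorum("zk001:2181, zk002:2181,zk003:2181") with the hostnames from this thread would report one entry per server. A healthy three-node ensemble would normally show one leader and two followers.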
