hadoop-zookeeper-user mailing list archives

From Mahadev Konar <maha...@yahoo-inc.com>
Subject Re: possible bug in zookeeper ?
Date Wed, 15 Sep 2010 16:49:15 GMT
Yatir,
 

  Can you try this out:
 From zook1, try running the simple ZooKeeper command-line client:

http://hadoop.apache.org/zookeeper/docs/r3.2.1/zookeeperStarted.html

bin/zkCli.sh -server zoo1:port,zoo2:port,zoo3:port

And then try killing one of the servers and see if this client connects to
the other servers.
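
Before and after killing a server, it can also help to confirm which
ensemble members are still answering. Below is a minimal sketch, assuming
Python 3 is available on the client machine and that the servers answer the
"ruok" four-letter command (the same check as the nc commands quoted
below). The helper names parse_connect_string, send_four_letter_word and
probe_ensemble are made up for this illustration; they are not part of
ZooKeeper.

```python
import socket


def parse_connect_string(connect_string):
    """Split 'host1:port1,host2:port2' into a list of (host, port) tuples."""
    servers = []
    for entry in connect_string.split(","):
        host, _, port = entry.strip().rpartition(":")
        servers.append((host, int(port)))
    return servers


def send_four_letter_word(host, port, word="ruok", timeout=2.0):
    """Send a four-letter command; return the reply text, or None on failure."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(word.encode("ascii"))
            # Half-close the socket so the server sees end of input,
            # similar to what "echo ruok | nc host port" does.
            sock.shutdown(socket.SHUT_WR)
            chunks = []
            while True:
                data = sock.recv(1024)
                if not data:
                    break
                chunks.append(data)
            return b"".join(chunks).decode("ascii", errors="replace")
    except OSError:
        # Covers connection refused, timeouts and name resolution errors.
        return None


def probe_ensemble(connect_string):
    """Map each 'host:port' to True if that server answered 'imok'."""
    return {
        f"{host}:{port}": send_four_letter_word(host, port) == "imok"
        for host, port in parse_connect_string(connect_string)
    }
```

Calling probe_ensemble("zook1:2181,zook2:2181,zook3:2181") before and after
stopping one server shows which servers the client should still be able to
reach. It does not exercise the ZooKeeper client's own failover logic.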


You can try this out for different ZooKeeper versions by using different
ZooKeeper jar releases. This way we can find out whether something is wrong
with the release you are using or whether it is a general problem you are
seeing.

Thanks
mahadev

On 9/15/10 12:56 AM, "Yatir Ben Shlomo" <yatirb@outbrain.com> wrote:

> Thanks to all who replied, I appreciate your efforts:
> 
> 1. There is no connection problem from the client machine:
> (ob1078)(tomcat@cass3:~)$ echo ruok | nc zook1 2181
> imok(ob1078)(tomcat@cass3:~)$ echo ruok | nc zook2 2181
> imok(ob1078)(tomcat@cass3:~)$ echo ruok | nc zook3 2181
> imok(ob1078)(tomcat@cass3:~)$
> 
> 2. Unfortunately I have already tried to switch to the new jar but it does not
> seem to be backward compatible.
> It seems that the QuorumPeerConfig class does not have the following field
> protected int clientPort;
> It was replaced by InetSocketAddress clientPortAddress in the new jar
> So I am getting java.lang.NoSuchFieldError exception...
> 
> 3. I looked at the ClientCnxn.java code.
> It seems that the logic for iterating over the available servers
> (nextAddrToTry++ ) is used only inside the startConnect() function but not in
> the finishConnect() function, nor anywhere else.
> 
> Possibly something along these lines is happening:
> some exception thrown inside the finishConnect() function triggers the
> cleanup() function, which in turn throws another exception.
> Nowhere in this code path is nextAddrToTry++ applied.
> Does this make sense to anyone?
> thanks
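
To make the hypothesis in point 3 concrete, here is a toy model of the
behavior it would produce. This is not the ClientCnxn code and it does not
show that 3.2.2 behaves this way. It only contrasts a client that advances
its server index on every retry with one whose failure path never reaches
the code that advances it. The class ToyClient, the flag advance_on_retry
and the function simulate are invented for this illustration.

```python
class ToyClient:
    """Toy round-robin server selection, loosely modeled on the description above."""

    def __init__(self, servers, advance_on_retry):
        self.servers = list(servers)
        self.next_addr_to_try = 0
        self.advance_on_retry = advance_on_retry
        self.current = None

    def start_connect(self):
        # Pick the next server and advance the index, wrapping around.
        self.current = self.servers[self.next_addr_to_try]
        self.next_addr_to_try = (self.next_addr_to_try + 1) % len(self.servers)
        return self.current

    def retry(self):
        if self.advance_on_retry:
            # The failure path goes back through start_connect().
            return self.start_connect()
        # Hypothesized faulty path: the index is never advanced,
        # so the same server is tried again.
        return self.current


def simulate(servers, down, advance_on_retry, max_attempts=6):
    """Return the servers attempted until one that is up is reached."""
    client = ToyClient(servers, advance_on_retry)
    # Start on the server that is being taken down, as in the report.
    client.next_addr_to_try = servers.index(down)
    attempted = [client.start_connect()]
    while attempted[-1] == down and len(attempted) < max_attempts:
        attempted.append(client.retry())
    return attempted


servers = ["zook1", "zook2", "zook3"]
print(simulate(servers, "zook3", advance_on_retry=True))   # ['zook3', 'zook1']
print(simulate(servers, "zook3", advance_on_retry=False))  # 'zook3' six times
```

The second output matches the symptom in the logs (repeated "Attempting
connection to server zook3"), but other causes could produce the same
pattern, so DEBUG logging with a newer client jar is still needed to settle
it.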
> 
> 
> 
> 
> 
> 
> -----Original Message-----
> From: Patrick Hunt [mailto:phunt@apache.org]
> Sent: Tuesday, September 14, 2010 6:20 PM
> To: zookeeper-user@hadoop.apache.org
> Subject: Re: possible bug in zookeeper ?
> 
> That is unusual. I don't recall anyone reporting a similar issue, and
> looking at the code I don't see any issues off hand. Can you try the
> following?
> 
> 1) on that particular zk client machine resolve the hosts zook1/zook2/zook3,
> what ip addresses does this resolve to? (try dig)
> 2) try running the client using the 3.3.1 jar file (just replace the jar on
> the client), it includes more log4j information, turn on DEBUG or TRACE
> logging
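
For step 1, a small sketch of the resolution check, assuming Python 3 on
the client machine. The function name resolve_hosts is made up for this
illustration. Note that socket.getaddrinfo goes through the system
resolver, so it also honors /etc/hosts, whereas dig queries DNS directly;
the two can disagree, and that difference is itself worth knowing about.

```python
import socket


def resolve_hosts(hosts):
    """Map each host name to the sorted list of IP addresses it resolves to."""
    resolved = {}
    for host in hosts:
        try:
            infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
            # Each entry is (family, type, proto, canonname, sockaddr);
            # sockaddr[0] is the IP address as a string.
            resolved[host] = sorted({info[4][0] for info in infos})
        except socket.gaierror:
            # Name could not be resolved on this machine.
            resolved[host] = []
    return resolved
```

For example, resolve_hosts(["zook1", "zook2", "zook3"]) run on the client
machine should list one distinct address per server; an empty list or a
shared address would point at a name resolution problem.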
> 
> Patrick
> 
> On Tue, Sep 14, 2010 at 8:44 AM, Yatir Ben Shlomo <yatirb@outbrain.com> wrote:
> 
>> zook1:2181,zook2:2181,zook3:2181
>> 
>> 
>> -----Original Message-----
>> From: Ted Dunning [mailto:ted.dunning@gmail.com]
>> Sent: Tuesday, September 14, 2010 4:11 PM
>> To: zookeeper-user@hadoop.apache.org
>> Subject: Re: possible bug in zookeeper ?
>> 
>> What was the list of servers that was given originally to open the
>> connection to ZK?
>> 
>> On Tue, Sep 14, 2010 at 6:15 AM, Yatir Ben Shlomo <yatirb@outbrain.com> wrote:
>> 
>>> Hi, I am using SolrCloud, which uses an ensemble of 3 ZooKeeper instances.
>>>
>>> I am performing survivability tests:
>>> after taking one of the ZooKeeper instances down, I would expect the client
>>> to use a different ZooKeeper server instance.
>>>
>>> But as you can see in the logs below, depending on which instance I take
>>> down (in my case, the last one in the list of ZooKeeper servers), the client
>>> keeps insisting on the same ZooKeeper server (Attempting connection to
>>> server zook3/192.168.252.78:2181) and does not switch to a different one.
>>> The problem seems to come from ClientCnxn.java.
>>> Does anyone have an idea about this?
>>>
>>> SolrCloud is currently using zookeeper-3.2.2.jar.
>>> Is this a known bug that was fixed in later versions (3.3.1)?
>>> 
>>> Thanks in advance,
>>> Yatir
>>> 
>>> 
>>> Logs:
>>> 
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:20 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
>>> INFO: Attempting connection to server zook3/192.168.252.78:2181
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Exception closing session 0x32b105244a20001 to sun.nio.ch.SelectionKeyImpl@3ca58cbf
>>> java.net.ConnectException: Connection refused
>>>        at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category info
>>> INFO: Attempting connection to server zook3/192.168.252.78:2181
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Exception closing session 0x32b105244a20000 to sun.nio.ch.SelectionKeyImpl@3960f81b
>>> java.net.ConnectException: Connection refused
>>>        at sun.nio.ch.SocketChannelImpl.$$YJP$$checkConnect(Native Method)
>>>        at sun.nio.ch.SocketChannelImpl.checkConnect(SocketChannelImpl.java)
>>>        at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):933)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown input
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:638)
>>>        at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:360)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):999)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> Sep 14, 2010 9:02:22 AM org.apache.log4j.Category warn
>>> WARNING: Ignoring exception during shutdown output
>>> java.nio.channels.ClosedChannelException
>>>        at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:649)
>>>        at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:368)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(zookeeper:ClientCnxn.java):1004)
>>>        at org.apache.zookeeper.ClientCnxn$SendThread.run(zookeeper:ClientCnxn.java):970)
>>> 
>>> 
>> 
> 

