zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Patrick Hunt <ph...@apache.org>
Subject Re: permanent ZSESSIONMOVED
Date Tue, 16 Mar 2010 16:18:44 GMT
Can you verify that you are using 3.2.2 on all servers? You can do this 
by running the "stat" command against each of your servers and look at 
the very top of the output (we include the version of zk server there).
http://bit.ly/dglVld

Are you using synchronous or async operations in your client?

I see snippets from the server logs, but I don't see anything from you 
client logs. It would really help if you could zip up all the logs 
(server and client) and provide them.

Patrick

Łukasz Osipiuk wrote:
> not really - it happens occasionally - every few days :(
> I believe it is somewhat connected with our network environment which
> suffers from some packet loss which leads to
> connection timeouts.
> 
> I can switch on some more logging if you can lead me which categories
> are worth to enable DEBUG for them.
> 
> Regards, Łukasz Osipiuk
> 
> On Tue, Mar 16, 2010 at 16:35, Benjamin Reed <breed@yahoo-inc.com> wrote:
>> weird, this does sound like a bug. do you have a reliable way of reproducing
>> the problem?
>>
>> thanx
>> ben
>>
>> On 03/16/2010 08:27 AM, Łukasz Osipiuk wrote:
>>> nope.
>>>
>>> I always pass 0 as clientid.
>>>
>>> Łukasz
>>>
>>> On Tue, Mar 16, 2010 at 16:20, Benjamin Reed<breed@yahoo-inc.com>  wrote:
>>>
>>>> do you ever use zookeeper_init() with the clientid field set to something
>>>> other than null?
>>>>
>>>> ben
>>>>
>>>> On 03/16/2010 07:43 AM, Łukasz Osipiuk wrote:
>>>>
>>>>> Hi everyone!
>>>>>
>>>>> I am writing to this group because recently we are getting some
>>>>> strange errors with our production zookeeper setup.
>>>>>
>>>>>  From time to time we are observing that our client application (C++
>>>>> based) disconnects from zookeeper (session state is changed to 1) and
>>>>> reconnects (state changed to 3).
>>>>> This itself is not a problem - usually application continues to run
>>>>> without problems after reconnect.
>>>>> But from time to time after above happens all subsequent operations
>>>>> start to return ZSESSIONMOVED error. To make it work again we have to
>>>>> restart application (which creates new zookeeper session).
>>>>>
>>>>> I noticed that in 3.2.0 introduced a bug
>>>>> http://issues.apache.org/jira/browse/ZOOKEEPER-449 but we are using
>>>>> zookeeper v. 3.2.2.
>>>>> I just noticed that app at compile time used 3.2.0 library but patches
>>>>> fixing bug 449 did not touch C client lib so I believe that our
>>>>> problems are not
>>>>> related with that.
>>>>>
>>>>> In zookeeper logs at moment which initiated the problem with client
>>>>> application I have
>>>>>
>>>>> node1:
>>>>> 2010-03-16 14:21:43,510 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@607] - Connected to
>>>>> /10.1.112.61:37197 lastZxid 42992576502
>>>>> 2010-03-16 14:21:43,510 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@636] - Renewing session
>>>>> 0x324dcc1ba580085
>>>>> 2010-03-16 14:21:49,443 - INFO
>>>>> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@992] - Finished init
>>>>> of 0x324dcc1ba580085 valid:true
>>>>> 2010-03-16 14:21:49,443 - WARN
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@518] - Exception causing
>>>>> close of session 0x324dcc1ba580085 due to java.io.IOException: Read
>>>>> error
>>>>> 2010-03-16 14:21:49,444 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@857] - closing
>>>>> session:0x324dcc1ba580085 NIOServerCnxn:
>>>>> java.nio.channels.SocketChannel[connected local=/10.1.112.62:2181
>>>>> remote=/10.1.112.61:37197]
>>>>>
>>>>> node2:
>>>>> 2010-03-16 14:21:40,580 - WARN
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@494] - Exception causing
>>>>> close of session 0x324dcc1ba580085 due to java.io.IOException: Read
>>>>> error
>>>>> 2010-03-16 14:21:40,581 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@833] - closing
>>>>> session:0x324dcc1ba580085 NIOServerCnxn:
>>>>> java.nio.channels.SocketChannel[connected local=/10.1.112.63:2181
>>>>> remote=/10.1.112.61:60693]
>>>>> 2010-03-16 14:21:46,839 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@583] - Connected to
>>>>> /10.1.112.61:48336 lastZxid 42992576502
>>>>> 2010-03-16 14:21:46,839 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@612] - Renewing session
>>>>> 0x324dcc1ba580085
>>>>> 2010-03-16 14:21:49,439 - INFO
>>>>> [QuorumPeer:/0:0:0:0:0:0:0:0:2181:NIOServerCnxn@964] - Finished init
>>>>> of 0x324dcc1ba580085 valid:true
>>>>>
>>>>> node3:
>>>>> 2010-03-16 02:14:48,961 - WARN
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@494] - Exception causing
>>>>> close of session 0x324dcc1ba580085 due to java.io.IOException: Read
>>>>> error
>>>>> 2010-03-16 02:14:48,962 - INFO
>>>>> [NIOServerCxn.Factory:2181:NIOServerCnxn@833] - closing
>>>>> session:0x324dcc1ba580085 NIOServerCnxn:
>>>>> java.nio.channels.SocketChannel[connected local=/10.1.112.64:2181
>>>>> remote=/10.1.112.61:57309]
>>>>>
>>>>> and then lots of entries like this
>>>>> 2010-03-16 02:14:54,696 - WARN
>>>>> [ProcessThread:-1:PrepRequestProcessor@402] - Got exception when
>>>>> processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9e9e49
>>>>> zxid:0xfffffffffffffffe txntype:unknown
>>>>> /locks/9871253/lock-8589943989-
>>>>> org.apache.zookeeper.KeeperException$SessionMovedException:
>>>>> KeeperErrorCode = Session moved
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>>>>> 2010-03-16 14:22:06,428 - WARN
>>>>> [ProcessThread:-1:PrepRequestProcessor@402] - Got exception when
>>>>> processing sessionid:0x324dcc1ba580085 type:create cxid:0x4b9f6603
>>>>> zxid:0xfffffffffffffffe txntype:unknown
>>>>> /locks/1665960/lock-8589961006-
>>>>> org.apache.zookeeper.KeeperException$SessionMovedException:
>>>>> KeeperErrorCode = Session moved
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.SessionTrackerImpl.checkSession(SessionTrackerImpl.java:231)
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.pRequest(PrepRequestProcessor.java:211)
>>>>>         at
>>>>>
>>>>> org.apache.zookeeper.server.PrepRequestProcessor.run(PrepRequestProcessor.java:114)
>>>>>
>>>>>
>>>>> To workaround disconnections I am going to increase session timeout
>>>>> from 5 to 15 seconds but event if it helps at all it is just a
>>>>> workaround.
>>>>>
>>>>> Do you have an idea where is the source of my problem.
>>>>>
>>>>> Regards, Łukasz Osipiuk
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>
> 
> 
> 

Mime
View raw message