zookeeper-user mailing list archives

From Patrick Hunt <ph...@apache.org>
Subject Re: permanent ZSESSIONMOVED
Date Tue, 16 Mar 2010 23:51:33 GMT
It would be good to see the logs, but I had one additional thought.

The ZK leader is the one that checks for session MOVED. It keeps track
of which server each session is currently attached to, and it throws
the moved exception if a request for the session is proposed through a
server other than the one the leader believes is the owner.
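
To make the check described above concrete, here is a minimal sketch in
Python. It is not ZooKeeper's actual code: the class and method names
(SessionOwnerTracker, attach, check) are hypothetical, and it only
models the idea of the leader remembering one owning server per session.

```python
class SessionMovedError(Exception):
    """Raised when a request arrives via a server that does not own the session."""


class SessionOwnerTracker:
    """Hypothetical leader-side table: session id (int) -> owning server id."""

    def __init__(self):
        self._owner = {}

    def attach(self, session_id, server_id):
        # Called when a client (re)connects: the new server becomes the owner.
        self._owner[session_id] = server_id

    def check(self, session_id, via_server_id):
        # Reject a request proposed through any server other than the recorded owner.
        owner = self._owner.get(session_id)
        if owner is not None and owner != via_server_id:
            raise SessionMovedError(
                f"session {session_id:#x} is owned by {owner}, "
                f"but the request came via {via_server_id}"
            )
```

In this model a "permanent" moved error would correspond to the owner
entry never being updated after the client re-attaches elsewhere, so
every later check fails until attach is called again for that session.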

I'm wondering what would happen if, the next time you see this, you
restart the server that the session is attached to (use netstat on the
client to find it) rather than restarting the client as you have been
doing. The client will re-attach to the cluster, and I'm wondering
whether that would fix the problem.

Not sure if you can try this (production env?) but it would be an 
interesting additional data point if you can give it a try.



Patrick Hunt wrote:
> Yes, if you search "back" (older entries) in the server log you will be 
> able to see who the leader is; it will say something like "LEADING" or 
> "FOLLOWING". This may change over time (which is why you need to 
> search "back" as I mention) if leadership within the ZK cluster changes 
> (say, due to a networking issue). This is why I keep stressing the logs: 
> they really will give us much additional insight into the issue.
> Here's an example from a 5 server ensemble:
> phunt@valhalla:~/dev/workspace/zkconf/test5[master]$ egrep LEAD local*/*.log
> localhost:2184/zoo.log:2010-03-16 12:50:13,711 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2184:QuorumPeer@632] - LEADING
> phunt@valhalla:~/dev/workspace/zkconf/test5[master]$ egrep FOLLOW local*/*.log
> localhost:2181/zoo.log:2010-03-16 12:50:13,649 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2181:QuorumPeer@620] - FOLLOWING
> localhost:2182/zoo.log:2010-03-16 12:50:13,933 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2182:QuorumPeer@620] - FOLLOWING
> localhost:2183/zoo.log:2010-03-16 12:50:13,901 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2183:QuorumPeer@620] - FOLLOWING
> localhost:2185/zoo.log:2010-03-16 12:50:13,661 - INFO [QuorumPeer:/0:0:0:0:0:0:0:0:2185:QuorumPeer@620] - FOLLOWING
> Additionally, if you use the "stat" four-letter word you will see the 
> current status of the server, leader or follower (also available via JMX).
> You might also find this useful: http://github.com/phunt/zktop
> Patrick
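
The egrep search in the quoted message can also be scripted when the
role at a particular moment matters. The sketch below is only an
illustration: the function names role_history and role_at are made up
here, and the regular expression assumes log lines shaped like the
samples quoted above (timestamp, level, bracketed thread, then LEADING
or FOLLOWING).

```python
import re

# Matches lines such as:
# 2010-03-16 12:50:13,711 - INFO [QuorumPeer:...:QuorumPeer@632] - LEADING
ROLE_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - \w+\s+\[.*\] - "
    r"(?P<role>LEADING|FOLLOWING)\s*$"
)


def role_history(lines):
    """Return (timestamp, role) pairs in the order they appear in the log."""
    history = []
    for line in lines:
        m = ROLE_LINE.match(line)
        if m:
            history.append((m.group("ts"), m.group("role")))
    return history


def role_at(history, ts):
    """Return the last role recorded at or before timestamp string ts, or None.

    Timestamps in this fixed-width format compare correctly as plain strings.
    """
    current = None
    for when, role in history:
        if when <= ts:
            current = role
    return current
```

A possible use is to open one server's zoo.log, pass the file object to
role_history, and then call role_at with the timestamp at which the
client reported the moved error.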
> Łukasz Osipiuk wrote:
>> On Tue, Mar 16, 2010 at 20:05, Patrick Hunt <phunt@apache.org> wrote:
>>> We'll probably need the ZK server/client logs to hunt this down. Can
>>> you tell if the MOVED happens in some particular scenario, say you
>>> are connected to a follower and move to a leader, or perhaps you are
>>> connected to server A, get disconnected and reconnected to server A?
>>> Is there some pattern that could help us understand what's causing
>>> this?
>> When I get to the office tomorrow I will try to investigate the logs,
>> and maybe I will be able to find out what the error scenario is.
>> But I am not sure I will be able to find out what the role of each
>> node was when the problem occurred.
>> Does the ZooKeeper server log when a node's state changes between
>> follower and leader? Or can I make it log that?
>> Regards, Łukasz
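
On the question of each node's current role, the "stat" four-letter
word mentioned earlier in the thread can be polled from a script. This
is a rough sketch under assumptions: the helper names are hypothetical,
the host and port are whatever your servers use as their client port,
and it relies on the stat response containing a line beginning with
"Mode:".

```python
import socket


def four_letter_word(host, port, word="stat", timeout=5.0):
    """Send a four-letter word to a ZooKeeper client port and return the reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(word.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_mode(stat_output):
    """Extract the value of the 'Mode:' line (e.g. leader or follower), or None."""
    for line in stat_output.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return None
```

For example, parse_mode(four_letter_word("localhost", 2181)) would
report the role of the server listening on that port, assuming it is
reachable. Note that this shows only the role at the time of the call,
not the history, so the logs are still needed for past role changes.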
