zookeeper-user mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Jeremy Stribling <st...@nicira.com>
Subject missing data after restarting+expanding a ZK 3.4.0 cluster
Date Tue, 06 Dec 2011 01:47:00 GMT
I've been trying to update to ZK 3.4.0 and have had some issues where 
some data become inaccessible after adding a node to a cluster.  My use 
case is a bit strange (as explained before on this list) in that I try 
to grow the cluster dynamically by having an external program 
automatically restart Zookeeper servers in a controlled way whenever the 
list of participating ZK servers needs to change.  I haven't made a JIRA 
for this yet, since I'm guessing the official position is that ZK 
doesn't support this scenario yet, but this used to work just fine in 
3.3.3 (and before), so this represents a regression.

The scenario I see is this:

1) Start up a 1-server ZK cluster (the server has ZK ID 0).
2) A client connects to the server, and makes a bunch of znodes, in 
particular a znode called "/membership".
3) Shut down the cluster.
4) Bring up a 2-server ZK cluster, including the original server 0 with 
its existing data, and a new server with ZK ID 1.
5) Node 0 has the highest zxid and is elected leader.
6) A client connecting to server 1 tries to "get /membership" and gets 
back a -101 error code (no such znode).
7) The same client then tries to "create /membership" and gets back a 
-110 error code (znode already exists).
8) Clients connecting to server 0 can successfully "get /membership".

I've attached a tarball with debug logs for both servers, annotating 
where steps #1 and #4 happen.  You can see that the election involves a 
proposal for zxid 110 from server 0, but immediately following the 
election server 1 has these lines:

2011-12-05 17:18:48,308 9299 [QuorumPeer[myid=1]/] WARN 
org.apache.zookeeper.server.quorum.Learner  - Got zxid 0x100000001 
expected 0x1
2011-12-05 17:18:48,313 9304 [SyncThread:1] INFO 
org.apache.zookeeper.server.persistence.FileTxnLog  - Creating new log 
file: log.100000001

Perhaps that's not relevant, but it struck me as odd.  At the end of 
server 1's log you can see a repeated cycle of getData->create->getData 
as the client tries to make sense of the inconsistent responses.

The other piece of information is that if I try to use the on-disk 
directories for either of the servers to start a new one-node ZK 
cluster, all the data are accessible.

Anyone have ideas?  I haven't tried writing a program outside of my 
application to reproduce this, but I can do it very easily with some of 
my app's tests if anyone needs more information.  I happy to turn this 
into a JIRA if desired.  Thanks,


View raw message