zookeeper-user mailing list archives

From "Van Klaveren, Brian N." <b...@slac.stanford.edu>
Subject ZooKeeper client session recovery and watches
Date Mon, 30 Jun 2014 23:58:19 GMT
Hi,

In some testing, it seems impossible to recover watch events that were triggered across
a client session recovery when that recovery happens from another process.
Browsing through the server code, it wasn’t apparent to me what happens to the triggered
watch events in this case.

Steps to show what I mean:

1. Set the data watch, then terminate the client that the watch originates from:

String node = "/permanent/node1";

Watcher tmpWatcher = new Watcher() {
        @Override
        public void process(WatchedEvent event) {
            System.out.println(event.toString());
        }
};

ZooKeeper client = new ZooKeeper(endpoints, 40000, tmpWatcher);
client.exists(node, true);

writeSessionId(client.getSessionId());
writeSessionPass(client.getSessionPasswd());
// Both of the previous are logged to disk. Process killed right here;
// client.close() is NOT called.


2. From another process (or machine), setData is called on /permanent/node1 while the original
session hasn’t expired yet. This would normally trigger the data watch.


3. Reestablish the session:

Watcher tmpWatcher = new Watcher() {
        @Override
        public void process(WatchedEvent event) {
            System.out.println(event.toString());
        }
};

long sessionId = loadSessionId();
byte[] sessionPass = loadSessionPass();
ZooKeeper client = new ZooKeeper(endpoints, 40000, tmpWatcher, sessionId, sessionPass);

// SyncConnected event triggered.
System.out.println("same session: " + (sessionId == client.getSessionId()));
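
For anyone trying to reproduce this: the writeSessionId, writeSessionPass, loadSessionId and loadSessionPass helpers used in steps 1 and 3 are not shown above. A minimal sketch of what they might look like follows, assuming they simply write the values to two files. The SessionStore class and the file names are hypothetical and only illustrate the idea; the real helpers may differ.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.file.Files;
import java.nio.file.Path;

/**
 * Hypothetical helper (not part of ZooKeeper): persists a session id and
 * session password so that another process can try to reattach later.
 */
final class SessionStore {
    private final Path idFile;
    private final Path passFile;

    SessionStore(Path directory) {
        // File names are arbitrary choices made for this sketch.
        this.idFile = directory.resolve("session.id");
        this.passFile = directory.resolve("session.passwd");
    }

    /** Stores the id as 8 big-endian bytes, overwriting any previous value. */
    void writeSessionId(long sessionId) throws IOException {
        byte[] bytes = ByteBuffer.allocate(Long.BYTES).putLong(sessionId).array();
        Files.write(idFile, bytes);
    }

    /** Reads back the 8 bytes written by writeSessionId. */
    long loadSessionId() throws IOException {
        return ByteBuffer.wrap(Files.readAllBytes(idFile)).getLong();
    }

    /** Stores the password bytes unchanged. */
    void writeSessionPass(byte[] sessionPasswd) throws IOException {
        Files.write(passFile, sessionPasswd);
    }

    /** Reads back the password bytes written by writeSessionPass. */
    byte[] loadSessionPass() throws IOException {
        return Files.readAllBytes(passFile);
    }
}
```

With something like this, step 1 would call store.writeSessionId(client.getSessionId()) and store.writeSessionPass(client.getSessionPasswd()), and step 3 would read both values back before constructing the new ZooKeeper instance. Anyone who can read the password file can presumably attach to the session, so it is worth restricting its permissions.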


Step 2 may even be executed after step 3.

Is there any way to recover watch events triggered after a client is terminated, but
before (or after) the session is recovered?
If the answer is no, maybe it should be documented that watches are invalid across session
recoveries, as this would probably be the scenario.

I realize some of this could possibly be implemented in the new 3.5 branch, when it
comes out, now that it has the CHKW command.

Brian