Date: Fri, 4 May 2012 16:36:24 +0530
Subject: Watch not sent immediately?
From: guru singh
To: user@zookeeper.apache.org

Hi,

Sorry if the subject is not appropriately titled.

I'm trying to implement a redis-failover solution using ZooKeeper, working with the Python binding for zk.

Basically, I have a znode called /master with a watch set on it, so that whenever the master changes, self.master is updated. There is another znode called /errors; a watch is set on it via get_children, pointing to the errors_watcher function. My code is supposed to loop continuously and create a child znode under /errors whenever an error is detected. The errors_watcher function counts the children of /errors; if the count exceeds a certain threshold, it writes a new master 'ip:port' to the znode /master, which triggers master_watcher and updates self.master.

I use Python's threading.Condition() to block for certain operations. For instance, initially, when znode /master is created, I wait() for master_watcher to be called, which updates self.master and releases the lock. This works as expected.

The problem is that when znode /master is changed from within errors_watcher and I then wait() for master_watcher to be called (to update self.master and release the lock), the code just keeps waiting; master_watcher is never called. However, if I don't wait after setting znode /master from within errors_watcher, master_watcher is called and it updates self.master.

It would be really helpful if somebody could point out what's wrong. Is it zk, or is my understanding of threading.Condition() incorrect? Or both :)

Thanks for your help.

The code snippet below simulates the problem.

import threading

import zookeeper as zk

# Not defined in the original snippet; assumed to be the usual open ACL.
ZOO_OPEN_ACL_UNSAFE = {"perms": 0x1f, "scheme": "world", "id": "anyone"}


class ZKtest:
    def __init__(self, zk_server):
        zk.set_log_stream(open('zk.log', 'w'))
        self.master = None
        self.zk_server = zk_server
        self.connected = False
        self.conn_cv = threading.Condition()

    def global_watcher(self, handle, event, state, path):
        self.conn_cv.acquire()
        print 'global watcher called'
        self.connected = True
        self.conn_cv.notifyAll()
        self.conn_cv.release()

    def master_watcher(self, handle, event, state, path):
        self.conn_cv.acquire()
        print 'master watcher called'
        # Re-read /master and re-register this watch.
        master = zk.get(self.handle, path, self.master_watcher)[0]
        self.master = master
        print 'Master is %s' % (master)
        self.conn_cv.notifyAll()
        self.conn_cv.release()

    def errors_watcher(self, handle, event, state, path):
        self.conn_cv.acquire()
        print 'error watcher called'
        errors = len(zk.get_children(self.handle, '/errors', self.errors_watcher))
        print 'Current errors %d' % (errors)
        if errors > 5:
            print 'Set new master, update znode /master'
            zk.set(self.handle, '/master', '127.0.0.1:6380')
            #self.conn_cv.wait() <-- Why doesn't this return??
        self.conn_cv.notifyAll()
        self.conn_cv.release()

    def create_znodes(self):
        self.conn_cv.acquire()
        master = zk.exists(self.handle, '/master', self.master_watcher)
        if not master:
            print 'Creating znode /master'
            zk.create(self.handle, '/master', '127.0.0.1:6379', [ZOO_OPEN_ACL_UNSAFE])
        else:
            print 'Updating znode /master'
            zk.set(self.handle, '/master', '127.0.0.1:6379', master['version'])
        # Wait until master_watcher has updated self.master;
        # this returns after master_watcher is called.
        self.conn_cv.wait()
        print self.master  # should not be None, since master_watcher updates it
        errors = zk.exists(self.handle, '/errors')
        if not errors:
            print 'Creating znode /errors'
            zk.create(self.handle, '/errors', 'Errors follow', [ZOO_OPEN_ACL_UNSAFE])
        else:
            print 'Purge previous errors'
            for err in zk.get_children(self.handle, '/errors'):
                zk.delete(self.handle, '/errors/' + err)
        # Set a watch for children of znode /errors.
        err = zk.get_children(self.handle, '/errors', self.errors_watcher)
        self.conn_cv.release()

    def run(self):
        self.conn_cv.acquire()
        self.handle = zk.init(self.zk_server, self.global_watcher)
        if not self.connected:
            while not self.connected:
                print 'Not Connected, retry in 5'
                self.conn_cv.wait(5)
                self.handle = zk.init(self.zk_server)
        self.create_znodes()
        while self.master != '127.0.0.1:6380':
            print 'Current Master %s' % (self.master)
            # Simulate errors until the master becomes 127.0.0.1:6380.
            zk.create(self.handle, '/errors/', 'Error!', [ZOO_OPEN_ACL_UNSAFE], zk.SEQUENCE)
            self.conn_cv.wait()
        self.conn_cv.release()


if __name__ == '__main__':
    zkt = ZKtest('127.0.0.1:2181')
    zkt.run()
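
One possible explanation for the wait() that never returns can be sketched without ZooKeeper at all. The sketch rests on an assumption: that the binding delivers watcher callbacks one at a time on a single thread, so master_watcher cannot start until errors_watcher has returned. The Dispatcher class and the master_updated function are hypothetical stand-ins, not part of the zk binding, and the wait uses a timeout (unlike the snippet above) so that the example terminates. It is written for Python 3.

```python
import queue
import threading

OLD, NEW = "127.0.0.1:6379", "127.0.0.1:6380"


class Dispatcher:
    """Hypothetical stand-in: runs queued callbacks one at a time on one thread."""

    def __init__(self):
        self._events = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def fire(self, callback):
        self._events.put(callback)

    def _loop(self):
        while True:
            callback = self._events.get()
            callback()  # the next callback cannot start until this one returns


def master_updated(wait_inside_callback, timeout=0.5):
    """Return True if master_watcher ran before the wait gave up."""
    cv = threading.Condition()
    state = {"master": OLD}
    outcome = []
    done = threading.Event()
    dispatcher = Dispatcher()

    def is_new():
        return state["master"] == NEW

    def master_watcher():
        with cv:
            state["master"] = NEW
            cv.notify_all()

    def errors_watcher():
        # Stands in for zk.set() on /master, which queues the master watch event.
        dispatcher.fire(master_watcher)
        if wait_inside_callback:
            with cv:
                # wait_for releases cv, but this callback still occupies the
                # only dispatcher thread, so master_watcher stays queued.
                outcome.append(cv.wait_for(is_new, timeout))
            done.set()

    if wait_inside_callback:
        dispatcher.fire(errors_watcher)
        done.wait()
        return outcome[0]

    # Waiting on the caller's thread leaves the dispatcher free to run
    # errors_watcher and then master_watcher.
    with cv:
        dispatcher.fire(errors_watcher)
        return cv.wait_for(is_new, timeout)


if __name__ == "__main__":
    print("wait inside watcher:", master_updated(True))
    print("wait in main thread:", master_updated(False))
```

Under that assumption, waiting inside a watcher can only time out, while the same wait on the thread that called run() is woken by master_watcher. Whether the real binding behaves this way should be checked against its documentation.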