zookeeper-user mailing list archives

From Robin <rchzz...@163.com>
Subject A suggestion about the design for znode version in ZooKeeper
Date Sat, 15 Nov 2014 02:48:47 GMT
Hi zookeepers,

While digging into ZooKeeper's internals, I came across the following flaw in the znode
version: a znode's version is reset when the znode is deleted and re-created. This is a trap
for operations that make conditional updates based on the znode version.

Here is an example: a client reads the data of a znode (e.g., /test) together with its
version (e.g., 1), changes the data, and writes it back on the condition that the version has
not changed (is still 1). If another client deletes and re-creates this znode while the first
client is preparing its update, the version counter starts over and can reach 1 again. The
version then matches and the write succeeds, but the znode now contains the wrong data.
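
To make the scenario concrete, here is a minimal sketch in Python. It is a toy in-memory model, not the ZooKeeper client API: the names ToyStore and BadVersion are hypothetical, and the only behaviour it assumes is that a newly created znode starts at version 0 and that each successful conditional update increments the version by one.

```python
# Toy in-memory model of version-conditional updates.
# ToyStore and BadVersion are illustrative names, not ZooKeeper classes.

class BadVersion(Exception):
    """Raised when the expected version does not match the stored one."""


class ToyStore:
    def __init__(self):
        self.nodes = {}  # path -> [data, version]

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        self.nodes[path] = [data, 0]  # a new znode starts at version 0

    def delete(self, path):
        del self.nodes[path]

    def get(self, path):
        data, version = self.nodes[path]
        return data, version

    def set(self, path, data, expected_version):
        node = self.nodes[path]
        if node[1] != expected_version:
            raise BadVersion(path)
        node[0] = data
        node[1] += 1  # each successful update bumps the version
        return node[1]


store = ToyStore()
store.create("/test", "a")
store.set("/test", "b", 0)                # version is now 1

# Client A reads the znode and remembers the version it saw.
data_a, version_a = store.get("/test")

# Meanwhile, client B deletes and re-creates the znode, then updates it once.
store.delete("/test")
store.create("/test", "x")                # version restarts at 0
store.set("/test", "y", 0)                # version is 1 again

# Client A writes back, conditioned on the version it read earlier.
# The check passes even though this is a different incarnation of /test.
new_version = store.set("/test", data_a + "-edited", version_a)
```

In this sketch client A's write is accepted, and client B's value "y" is overwritten without either client being told.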

The problem, as I see it, is that the znode version is a plain monotonically increasing
integer that says nothing about which incarnation of the znode it belongs to. If the version
also included the birth date (creation timestamp) of the znode, or the zxid of the transaction
that created it, we could avoid the problem: only the integer part would increase each time
the znode is updated, while the birth-date or zxid part would stay unchanged for the lifetime
of the znode.
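
The proposed composite version can be sketched in the same toy style. Again, this is only an illustration under assumptions: ToyStore, BadVersion, and the (czxid, counter) tuple layout are hypothetical, and the itertools counter merely stands in for a transaction id that grows with every write. It does not describe how ZooKeeper is implemented.

```python
import itertools

# Toy model of a composite version: (creation zxid, update counter).
# All names here are illustrative, not ZooKeeper classes.

class BadVersion(Exception):
    """Raised when the expected composite version does not match."""


class ToyStore:
    def __init__(self):
        self.nodes = {}                    # path -> node dict
        self._zxid = itertools.count(1)    # stand-in for a transaction id

    def create(self, path, data):
        if path in self.nodes:
            raise KeyError(f"{path} already exists")
        # The creation zxid is fixed for the lifetime of this incarnation.
        self.nodes[path] = {"data": data, "czxid": next(self._zxid), "counter": 0}

    def delete(self, path):
        next(self._zxid)                   # a delete is a transaction too
        del self.nodes[path]

    def get(self, path):
        node = self.nodes[path]
        return node["data"], (node["czxid"], node["counter"])

    def set(self, path, data, expected):
        node = self.nodes[path]
        if (node["czxid"], node["counter"]) != expected:
            raise BadVersion(path)
        next(self._zxid)
        node["data"] = data
        node["counter"] += 1               # only the integer part changes
        return (node["czxid"], node["counter"])


store = ToyStore()
store.create("/test", "a")
store.set("/test", "b", store.get("/test")[1])

# Client A reads the data and the composite version.
data_a, version_a = store.get("/test")

# Client B deletes, re-creates, and updates the znode once.
store.delete("/test")
store.create("/test", "x")
store.set("/test", "y", store.get("/test")[1])

# Client A's conditional write is now rejected: the counter matches,
# but the creation part of the version does not.
try:
    store.set("/test", data_a + "-edited", version_a)
    rejected = False
except BadVersion:
    rejected = True
```

Here the counter of the re-created znode is back at 1, yet client A's stale write fails because the creation part of the version differs.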

Of course, the new design has a cost: the version needs more space.

- Robin 