zookeeper-user mailing list archives

From Robin <rchzz...@163.com>
Subject Re:Re: A suggestion about the design for znode version in ZooKeeper
Date Sun, 16 Nov 2014 02:28:05 GMT
Hi Vamsi and Jurgen,



Thanks for your explanations; I believe your suggestions will work. However, both
solutions add complexity to the client code. If this could be solved on the ZooKeeper
side with little effort, that would be great.
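
To make the issue concrete, here is a minimal sketch in Python. It is a toy in-memory
model, not the ZooKeeper client API: ToyStore, set_data_strict and run_scenario are
hypothetical names used only for illustration. It assumes that a freshly created znode
starts at version 0 and records the zxid of its creation (czxid), and it compares a
version-only check with a check on the (czxid, version) pair along the lines of the
original proposal.

```python
# Toy in-memory model (NOT the ZooKeeper client API) of conditional updates.
# ToyStore, set_data_strict and run_scenario are hypothetical names.

class BadVersion(Exception):
    """Raised when a conditional update does not match the stored znode."""


class ToyStore:
    def __init__(self):
        self.nodes = {}   # path -> {"data", "version", "czxid"}
        self.zxid = 0     # transaction counter, bumped on every write

    def _next_zxid(self):
        self.zxid += 1
        return self.zxid

    def create(self, path, data):
        # A new znode starts at version 0; czxid records the creating transaction.
        self.nodes[path] = {"data": data, "version": 0, "czxid": self._next_zxid()}

    def delete(self, path):
        self._next_zxid()
        del self.nodes[path]

    def get(self, path):
        n = self.nodes[path]
        return n["data"], n["version"], n["czxid"]

    def set_data(self, path, data, version):
        # Version-only check: only the integer version is compared.
        n = self.nodes[path]
        if n["version"] != version:
            raise BadVersion(path)
        self._next_zxid()
        n["data"], n["version"] = data, n["version"] + 1

    def set_data_strict(self, path, data, version, czxid):
        # Sketch of the proposal: the creation zxid must match as well.
        if self.nodes[path]["czxid"] != czxid:
            raise BadVersion(path)
        self.set_data(path, data, version)


def run_scenario(strict):
    store = ToyStore()
    store.create("/test", "a0")
    store.set_data("/test", "a1", 0)            # version is now 1
    _, version, czxid = store.get("/test")      # client A reads version 1

    # Client B deletes and re-creates the znode, then updates it once,
    # which brings the integer version back to 1.
    store.delete("/test")
    store.create("/test", "b0")
    store.set_data("/test", "b1", 0)

    try:
        if strict:
            store.set_data_strict("/test", "a2", version, czxid)
        else:
            store.set_data("/test", "a2", version)
        return "accepted"
    except BadVersion:
        return "rejected"


print(run_scenario(strict=False))  # client A's stale write goes through
print(run_scenario(strict=True))   # client A's stale write is refused
```

In this model the version-only check lets client A overwrite data it never read, while
the pair check refuses the write because the re-created znode has a different czxid.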


Thanks,
- Cheng Rao
At 2014-11-15 23:38:12, "Vamsi Devaki" <devaki.vamsi@gmail.com> wrote:
>Hi Robin,
>
>One way to handle this situation is to use the multi (transaction) API. You
>can check the version of the parent and operate on child nodes atomically.
>
>A quick explanation can be found at -
>http://tdunning.blogspot.com/2011/06/tour-of-multi-update-for-zookeeper.html
>
>Regards,
>Vamsi
>
>
>On Sat, Nov 15, 2014 at 2:00 AM, "Jürgen Wagner (DVT)" <
>juergen.wagner@devoteam.com> wrote:
>
>>  ZooKeeper uses an optimistic approach in this case. The "problem" will
>> only occur if you simply use the optimistic mode in your application as
>> well.
>>
>> So, you have to implement a pessimistic version, i.e., create a lock and
>> then perform the update or guarantee otherwise that the required operations
>> will be atomic. In that case, you can guarantee that nobody will delete the
>> node while you're busy with the update.
>>
>> Cheers,
>> --Jürgen
>>
>>
>>
>> On 15.11.2014 10:25, Ivan Kelly wrote:
>>
>> Another option would be to start the znode id at the znode id of the
>> parent znode, which will be different between each deletion and
>> creation of child nodes. One problem with this though (apart from
>> being limited to 2^31 values) is that the API doesn't have any way to
>> return the initial znode version on creation. Fixing this in a
>> backward-compatible, non-ugly way would be hard, I think.
>>
>> -Ivan
>>
>> On 15 November 2014 03:48, Robin <rchzzjcn@163.com> wrote:
>>
>>  Hi zookeepers,
>>
>> While digging into ZooKeeper's internals, I came across the following flaw in the
>> znode version: a znode's version is reset when the znode is deleted and re-created.
>> This is a trap for operations that make updates conditional on the znode version.
>>
>> Let's look at an example: a client gets the data of a znode (e.g., /test) and its
>> version (e.g., 1), changes the data, and writes it back on the condition that the
>> version has not changed (is still 1). If another client deletes and re-creates this
>> znode while the first client is updating the data, the version can match again, and
>> the first client's write is accepted even though the znode now holds different data.
>>
>> The problem, as I see it, is that the znode version is designed to be just a
>> monotonically increasing integer. If we include the birth date (timestamp) of the
>> znode, or the zxid of its creation, as part of the znode's version, and only the
>> integer part increases each time the znode is updated while the birth-date or zxid
>> part stays unchanged, we can avoid the problem.
>>
>> Of course, the new design has a cost: the version field needs to be larger.
>>
>> Thanks,
>> - Robin
>>
>>
>>
>> --
>>
>> Mit freundlichen Grüßen/Kind regards/Cordialement vôtre/Atentamente/С
>> уважением
>> *i.A. Jürgen Wagner*
>> Head of Competence Center "Intelligence"
>> & Senior Cloud Consultant
>>
>> Devoteam GmbH, Industriestr. 3, 70565 Stuttgart, Germany
>> Phone: +49 6151 868-8725, Fax: +49 711 13353-53, Mobile: +49 171 864 1543
>> E-Mail: juergen.wagner@devoteam.com, URL: www.devoteam.de
>> ------------------------------
>> Managing Board: Jürgen Hatzipantelis (CEO)
>> Address of Record: 64331 Weiterstadt, Germany; Commercial Register:
>> Amtsgericht Darmstadt HRB 6450; Tax Number: DE 172 993 071
>>
>>
>>
>
>
>-- 
>Vamsi