zookeeper-user mailing list archives

From Vishal Kher <vishalm...@gmail.com>
Subject Re: znode metadata consistency
Date Tue, 01 Mar 2011 20:40:43 GMT
Hi Jeremy,

One of the main reasons for the 3.3.3 release was to include fixes for znode
inconsistency bugs.
Have you taken a look at https://issues.apache.org/jira/browse/ZOOKEEPER-962 and
https://issues.apache.org/jira/browse/ZOOKEEPER-919?
The problem that you are seeing sounds similar to the ones reported.

-Vishal



On Mon, Feb 28, 2011 at 8:04 PM, Jeremy Stribling <strib@nicira.com> wrote:

> Hi all,
>
> A while back I noticed that my Zookeeper cluster got into a state where I
> would get a "node exists" error back when creating a sequential znode -- see
> the thread starting at
> http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/201010.mbox/%3C4CCA1E2F.9020606@nicira.com%3E for
> more details.  The summary is that at the time, my application had a bug
> that could have been improperly bringing new nodes into a cluster.
>
> However, I've seen this a couple more times since fixing that original bug.
>  I don't yet know how to reproduce it, but I am going to keep trying.  In
> one case, we restarted a node (in a one-node cluster), and when it came back
> up we could no longer create sequential nodes under a certain parent node;
> each attempt failed with a "node exists" (-110) error code.  The biggest
> child it saw on restart was
> /zkrsm/000000000000002d_record0000120804 (i.e., a sequence number of
> 120804); however, a stat on the parent node revealed that the cversion was
> only 120710:
>
> [zk:<ip:port>(CONNECTED) 3] stat /zkrsm
> cZxid = 0x5
> ctime = Mon Jan 17 18:28:19 PST 2011
> mZxid = 0x5
> mtime = Mon Jan 17 18:28:19 PST 2011
> pZxid = 0x1d819
> cversion = 120710
> dataVersion = 0
> aclVersion = 0
> ephemeralOwner = 0x0
> dataLength = 0
> numChildren = 2955
>
> So my question is: how is znode metadata persisted with respect to the
> actual znodes?  Is it possible that a node's children will get synced to
> disk before its own metadata, and if it crashes at a bad time, the metadata
> updates will be lost?  If so, is there any way to constrain Zookeeper so
> that it will sync its metadata before returning success for write
> operations?
>
> (I'm using Zookeeper 3.3.2 on a Debian Squeeze 64-bit box, with
> openjdk-6-jre 6b18-1.8.3-2.)
>
> I'd be happy to create a JIRA for this if that seems useful, but without a
> way to reproduce it I'm not sure that it is.
>
> Thanks,
>
> Jeremy
>
>

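The mismatch described in the quoted message (a largest child sequence number of 120804 against a parent cversion of 120710) can be checked mechanically. The following is a minimal sketch, not ZooKeeper code: it assumes that sequential znode names end in a 10-digit zero-padded counter and that the server derives the next counter value from the parent's cversion, which is what the report implies for 3.3.x. The function names are hypothetical, and the child list and cversion are passed in as plain values rather than fetched from a live server.

```python
import re

# Assumption: sequential znode names end in a 10-digit, zero-padded counter.
SEQ_SUFFIX = re.compile(r"(\d{10})$")


def max_child_sequence(children):
    """Return the largest sequence suffix among the child names, or None."""
    matches = (SEQ_SUFFIX.search(name) for name in children)
    seqs = [int(m.group(1)) for m in matches if m]
    return max(seqs) if seqs else None


def cversion_lags_children(cversion, children):
    """True if an existing child already carries a sequence number >= cversion.

    Assumption: the next sequential name is built from the parent's cversion,
    so a lagging cversion can regenerate a name that already exists.
    """
    biggest = max_child_sequence(children)
    return biggest is not None and biggest >= cversion


# Values taken from the report: child name (without the /zkrsm/ prefix)
# and the cversion shown by stat.
children = ["000000000000002d_record0000120804"]
print(cversion_lags_children(120710, children))
```

With the Java client, the inputs would presumably come from `ZooKeeper.getChildren` and `Stat.getCversion()` on the parent node; a `True` result only flags a parent whose counter is behind its children and does not explain how it got there.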