zookeeper-user mailing list archives

From Alexander Shraer <shra...@gmail.com>
Subject Re: Node being there and not at the same time
Date Sat, 25 Aug 2012 01:11:23 GMT
Bill, if I understand correctly, this shouldn't be possible: a client
cannot connect to a server that is less up-to-date than the client
itself. So if the create completed at the client before it disconnected,
the new server must already know about it too; otherwise the connection
will fail. See:

if (ss.isMoreRecentThan(leaderStateSummary)) {
    throw new IOException("Follower is ahead of the leader, leader summary: "
            + " (current epoch), "
            + " (last zxid)");
}

Of course, it's possible that another client, connected to a different
server, doesn't see the create.
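
To make the rule concrete, here is a minimal sketch of the kind of comparison involved. It is not ZooKeeper's actual code: the class name, the method name, and the zxid values are all made up for illustration, and the only assumption is that each side tracks the last transaction id (zxid) it has seen.

```java
public class ZxidCheck {
    /**
     * Hypothetical stand-in for the server-side check (not ZooKeeper's code):
     * a server may only accept a client whose last-seen zxid is not ahead
     * of the server's own last-processed zxid.
     */
    static boolean serverAcceptsClient(long clientLastSeenZxid, long serverLastProcessedZxid) {
        return clientLastSeenZxid <= serverLastProcessedZxid;
    }

    public static void main(String[] args) {
        long zxidOfCreate = 0x100000005L; // illustrative value only

        // A server that has not yet applied the create is behind the client.
        System.out.println(serverAcceptsClient(zxidOfCreate, 0x100000004L)); // false

        // A server that has caught up can take over the session.
        System.out.println(serverAcceptsClient(zxidOfCreate, 0x100000005L)); // true
    }
}
```

Note that this only protects a client from going backwards relative to what it has itself observed; it says nothing about what other clients have seen.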


On Fri, Aug 24, 2012 at 5:15 PM, Bill Bridge <bill.bridge@oracle.com> wrote:
> Mattias,
> Is it possible that after you get NODEEXISTS from creation and before you do
> the second getData(), you reconnect to another ZooKeeper instance? If so,
> maybe the new connection is to a follower that has not yet seen the
> creation. If this is what is happening, then a sync() after the second
> NONODE with a third getData() should work. By only doing the sync() when you
> hit the unusual race condition it will have no performance impact.
> Bill
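
For reference, the sync()-only-on-the-race pattern described above could look roughly like the sketch below. The `Session` interface and the `LaggingFollower` fake are hypothetical and exist only to keep the example self-contained: the real client signals NONODE and NODEEXISTS through exceptions rather than null/false, and the real `sync()` is asynchronous (it takes a callback), so production code would have to wait for that callback before the third read.

```java
import java.util.HashMap;
import java.util.Map;

public class GetOrCreateWithSync {
    /** Hypothetical, simplified view of a ZooKeeper session (not the real API). */
    interface Session {
        byte[] getData(String path);              // null stands in for NONODE
        boolean create(String path, byte[] data); // false stands in for NODEEXISTS
        void sync(String path);                   // assumed to block until caught up
    }

    static byte[] getOrCreate(Session zk, String path, byte[] initial) {
        byte[] data = zk.getData(path);
        if (data != null) {
            return data;
        }
        if (zk.create(path, initial)) {
            return initial;
        }
        // NODEEXISTS: someone else created the node, so read it back.
        data = zk.getData(path);
        if (data != null) {
            return data;
        }
        // Rare race: the server we read from has not seen the create yet.
        // Pay for a sync() only on this path, then read a third time.
        zk.sync(path);
        data = zk.getData(path);
        if (data == null) {
            throw new IllegalStateException("node still missing after sync: " + path);
        }
        return data;
    }

    /** In-memory fake: writes hit the leader, reads hit a follower that lags until sync(). */
    static class LaggingFollower implements Session {
        final Map<String, byte[]> leader = new HashMap<>();
        final Map<String, byte[]> follower = new HashMap<>();
        int syncCalls = 0;

        public byte[] getData(String path) {
            return follower.get(path);
        }

        public boolean create(String path, byte[] data) {
            return leader.putIfAbsent(path, data) == null;
        }

        public void sync(String path) {
            syncCalls++;
            follower.putAll(leader);
        }
    }

    public static void main(String[] args) {
        LaggingFollower zk = new LaggingFollower();
        zk.leader.put("/my/path", new byte[] {42}); // created elsewhere, not yet on the follower
        byte[] data = getOrCreate(zk, "/my/path", new byte[] {1});
        System.out.println(data[0] + " after " + zk.syncCalls + " sync call(s)"); // 42 after 1 sync call(s)
    }
}
```
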
> On 8/23/2012 8:21 AM, Mattias Persson wrote:
>> Hi David,
>> There is nowhere in the code where that node gets deleted. If we refrain
>> from that suspicion, could there be something else?
>> 2012/8/23 David Nickerson <davidnickerson4mailinglists@gmail.com>
>>> It's a little difficult to guess what your application is doing, but it
>>> sounds like there's "someone else" who can create and delete the nodes
>>> you're trying to work with. So when you create the node and check its data,
>>> someone else might have deleted it before you got the chance to check the
>>> data. The same is true when you check that it exists and then check the
>>> data. You could ensure that the node won't be deleted by using ACLs or
>>> giving the node a sequential ephemeral child.
>>> On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson <mattias@neotechnology.com> wrote:
>>>> Hi,
>>>> I've got a problem that I've seen at only a few occasions and which
>>>> confuses me a bit. Basically I construct a ZooKeeper client (I'm running
>>>> version 3.3.2) where there's a ZK quorum of size 3 running. I get a
>>>> SyncConnected event in a Watcher of mine and in that watcher I do a
>>>> get-or-create(-if-absent) behaviour where I first do a:
>>>>    zooKeeper.getData( myPath, false, null );
>>>> if that produces a NONODE code I'll try to create it with:
>>>>    zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );
>>>> If that fails with NODEEXISTS code I'll just get it, assuming someone else
>>>> made it before me. What I see from this getData call that I do after
>>>> getting this NODEEXISTS code, which is the same as the first one btw, is
>>>> that I'll get a NONODE code back. Given in this scenario is that I'm 100%
>>>> certain that this node exists in the quorum at myPath in the first place
>>>> even.
>>>> Questions:
>>>> 1) How can this happen?
>>>> 2) Do I use ZooKeeper here in an improper way?
>>>> 3) Will a later version fix any potential issue I might have hit?
>>>> 4) What are the guarantees around the state of my ZooKeeper instance after
>>>> I receive a SyncConnected event? Is it fully synced with the master at that
>>>> point, or will a call to sync() be necessary first?
>>>> Best,
>>>> Mattias
>>>> --
>>>> Mattias Persson, [mattias@neotechnology.com]
>>>> Hacker, Neo Technology
>>>> www.neotechnology.com
