zookeeper-user mailing list archives

From Alexander Shraer <shra...@gmail.com>
Subject Re: Node being there and not at the same time
Date Fri, 31 Aug 2012 05:21:03 GMT

I'm sorry - you were right, and I quoted entirely the wrong place in the
code. The code that ensures that a client doesn't "go back in time" by
connecting to a server that is less up to date than the client itself is
most probably this check from ZooKeeperServer.java. I realized it today
after looking at Simon's question on the mailing list...

     if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid)
            String msg = "Refusing session request for client "
                + cnxn.getRemoteSocketAddress()
                + " as it has seen zxid 0x"
                + Long.toHexString(connReq.getLastZxidSeen())
                + " our last zxid is 0x"
                + Long.toHexString(zkDb.dataTree.lastProcessedZxid)
                + " client must try another server";
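
To make the comparison in that excerpt concrete, here is a minimal standalone sketch of the same rule. The class ZxidGuard and its method accepts are hypothetical names made up for illustration; they are not part of ZooKeeper, and the real server closes the connection instead of returning a boolean.

```java
// Hypothetical, simplified model of the check quoted above.
// ZxidGuard is NOT a ZooKeeper class; it only mirrors the comparison.
final class ZxidGuard {
    // Newest transaction id this (modelled) server has applied.
    private final long lastProcessedZxid;

    ZxidGuard(long lastProcessedZxid) {
        this.lastProcessedZxid = lastProcessedZxid;
    }

    /** Returns true if a client that has seen clientLastZxidSeen may attach. */
    boolean accepts(long clientLastZxidSeen) {
        // Same comparison as in the excerpt: a client that is ahead of
        // this server is refused and must try another server.
        return clientLastZxidSeen <= lastProcessedZxid;
    }

    public static void main(String[] args) {
        ZxidGuard laggingFollower = new ZxidGuard(0x10L);
        System.out.println(laggingFollower.accepts(0x0fL)); // prints "true"
        System.out.println(laggingFollower.accepts(0x11L)); // prints "false"
    }
}
```

A client whose last seen zxid equals the server's last processed zxid is accepted, since the excerpt only refuses when the client's value is strictly greater.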

On Mon, Aug 27, 2012 at 10:22 AM, Bill Bridge <bill.bridge@oracle.com> wrote:

> Alex,
> You certainly know the code much better than I, so I may be mistaken here.
> It looks to me like waitForEpochAck() is about changes in the set of peers,
> and is not related to client connects/disconnects. I do not see how this
> would be called if a client disconnected due to some problem of its own,
> such as being too slow to heartbeat, then reconnected to a different peer or
> observer.
> You suggest that a reconnecting client should ensure the new server has
> seen all transactions that the client has seen. This sounds like the right
> thing to do. This would certainly eliminate the race condition I
> postulated. This sounds like the kind of thing someone would have already
> thought of. If this is not already done then it would be a good change to
> make. I do not know where the code to do that would be. It could be part of
> the server reconnect code or it could be a sync() in the client library.
> If Mattias's code creates a new session when reconnecting, rather than
> reconnecting to the same session, then he could have the problem described
> even if reconnect ensures the client is not ahead of the server. He could
> fix this either by reconnecting to the same session, or simply doing a
> sync() when necessary.
> Thanks,
> Bill
> On 8/24/2012 6:11 PM, Alexander Shraer wrote:
>> Bill, if I understand correctly, this shouldn't be possible - the
>> client will not be able to connect to a server that is
>> less up-to-date than that same client. So if the create completed at
>> the client before it disconnects, the new server will have to know
>> about it too; otherwise the connection will fail. See
>> Leader.waitForEpochAck:
>> if (ss.isMoreRecentThan(leaderStateSummary)) {
>>     throw new IOException("Follower is ahead of the leader, leader summary: "
>>             + leaderStateSummary.getCurrentEpoch()
>>             + " (current epoch), "
>>             + leaderStateSummary.getLastZxid()
>>             + " (last zxid)");
>> }
>> Of course, it's possible that another client connected to a different
>> server doesn't see the create.
>> Alex
>> On Fri, Aug 24, 2012 at 5:15 PM, Bill Bridge <bill.bridge@oracle.com>
>> wrote:
>>> Mattias,
>>> Is it possible that after you get NODEEXISTS from creation and before you
>>> do the second getData(), you reconnect to another ZooKeeper instance? If
>>> so, maybe the new connection is to a follower that has not yet seen the
>>> creation. If this is what is happening, then a sync() after the second
>>> NONODE with a third getData() should work. By only doing the sync() when
>>> you hit the unusual race condition it will have no performance impact.
>>> Bill
>>> On 8/23/2012 8:21 AM, Mattias Persson wrote:
>>>> Hi David,
>>>> There is nowhere in the code where that node gets deleted. If we refrain
>>>> from that suspicion, could there be something else?
>>>> 2012/8/23 David Nickerson <davidnickerson4mailinglists@gmail.com>
>>>>> It's a little difficult to guess what your application is doing, but it
>>>>> sounds like there's "someone else" who can create and delete the nodes
>>>>> you're trying to work with. So when you create the node and check its
>>>>> data, someone else might have deleted it before you got the chance to
>>>>> check the data. The same is true when you check that it exists and then
>>>>> check the data. You could ensure that the node won't be deleted by using
>>>>> ACLs or giving the node a sequential ephemeral child.
>>>>> On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson
>>>>> <mattias@neotechnology.com> wrote:
>>>>>> Hi,
>>>>>> I've got a problem that I've seen on only a few occasions and which
>>>>>> confuses me a bit. Basically I construct a ZooKeeper client (I'm
>>>>>> running version 3.3.2) where there's a ZK quorum of size 3 running. I
>>>>>> get a SyncConnected event in a Watcher of mine, and in that watcher I do
>>>>>> get-or-create(-if-absent) behaviour where I first do a:
>>>>>>     zooKeeper.getData( myPath, false, null );
>>>>>> If that produces a NONODE code I'll try to create it with:
>>>>>>     zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE,
>>>>>> );
>>>>>> If that fails with a NODEEXISTS code I'll just get it, assuming someone
>>>>>> else made it before me. What I see from this getData call that I do
>>>>>> after getting this NODEEXISTS code, which is the same as the first one,
>>>>>> is that I'll get a NONODE code back. A given in this scenario is that it
>>>>>> is 100% certain that this node exists in the quorum at myPath in the
>>>>>> first place.
>>>>>> Questions:
>>>>>> 1) How can this happen?
>>>>>> 2) Do I use ZooKeeper here in an improper way?
>>>>>> 3) Will a later version fix any potential issue I might have hit?
>>>>>> 4) What are the guarantees around the state of my ZooKeeper instance
>>>>>> after I receive a SyncConnected event? Is it fully synced with the
>>>>>> master at that point, or will a call to sync() be necessary first?
>>>>>> Best,
>>>>>> Mattias
>>>>>> --
>>>>>> Mattias Persson, [mattias@neotechnology.com]
>>>>>> Hacker, Neo Technology
>>>>>> www.neotechnology.com
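
The get-or-create sequence described in the original question, together with Bill's suggestion of a sync() followed by a third getData(), can be sketched roughly as follows. This is only an illustration under stated assumptions: NodeStore, GetOrCreate and LaggingStore are hypothetical names, not ZooKeeper classes. In the real Java client, getData and create signal NONODE and NODEEXISTS by throwing KeeperException subclasses rather than returning null or false, and sync() is asynchronous and takes a callback; here all three are modelled as simple blocking calls so the control flow stays visible.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the three client calls discussed in the thread.
interface NodeStore {
    byte[] getData(String path);              // null models a NONODE result
    boolean create(String path, byte[] data); // false models NODEEXISTS
    void sync(String path);                   // returns once the local view has caught up
}

final class GetOrCreate {
    static byte[] getOrCreate(NodeStore store, String path, byte[] initial) {
        byte[] data = store.getData(path);    // first getData
        if (data != null) {
            return data;
        }
        if (store.create(path, initial)) {
            return initial;                   // we created the node ourselves
        }
        data = store.getData(path);           // second getData, after NODEEXISTS
        if (data != null) {
            return data;
        }
        // Unusual case from the thread: NODEEXISTS followed by NONODE.
        // Only now pay for a sync(), then read a third time.
        store.sync(path);
        data = store.getData(path);
        if (data == null) {
            throw new IllegalStateException("node vanished after sync: " + path);
        }
        return data;
    }
}

// In-memory fake whose reads deliberately lag behind committed writes,
// purely to exercise the fallback path above.
final class LaggingStore implements NodeStore {
    private final Map<String, byte[]> committed = new HashMap<>();
    private final Map<String, byte[]> localView = new HashMap<>();
    int syncCalls = 0;

    // Simulates another client creating the node via a different server.
    void commitElsewhere(String path, byte[] data) {
        committed.put(path, data);
    }

    public byte[] getData(String path) {
        return localView.get(path);
    }

    public boolean create(String path, byte[] data) {
        if (committed.containsKey(path)) {
            return false;
        }
        committed.put(path, data);
        localView.put(path, data);
        return true;
    }

    public void sync(String path) {
        syncCalls++;
        localView.putAll(committed);
    }
}
```

The fake store is artificially stale; whether a real client can actually end up in that state after a reconnect is exactly what the replies above debate. Doing the sync() only on the rare NODEEXISTS-then-NONODE path follows Bill's point that the common case then pays no extra cost.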
