zookeeper-user mailing list archives

From Mattias Persson <matt...@neotechnology.com>
Subject Re: Node being there and not at the same time
Date Fri, 31 Aug 2012 07:00:05 GMT
Thanks for your great feedback. I'll find out more about any reconnects
around that time and may post some more questions with some code if there
still seem to be problems.

Best,
Mattias

2012/8/31 Alexander Shraer <shralex@gmail.com>

> This sounds like a good idea. I'm not sure how easy it would be to
> implement as the client may need to be in a new sort of "conditional"
> state.
>
> Alex
>
> On Thu, Aug 30, 2012 at 10:50 PM, Bill Bridge <bill.bridge@oracle.com> wrote:
>
> > Nothing to be sorry about, I was wrong to suggest a client could see an
> > old state by reconnecting. When you said that it should not be allowed I
> > realized that had to be the case. I saw that email too and realized it had
> > something to do with this subject.
> >
> > It would seem nicer to simply do a sync() when this happens rather than
> > refusing the connection. We could destroy the connection if the client is
> > still in the future after a sync(). There is something seriously wrong if
> > the client is still in the future after a sync(). If this happened with the
> > current code the client would just keep trying until the connection finally
> > worked and we would not find out that something is wrong. I suppose the
> > client's last zxid could have been corrupted in his memory causing this
> > problem. It would be good to have this disconnect and fail the client
> > rather than spin.
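
Editor's sketch: the proposal in the paragraph above (sync first, and fail the client only if it is still ahead afterwards) could be modelled roughly as follows. This is a hypothetical, self-contained model and not ZooKeeper's server code; the names AdmissionSketch, Server, Decision and admit are assumptions made up for illustration, and sync() is treated here as an instant catch-up to the leader's zxid.

```java
/** Hypothetical model of the proposed admission check; not ZooKeeper's classes. */
class AdmissionSketch {

    /** Possible outcomes of the proposed check (names are assumptions). */
    enum Decision { ACCEPT, ACCEPT_AFTER_SYNC, FAIL_CLIENT }

    /** Minimal stand-in for a server that may lag behind the leader. */
    static class Server {
        long lastProcessedZxid;
        final long leaderZxid; // the zxid a sync() would catch up to

        Server(long lastProcessedZxid, long leaderZxid) {
            this.lastProcessedZxid = lastProcessedZxid;
            this.leaderZxid = leaderZxid;
        }

        /** Modelled as an immediate catch-up with the leader. */
        void sync() {
            lastProcessedZxid = Math.max(lastProcessedZxid, leaderZxid);
        }
    }

    /** Accept if the server is current enough, otherwise sync once and re-check. */
    static Decision admit(long clientLastZxidSeen, Server server) {
        if (clientLastZxidSeen <= server.lastProcessedZxid) {
            return Decision.ACCEPT;
        }
        server.sync();
        if (clientLastZxidSeen <= server.lastProcessedZxid) {
            return Decision.ACCEPT_AFTER_SYNC;
        }
        // Still "in the future" after a sync: something is seriously wrong,
        // so fail the client instead of letting it retry forever.
        return Decision.FAIL_CLIENT;
    }

    public static void main(String[] args) {
        System.out.println(admit(0x10, new Server(0x20, 0x20)));
        System.out.println(admit(0x20, new Server(0x10, 0x20)));
        System.out.println(admit(0x30, new Server(0x10, 0x20)));
    }
}
```

The third call models the corrupted-zxid case mentioned above: the client claims a zxid that even the leader has not reached, so the sketch fails it rather than letting it spin.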
> >
> > Without the connection you cannot do the sync() yourself. It is
> > conceivable that it will be a few seconds before there is another server
> > that is current enough to connect with. Maybe the other servers are in
> > different data centers and it would not be efficient to connect to them.
> >
> > Bill
> >
> > On 8/30/2012 10:21 PM, Alexander Shraer wrote:
> >
> > Bill,
> >
> > I'm sorry - you were right and I totally quoted the wrong place in the
> > code. The code that ensures that a client doesn't "go back in time" by
> > connecting to a server that is less up to date than that client is most
> > probably this one from ZooKeeperServer.java. I realized it after looking
> > at Simon's question on the mailing list today...
> >
> >     if (connReq.getLastZxidSeen() > zkDb.dataTree.lastProcessedZxid)
> >         String msg = "Refusing session request for client "
> >             + cnxn.getRemoteSocketAddress()
> >             + " as it has seen zxid 0x"
> >             + Long.toHexString(connReq.getLastZxidSeen())
> >             + " our last zxid is 0x"
> >             + Long.toHexString(getZKDatabase().getDataTreeLastProcessedZxid())
> >             + " client must try another server";
> >
> > On Mon, Aug 27, 2012 at 10:22 AM, Bill Bridge <bill.bridge@oracle.com> wrote:
> >
> >> Alex,
> >> You certainly know the code much better than I, so I may be mistaken
> >> here. It looks to me like waitForEpochAck() is about changes in the set of
> >> peers, and is not related to client connect/disconnects. I do not see how
> >> this would be called if a client disconnected due to some problem of his
> >> own, such as being too slow to heartbeat, then reconnected to a different
> >> peer or observer.
> >>
> >> You suggest that a reconnecting client should ensure the new server has
> >> seen all transactions that the client has seen. This sounds like the right
> >> thing to do. This would certainly eliminate the race condition I
> >> postulated. This sounds like the kind of thing someone would have already
> >> thought of. If this is not already done then it would be a good change to
> >> make. I do not know where the code to do that would be. It could be part of
> >> the server reconnect code or it could be a sync() in the client library.
> >>
> >> If Mattias's code creates a new session when reconnecting, rather than
> >> reconnecting to the same session, then he could have the problem described
> >> even if reconnect ensures the client is not ahead of the server. He could
> >> fix this either by reconnecting to the same session, or simply doing a
> >> sync() when necessary.
> >>
> >> Thanks,
> >> Bill
> >>
> >>
> >> On 8/24/2012 6:11 PM, Alexander Shraer wrote:
> >>
> >>> Bill,  if I understand correctly this shouldn't be possible - the
> >>> client will not be able to connect to a server that is
> >>> less up-to-date than that same client. So if the create completed at
> >>> the client before it disconnects the new server will have to know
> >>> about it too otherwise the connection will fail. See
> >>> Leader.waitForEpochAck:
> >>>
> >>> if (ss.isMoreRecentThan(leaderStateSummary)) {
> >>>     throw new IOException("Follower is ahead of the leader, leader summary: "
> >>>             + leaderStateSummary.getCurrentEpoch()
> >>>             + " (current epoch), "
> >>>             + leaderStateSummary.getLastZxid()
> >>>             + " (last zxid)");
> >>> }
> >>>
> >>> Of course, it's possible that another client connected to a different
> >>> server doesn't see the create.
> >>>
> >>> Alex
> >>>
> >>>
> >>> On Fri, Aug 24, 2012 at 5:15 PM, Bill Bridge <bill.bridge@oracle.com> wrote:
> >>>
> >>>> Mattias,
> >>>>
> >>>> Is it possible that after you get NODEEXISTS from creation and before
> >>>> you do the second getData(), you reconnect to another ZooKeeper
> >>>> instance? If so, maybe the new connection is to a follower that has not
> >>>> yet seen the creation. If this is what is happening, then a sync() after
> >>>> the second NONODE with a third getData() should work. By only doing the
> >>>> sync() when you hit the unusual race condition it will have no
> >>>> performance impact.
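
Editor's sketch: the suggestion above, combined with the get-or-create sequence from the original question, could look roughly like this. It is a minimal, self-contained sketch under stated assumptions: Client is a hypothetical stand-in for the three client calls discussed in the thread, the two exception types are hypothetical unchecked stand-ins for the NONODE and NODEEXISTS codes, and sync() is treated as blocking, whereas the real ZooKeeper client's sync() is asynchronous and completes through a callback. LaggingClient is a made-up in-memory fake that exists only to reproduce the race.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of get-or-create with a sync() fallback; all names are hypothetical. */
class GetOrCreateSketch {

    /** Hypothetical stand-in for the client calls discussed in the thread. */
    interface Client {
        byte[] getData(String path);
        void create(String path, byte[] data);
        void sync(String path); // assumed blocking here
    }

    /** Stand-ins for the NONODE and NODEEXISTS result codes. */
    static class NoNodeException extends RuntimeException {}
    static class NodeExistsException extends RuntimeException {}

    static byte[] getOrCreate(Client zk, String path, byte[] initial) {
        try {
            return zk.getData(path);              // first read
        } catch (NoNodeException absent) {
            // not visible yet, so try to create it
        }
        try {
            zk.create(path, initial);
            return initial;
        } catch (NodeExistsException raced) {
            // someone else created it first
        }
        try {
            return zk.getData(path);              // second read
        } catch (NoNodeException stale) {
            // Rare case: the server we read from has not seen the create yet.
            // Only now pay for a sync(), then read a third time.
            zk.sync(path);
            return zk.getData(path);
        }
    }

    /** In-memory fake: reads hit a lagging replica, writes hit the leader. */
    static class LaggingClient implements Client {
        final Map<String, byte[]> leader = new HashMap<>();
        final Map<String, byte[]> replica = new HashMap<>();
        int syncCalls = 0;

        public byte[] getData(String path) {
            byte[] data = replica.get(path);
            if (data == null) throw new NoNodeException();
            return data;
        }

        public void create(String path, byte[] data) {
            if (leader.containsKey(path)) throw new NodeExistsException();
            leader.put(path, data);
        }

        public void sync(String path) {
            syncCalls++;
            replica.putAll(leader); // replica catches up with the leader
        }
    }

    public static void main(String[] args) {
        LaggingClient zk = new LaggingClient();
        zk.leader.put("/cfg", new byte[] {42}); // created elsewhere, replica is behind
        byte[] data = getOrCreate(zk, "/cfg", new byte[] {1});
        System.out.println(data[0] + " after " + zk.syncCalls + " sync call(s)");
    }
}
```

In the common paths (the node is already visible, or the create succeeds) the sketch never calls sync(), which matches the point about only paying the cost when the race is actually hit.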
> >>>>
> >>>> Bill
> >>>>
> >>>>
> >>>> On 8/23/2012 8:21 AM, Mattias Persson wrote:
> >>>>
> >>>>> Hi David,
> >>>>>
> >>>>> There is nowhere in the code where that node gets deleted. If we set
> >>>>> that suspicion aside, could there be something else?
> >>>>>
> >>>>> 2012/8/23 David Nickerson <davidnickerson4mailinglists@gmail.com>
> >>>>>
> >>>>>> It's a little difficult to guess what your application is doing, but it
> >>>>>> sounds like there's "someone else" who can create and delete the nodes
> >>>>>> you're trying to work with. So when you create the node and check its
> >>>>>> data, someone else might have deleted it before you got the chance to
> >>>>>> check the data. The same is true when you check that it exists and then
> >>>>>> check the data. You could ensure that the node won't be deleted by using
> >>>>>> ACLs or giving the node a sequential ephemeral child.
> >>>>>>
> >>>>>> On Thu, Aug 23, 2012 at 6:30 AM, Mattias Persson
> >>>>>> <mattias@neotechnology.com> wrote:
> >>>>>>
> >>>>>>> Hi,
> >>>>>>>
> >>>>>>> I've got a problem that I've seen on only a few occasions and which
> >>>>>>> confuses me a bit. Basically I construct a ZooKeeper client (I'm running
> >>>>>>> version 3.3.2) where there's a ZK quorum of size 3 running. I get a
> >>>>>>> SyncConnected event in a Watcher of mine and in that watcher I do a
> >>>>>>> get-or-create(-if-absent) behaviour where I first do a:
> >>>>>>>
> >>>>>>>     zooKeeper.getData( myPath, false, null );
> >>>>>>>
> >>>>>>> If that produces a NONODE code I'll try to create it with:
> >>>>>>>
> >>>>>>>     zooKeeper.create( myPath, smallByteArray, OPEN_ACL_UNSAFE, PERSISTENT );
> >>>>>>>
> >>>>>>> If that fails with a NODEEXISTS code I'll just get it, assuming someone
> >>>>>>> else made it before me. What I see from the getData call that I do after
> >>>>>>> getting this NODEEXISTS code (the same call as the first one, btw) is
> >>>>>>> that I get a NONODE code back. A given in this scenario is that I'm 100%
> >>>>>>> certain that this node already exists in the quorum at myPath in the
> >>>>>>> first place.
> >>>>>>>
> >>>>>>> Questions:
> >>>>>>> 1) How can this happen?
> >>>>>>> 2) Do I use ZooKeeper here in an improper way?
> >>>>>>> 3) Will a later version fix any potential issue I might have hit?
> >>>>>>> 4) What are the guarantees around the state of my ZooKeeper instance
> >>>>>>> after I receive a SyncConnected event? Is it fully synced with the
> >>>>>>> master at that point, or will a call to sync() be necessary first?
> >>>>>>>
> >>>>>>> Best,
> >>>>>>> Mattias
> >>>>>>>
> >>>>>>> --
> >>>>>>> Mattias Persson, [mattias@neotechnology.com]
> >>>>>>> Hacker, Neo Technology
> >>>>>>> www.neotechnology.com
> >>>>>>>
> >>>>>>>
> >>>>>
> >>
> >
> >
>



-- 
Mattias Persson, [mattias@neotechnology.com]
Hacker, Neo Technology
www.neotechnology.com
