hadoop-zookeeper-user mailing list archives

From Ted Dunning <ted.dunn...@gmail.com>
Subject Re: Sequence Number Generation With Zookeeper
Date Wed, 11 Aug 2010 22:23:17 GMT
Can't happen.

In a network partition, the side without a quorum can't update the file
version.

On Wed, Aug 11, 2010 at 3:11 PM, Adam Rosien <adam@rosien.net> wrote:

> What happens during a network partition and different clients are
> incrementing "different" counters, and then the partition goes away?
> Won't (potentially) the same sequence value be given out to two
> clients?
>
> .. Adam
>
> On Thu, Aug 5, 2010 at 5:38 PM, Jonathan Holloway
> <jonathan.holloway@gmail.com> wrote:
> > Hi Ted,
> >
> > Thanks for the comments.
> >
> > I might have overlooked something here, but is it also possible to do the
> > following:
> >
> > 1. Create a PERSISTENT node
> > 2. Have multiple clients set the data on the node, e.g.  Stat stat =
> > zookeeper.setData(SEQUENCE, ArrayUtils.EMPTY_BYTE_ARRAY, -1);
> > 3. Use the version number from stat.getVersion() as the sequence
> > (obviously I'm limited to Integer.MAX_VALUE)
> >
> > Are there any weird race conditions involved here which would mean that a
> > client would receive the wrong Stat object back?
> >
> > Many thanks again,
> > Jon.
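[Editor's note: Jon's idea above leans on the fact that ZooKeeper bumps a znode's data version atomically on every successful setData, so each writer gets a distinct version back in the Stat. A minimal in-memory sketch of that mechanism (not real ZooKeeper client code, which needs a running ensemble; the class and method names are hypothetical stand-ins):]

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class VersionSequenceDemo {
    // In-memory analogue of one znode's version counter. A real znode
    // increments its data version atomically on each successful setData(),
    // so every writer observes a distinct version in the returned Stat.
    static final AtomicInteger version = new AtomicInteger(0);

    // Models "Stat stat = zk.setData(SEQUENCE, EMPTY, -1); stat.getVersion()":
    // an unconditional write that atomically bumps and returns the version.
    static int nextSequence() {
        return version.incrementAndGet();
    }

    // Have `writers` concurrent clients each take one sequence number.
    static Set<Integer> takeSequences(int writers) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        Set<Integer> seen = ConcurrentHashMap.newKeySet();
        for (int i = 0; i < writers; i++) {
            pool.submit(() -> seen.add(nextSequence()));
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return seen;  // all distinct: no two writers share a version
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(takeSequences(1000).size()); // 1000 distinct numbers
    }
}
```

[As Ted notes below, the caveat is the partition case: a client on the minority side simply cannot complete the setData, so duplicates cannot be handed out.]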
> >
> > On 5 August 2010 16:09, Ted Dunning <ted.dunning@gmail.com> wrote:
> >
> >> (b)
> >>
> >> BUT:
> >>
> >> Sequential numbering is a special case of "now".  In large diameters,
> >> now gets very expensive.  This is a special case of that assertion.  If
> >> there is a way to get away from this presumption of the need for
> >> sequential numbering, you will be miles better off.
> >>
> >> HOWEVER:
> >>
> >> ZK can do better than you suggest.  Incrementing a counter does involve
> >> potential contention, but you will very likely be able to get to pretty
> >> high rates before the optimistic locking begins to fail.  If you code
> >> your update with a few tries at full speed followed by some form of
> >> retry back-off, you should get pretty close to the best possible
> >> performance.
> >>
> >> You might also try building a lock with an ephemeral file before
> >> updating the counter.  I would expect that this will be slower than the
> >> back-off option if only because it involves more transactions in ZK.  If
> >> you wanted to get too complicated for your own good, you could have a
> >> secondary strategy flag that is only sampled by all clients every few
> >> seconds and is updated whenever a client needs to back off more than,
> >> say, 5 steps.  If this flag has been updated recently, then clients
> >> should switch to the locking protocol.  You might even have several
> >> locks so that you don't exclude all other updaters, merely thin them
> >> out a bit.  This flagged strategy would run as fast as optimistic
> >> locking as long as optimistic locking is fast and then would limit the
> >> total number of transactions needed under very high load.
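[Editor's note: the "few tries at full speed, then back off" loop Ted describes can be sketched as follows. A conditional ZK setData(path, data, expectedVersion) succeeds only when the version matches, which is a compare-and-set; the sketch below models that with an AtomicLong rather than a live ZooKeeper connection, and the try counts and delays are hypothetical tuning values:]

```java
import java.util.concurrent.ThreadLocalRandom;
import java.util.concurrent.atomic.AtomicLong;

public class BackoffCounter {
    // Stand-in for the znode's data: a conditional setData with an expected
    // version behaves like a compare-and-set, failing on a stale read
    // (KeeperException.BadVersionException in the real client).
    static final AtomicLong store = new AtomicLong(0);

    static final int FAST_TRIES = 3;        // full-speed attempts first
    static final long BASE_BACKOFF_MS = 5;  // hypothetical starting delay

    // Optimistic read-modify-write: a few tries at full speed, then
    // exponential back-off with jitter, per the suggestion above.
    static long increment() throws InterruptedException {
        int attempt = 0;
        while (true) {
            long current = store.get();             // read value + version
            long next = current + 1;
            if (store.compareAndSet(current, next)) // conditional write
                return next;                        // no version conflict
            attempt++;
            if (attempt >= FAST_TRIES) {            // contention: slow down
                long delay = BASE_BACKOFF_MS << Math.min(attempt - FAST_TRIES, 6);
                Thread.sleep(delay + ThreadLocalRandom.current().nextLong(delay + 1));
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread[] ts = new Thread[4];
        for (int i = 0; i < ts.length; i++) {
            ts[i] = new Thread(() -> {
                try { for (int k = 0; k < 250; k++) increment(); }
                catch (InterruptedException e) { Thread.currentThread().interrupt(); }
            });
            ts[i].start();
        }
        for (Thread t : ts) t.join();
        System.out.println(store.get()); // 1000: each increment applied exactly once
    }
}
```

[The capped shift (`Math.min(..., 6)`) keeps the worst-case delay bounded, and the random jitter avoids contending clients retrying in lockstep.]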
> >>
> >> On Thu, Aug 5, 2010 at 3:31 PM, Jonathan Holloway <
> >> jonathan.holloway@gmail.com> wrote:
> >>
> >> > My attempts so far involve:
> >> > a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this
> >> > gives me the monotonically increasing number, but the sequence number
> >> > isn't contiguous
> >> > b) Storing the sequence number in the data portion of a persistent
> >> > node - then updating this (using the version number - aka optimistic
> >> > locking).  The problem with this is that under high load I'm assuming
> >> > there'll be a lot of contention and hence failures with regard to
> >> > updates.
> >> >
> >>
> >
>
