hadoop-zookeeper-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: Sequence Number Generation With Zookeeper
Date Fri, 06 Aug 2010 02:41:26 GMT
On 08/05/2010 06:31 PM, Jonathan Holloway wrote:
> Hi all,
>
> I'm looking at using Zookeeper for distributed sequence number generation.
>   What's the best way to do this currently?  Is there a particular recipe
> available for this?
>
> My ideas so far involve:
> a) Creating a node with PERSISTENT_SEQUENTIAL then deleting it - this gives
> me the monotonically increasing number, but the sequence number isn't
> contiguous
> b) Storing the sequence number in the data portion of a persistent node -
> then updating this (using the version number - aka optimistic locking).  The
> problem with this is that under high load I'm assuming there'll be a lot of
> contention and hence failures with regards to updates.
>
> What are your thoughts on the above?
>
> Many thanks,
> Jon.

I just ran into this exact situation, and handled it like so:

I wrote a library that uses option (b) as you described it above.  But
instead of requesting a single sequence number, you request a block of
them at a time from Zookeeper, and then use them up locally, one by one,
from the block you retrieved.  Retrieving by block (e.g., 10000 at a
time) all but eliminates the contention issue, since each client only
has to update the node once per block rather than once per ID.
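
To make the block idea concrete, here is a minimal sketch in Python. It is
not the library itself. VersionedNode is a hypothetical in-memory stand-in
for a znode whose data holds the next unallocated number; against a real
ensemble the equivalent write would be setData with the expected version,
which fails with BadVersionException if another client got there first.
BlockSequence and its method names are likewise made up for illustration.

```python
import threading


class VersionedNode:
    """Hypothetical in-memory stand-in for a znode: data plus a version."""

    def __init__(self, data=0):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        """Write only if the version is unchanged; return True on success."""
        with self._lock:
            if expected_version != self._version:
                return False  # someone else updated the node first
            self._data = data
            self._version += 1
            return True


class BlockSequence:
    """Hands out IDs locally, reserving a whole block per node update."""

    def __init__(self, node, block_size=10000):
        self._node = node
        self._block_size = block_size
        self._next = 0
        self._end = 0  # exclusive upper bound; no block reserved yet

    def _reserve_block(self):
        # Optimistic locking: re-read and retry until our write wins.
        while True:
            start, version = self._node.get()
            if self._node.set(start + self._block_size, version):
                self._next, self._end = start, start + self._block_size
                return

    def next_id(self):
        if self._next >= self._end:
            self._reserve_block()
        value = self._next
        self._next += 1
        return value
```

With a block size of 10000, a client touches the shared node once per
10000 calls to next_id(), so version conflicts become rare even with many
clients.  The trade-off is that IDs are unique but not handed out in
global order across clients.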

Then, if you're finished assigning IDs from that block but still have
a bunch of unused IDs left in it, the library has another function to
"push back" the unused IDs.  They'll then get handed out again in the
next block retrieval.
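
One way the "push back" step could work is sketched below in Python.
This is an assumption about the design, not the library's actual code:
here the node's data is a pair (next unallocated number, tuple of
returned ranges), and a block retrieval serves a returned range before
allocating a fresh one.  VersionedNode, reserve_block and push_back are
hypothetical names; VersionedNode stands in for a znode updated with a
version check.

```python
import threading


class VersionedNode:
    """Hypothetical in-memory stand-in for a znode: data plus a version."""

    def __init__(self, data):
        self._lock = threading.Lock()
        self._data = data
        self._version = 0

    def get(self):
        with self._lock:
            return self._data, self._version

    def set(self, data, expected_version):
        """Write only if the version is unchanged; return True on success."""
        with self._lock:
            if expected_version != self._version:
                return False
            self._data = data
            self._version += 1
            return True


def reserve_block(node, block_size):
    """Return a (start, end) range, end exclusive, reusing returned IDs first."""
    while True:
        (counter, returned), version = node.get()
        if returned:
            # Assumed policy: serve the oldest pushed-back range as-is.
            block = returned[0]
            new_state = (counter, returned[1:])
        else:
            block = (counter, counter + block_size)
            new_state = (counter + block_size, returned)
        if node.set(new_state, version):
            return block


def push_back(node, start, end):
    """Give the unused range [start, end) back for a later retrieval."""
    if start >= end:
        return  # nothing left to return
    while True:
        (counter, returned), version = node.get()
        if node.set((counter, returned + ((start, end),)), version):
            return
```

For example, a client that reserved (0, 10) and used only 0 through 3
would call push_back(node, 4, 10), and the next reserve_block call would
return (4, 10).  Note that in this sketch a reused range can be smaller
than the requested block size.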

We don't actually have this code running in production yet, so I can't 
vouch for how well it works.  But the design was reviewed and given the 
thumbs up by the core developers on the team, and the implementation 
passes all my unit tests.

HTH.  Feel free to email back with specific questions if you'd like more 
details.

DR
