zookeeper-user mailing list archives

From David Rosenstrauch <dar...@darose.net>
Subject Re: counter with zookeeper
Date Thu, 02 Dec 2010 15:24:53 GMT
We're using ZK to implement something similar.  We need a Hadoop job to 
assign new IDs a) without hitting a database, and b) while ensuring that 
the IDs assigned are unique (i.e., that the numerous simultaneous tasks 
in the Hadoop job don't contend with each other and/or corrupt the 
"next ID value").  So we wrote a small library on top of ZK to do this, 
and it's working out quite nicely.  For details, see: 
http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/201008.mbox/%3C4C5B7656.4020200@darose.net%3E

I had been planning to release this as open source to the community, 
and still am (see: 
http://mail-archives.apache.org/mod_mbox/hadoop-zookeeper-user/201008.mbox/%3C4C618579.5060806@darose.net%3E).
I just haven't gotten around to cleaning it up for release yet.
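
For illustration only, here is a minimal sketch of one way concurrent tasks can claim non-overlapping ID ranges from a shared counter. It is not the library mentioned above, and it does not talk to ZooKeeper: `VersionedNode`, `BadVersionError` and `reserve_block` are hypothetical names, and the node is an in-memory stand-in for a single znode. The assumption is that, against a real ensemble, `get` would map to `getData` (which reports the znode's version in its `Stat`) and `set` to `setData(path, data, version)`, which fails with `BadVersionException` when the version no longer matches. The counter value is kept in the node's payload, so it can be a long; reserving a whole block per call means each task touches the counter only once per block.

```python
import threading


class BadVersionError(Exception):
    """Raised when the caller's expected version is stale."""


class VersionedNode:
    """In-memory stand-in for one znode (hypothetical, not a ZooKeeper client)."""

    def __init__(self, value=0):
        self._lock = threading.Lock()
        self._value = value
        self._version = 0

    def get(self):
        # Return the payload together with the version it was read at.
        with self._lock:
            return self._value, self._version

    def set(self, value, expected_version):
        # Conditional write: succeeds only if nobody wrote since our read.
        with self._lock:
            if expected_version != self._version:
                raise BadVersionError(expected_version)
            self._value = value
            self._version += 1


def reserve_block(node, block_size):
    """Reserve block_size consecutive IDs; return the half-open range (start, end)."""
    while True:
        start, version = node.get()
        try:
            node.set(start + block_size, version)
        except BadVersionError:
            continue  # another task won the race; re-read and retry
        return start, start + block_size


if __name__ == "__main__":
    counter = VersionedNode()
    print(reserve_block(counter, 1000))  # (0, 1000)
    print(reserve_block(counter, 1000))  # (1000, 2000)
```

A task that receives `(start, end)` can assign IDs `start` through `end - 1` locally; any IDs left unused when the task exits are simply skipped, which is acceptable if IDs only need to be unique rather than dense.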

DR

On 12/02/2010 09:29 AM, Claudio Martella wrote:
> Hi,
>
> I'm trying to implement a String->Long dictionary, as I'm doing text
> processing in M/R and would like to speed things up.
> In order to implement the mapping, I need to access a high speed atomic
> counter that allows me to pick the latest used Long, increment it and
> use it for the latest-discovered new word to put in the dictionary.
>
> At first I thought about using a regular sequential znode and using the
> sequence number as the counter value, but I realize the sequence number
> is an int, while I'd like a long. Is that correct? I'm referring to
> Stat.getVersion() in the API.
>
> In case this strategy is unfeasible, the second possibility is to use a
> WriteLock on "/counter" to control access to the payload of the znode,
> where I'd put the counter value, or access to a special row in
> Cassandra, where I'd put the counter value. The Cassandra option is
> probably the best possibility, as I'm storing my dictionary there
> anyway, but I'd like to hear from you about latency and performance for
> these options in ZK.
>
>
> Thanks
>
> Claudio
