zookeeper-user mailing list archives

From Martin Waite <waite....@googlemail.com>
Subject Re: how to lock one-of-many ?
Date Wed, 24 Feb 2010 19:30:44 GMT
Hi Patrick,

Thanks for the info - the Fallacies link especially.  As you might have
guessed, I am one of the programmers new to distributed computing who is
very much in danger of messing things up.

I am going to have to knuckle down and do some experiments.  Thankfully, I
don't think my requirements will stretch Zookeeper even if I take a
heavy-handed approach.


On 24 February 2010 16:53, Patrick Hunt <phunt@apache.org> wrote:

> Martin Waite wrote:
>> The watch mechanism is a new feature for me.  This gives me a delayed
>> notification that something changed in the lock directory, and so is the
>> earliest time that it makes sense to retry my lock acquisition.  However,
>> given the time delay in getting the notification, the freed lock might
>> have been acquired by someone else before I get there.  In which case, I
>> might as well just keep trying to acquire locks at random until my time
>> budget is exhausted and not bother with the watch?
>
> I don't see the benefit of what Mahadev/Ted are suggesting vs Martin's
> original proposal. Perhaps I'm missing something, please correct me if I'm
> wrong, but it seems to me that you want two "lists": a list of resources and
> a list of locks. Resources might be added or removed dynamically over time
> (assuming they are not known a priori); locks are short-lived and exclusive.
> To me this suggests:
>
>   /resources/resource_###  (ephem? owned by the resource itself)
>   /locks/resource_###   (ephem)
>
> where the available resources are managed by adding/removing from
> /resources. Anyone interested in locking an explicit resource attempts to
> create an ephemeral node in /locks with the same ### as the resource they
> want access to. If interested in just getting "any" resource then you would
> getchildren(/resources) and getchildren(/locks) and attempt to lock anything
> not in the intersection (avail). This could be done efficiently since
> resources won't change much: just cache the results of getchildren and set a
> watch at the same time. To lock a resource, randomize "avail" and attempt to
> lock each in turn. If every attempt to acquire a lock in avail fails, then
> have some random holdoff time, then re-getchildren(locks) and start over.
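
To check my understanding of the "any resource" scheme, here is a minimal
Python sketch of it. It runs against an in-memory stand-in, not a real
ensemble: FakeZk, lock_any and lock_any_with_retry are names made up for
illustration, and FakeZk.create only imitates the way creating a node that
already exists fails. With a real client the create would be an ephemeral
node under /locks, and the attempts/max_holdoff values are arbitrary.

```python
import random
import time


class FakeZk:
    """In-memory stand-in for a ZooKeeper client (hypothetical, not the real API)."""

    def __init__(self, resources):
        self.nodes = {"/resources": set(resources), "/locks": set()}

    def get_children(self, path):
        return set(self.nodes[path])

    def create(self, path, name):
        """Imitate an exclusive create: return False if the node already exists."""
        if name in self.nodes[path]:
            return False
        self.nodes[path].add(name)
        return True

    def delete(self, path, name):
        self.nodes[path].discard(name)


def lock_any(zk):
    """Try each apparently free resource in random order; return its name or None."""
    avail = list(zk.get_children("/resources") - zk.get_children("/locks"))
    random.shuffle(avail)  # randomize so clients do not all fight over the same node
    for name in avail:
        if zk.create("/locks", name):
            return name
    return None


def lock_any_with_retry(zk, attempts=5, max_holdoff=0.5):
    """Repeat lock_any with a random holdoff between attempts."""
    for attempt in range(attempts):
        if attempt:
            time.sleep(random.uniform(0, max_holdoff))
        name = lock_any(zk)
        if name is not None:
            return name
    return None
```

The snapshot of /locks can be stale by the time create runs, which is why a
failed create just moves on to the next candidate instead of being treated
as an error.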
> Distributed computing is inherently "delayed" http://bit.ly/chhFrS right?
> ;-) The benefit of the watch is typically that it minimizes load on the
> service - notification vs polling.
>> Are watches triggered as soon as the primary controller applies a change
>> to an object - or are they delivered whenever the client's local zk
>> instance replicates the change at some later time?
> They are not synchronous in the sense you mean. You are guaranteed that all
> clients see all changes in the same order, but not
> synchronously/instantaneously.
> This stackoverflow page has some good detail, see Ben's comment here:
> http://bit.ly/aaMzHY
>> Is there a feature to introduce deliberate lag between the primary and its
>> replicas in the ensemble - for development purposes?  That could be useful
>> for exposing latency assumptions.
> No feature, but it does sound interesting. Are there any tools that allow
> one to set up "slow pipes" a la stunnel, but here for latency, not
> encryption? I believe FreeBSD has this feature at the OS (firewall?) level;
> I don't know if Linux does.
> Patrick
>> On 24 February 2010 06:05, Ted Dunning <ted.dunning@gmail.com> wrote:
>>> You have to be careful there of race conditions.  ZK's slightly
>>> surprising API makes it pretty easy to get this right, however.
>>>
>>> The correct way to do what you suggest is to read the list of children in
>>> the locks directory and put a watch on the directory at the same time.  If
>>> the number of locks equals the number of resources, you wait.  If it is
>>> less, you can pick one of the apparently unlocked resources at random.
>>> If you fail, start again by checking the number of resources.
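
A rough Python sketch of that read-and-watch loop, again against an
in-memory stand-in and not a real ensemble. WatchableFakeZk and acquire are
made-up names, and the one-shot watch callback is only my approximation of
how a watch set during a children read behaves.

```python
import random
import threading
import time


class WatchableFakeZk:
    """In-memory stand-in with one-shot child watches (hypothetical, not the real API)."""

    def __init__(self, resources):
        self._mutex = threading.Lock()
        self._nodes = {"/resources": set(resources), "/locks": set()}
        self._watches = {"/resources": [], "/locks": []}

    def get_children(self, path, watch=None):
        """Return the children and register the watch in the same step."""
        with self._mutex:
            if watch is not None:
                self._watches[path].append(watch)
            return set(self._nodes[path])

    def _change(self, path, name, add):
        with self._mutex:
            if add and name in self._nodes[path]:
                return False  # node already exists, nothing changed
            (self._nodes[path].add if add else self._nodes[path].discard)(name)
            fired, self._watches[path] = self._watches[path], []
        for callback in fired:  # each watch fires once, outside the mutex
            callback()
        return True

    def create(self, path, name):
        return self._change(path, name, add=True)

    def delete(self, path, name):
        self._change(path, name, add=False)


def acquire(zk, timeout):
    """Lock any free resource, waiting on the /locks watch while all are taken."""
    deadline = time.monotonic() + timeout
    while True:
        changed = threading.Event()
        locks = zk.get_children("/locks", watch=changed.set)
        free = list(zk.get_children("/resources") - locks)
        if free:
            choice = random.choice(free)
            if zk.create("/locks", choice):
                return choice
            # Lost the race to another client: fall through and start again.
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        if not free:
            changed.wait(remaining)  # everything is locked: sleep until /locks changes
```

Setting the watch in the same call as the read is what closes the race: any
change to /locks after the snapshot was taken will set the event, so the
wait cannot miss a release that happened in between.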
>>> On Tue, Feb 23, 2010 at 9:09 PM, Martin Waite <waite.134@googlemail.com>
>>> wrote:
>>>> I guess another optimisation might be to count the number of locks held
>>>> first:  if the count equals the number of resources, try again later.
>>>> But I suppose that might require a sync call first to ensure that the zk
>>>> instance my client is connected to is up to date.
>>> --
>>> Ted Dunning, CTO
>>> DeepDyve
