Mailing-List: contact zookeeper-user-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Reply-To: zookeeper-user@hadoop.apache.org
MIME-Version: 1.0
In-Reply-To: <4B855983.1070305@apache.org>
References: <8bc75ecf1002230405u4f28a5f4q4a25348b35af3671@mail.gmail.com>
	 <C7A95548.2FC7F%mahadev@yahoo-inc.com>
	 <8bc75ecf1002232109h7132adfdr8b27c92b34fb179a@mail.gmail.com>
	 <c7d45fc71002232205h33964f59u99d34f7f28b5b07@mail.gmail.com>
	 <8bc75ecf1002240325o5a7e2c9al18a6e86877088575@mail.gmail.com>
	 <4B855983.1070305@apache.org>
Date: Wed, 24 Feb 2010 19:30:44 +0000
Message-ID: <8bc75ecf1002241130o262df079p6364c5c8932b47a1@mail.gmail.com>
Subject: Re: how to lock one-of-many ?
From: Martin Waite <waite.134@googlemail.com>
To: zookeeper-user@hadoop.apache.org
Content-Type: multipart/alternative; boundary=0016e65aed2cdeaf1604805db350

--0016e65aed2cdeaf1604805db350
Content-Type: text/plain; charset=ISO-8859-1

Hi Patrick,

Thanks for the info - the Fallacies link especially.  As you might have
guessed, I am one of the programmers new to distributed computing who is very
much in danger of messing things up.

I am going to have to knuckle down and do some experiments.  Thankfully, I
don't think my requirements will stretch ZooKeeper even if I take a
heavy-handed approach.

regards,
Martin

On 24 February 2010 16:53, Patrick Hunt <phunt@apache.org> wrote:

>
> Martin Waite wrote:
>
>> The watch mechanism is a new feature for me.  This gives me a delayed
>> notification that something changed in the lock directory, and so is the
>> earliest time that it makes sense to retry my lock acquisition.  However,
>> given the time delay in getting the notification, the freed lock might have
>> been acquired by someone else before I get there.  In which case, I might as
>> well just keep trying to acquire locks at random until my time budget is
>> exhausted and not bother with the watch?
>>
>>
> I don't see the benefit of what Mahadev/Ted are suggesting vs Martin's
> original proposal. Perhaps I'm missing something, please correct me if I'm
> wrong but it seems to me that you want two "lists"; a list of resources and
> a list of locks. Resources might be added or removed dynamically over time
> (assuming they are not known a priori), locks are short lived and exclusive.
> To me this suggests:
>
> /resources/resource_###  (ephem? owned by the resource itself)
> /locks/resource_###   (ephem)
>
> where the available resources are managed by adding/removing from
> /resources. Anyone interested in locking an explicit resource attempts to
> create an ephemeral node in /locks with the same ### as the resource they
> want access to. If interested in just getting "any" resource then you would
> getchildren(/resources) and getchildren(/locks) and attempt to lock anything
> not in the intersection (avail). This could be done efficiently since
> resources won't change much, just cache the results of getchildren and set a
> watch at the same time. To lock a resource randomize "avail" and attempt to
> lock each in turn. If every attempt on "avail" fails to acquire the lock,
> then have some random holdoff time, then re-getchildren(/locks) and start
> over.
>
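
The scheme described above can be sketched roughly as follows. This is a
minimal illustration in Python, not tested against a real ensemble: FakeZk,
NodeExists and lock_any are hypothetical names invented for this sketch, and
FakeZk is an in-memory stand-in rather than a real ZooKeeper client. A real
client would create ephemeral znodes and could set a watch when listing
children instead of sleeping and re-reading.

```python
import random
import time


class NodeExists(Exception):
    """Raised when create() targets a path that already exists."""


class FakeZk:
    """Hypothetical in-memory stand-in for a ZooKeeper client (not a real API)."""

    def __init__(self):
        self.nodes = set()

    def create(self, path):
        # A real client would create an ephemeral znode here.
        if path in self.nodes:
            raise NodeExists(path)
        self.nodes.add(path)

    def delete(self, path):
        self.nodes.discard(path)

    def get_children(self, parent):
        # Return the names of the direct children of `parent`.
        prefix = parent.rstrip("/") + "/"
        names = (p[len(prefix):] for p in self.nodes if p.startswith(prefix))
        return sorted(n for n in names if n and "/" not in n)


def lock_any(zk, attempts=5, max_holdoff=0.05, rng=random):
    """Try to lock any free resource; return its name, or None after giving up."""
    for _ in range(attempts):
        resources = set(zk.get_children("/resources"))
        locked = set(zk.get_children("/locks"))
        avail = sorted(resources - locked)  # resources that look unlocked
        rng.shuffle(avail)                  # randomize to spread contention
        for name in avail:
            try:
                zk.create("/locks/" + name)
                return name
            except NodeExists:
                continue  # someone else got there first; try the next one
        # Everything was taken: back off for a random time, then re-read.
        time.sleep(rng.uniform(0, max_holdoff))
    return None
```

The caller releases a lock with zk.delete("/locks/" + name). The create call
is what decides ownership; the two get_children reads only narrow down the
candidates, so a stale view costs a failed attempt rather than a double lock.
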
> Distributed computing is inherently "delayed" http://bit.ly/chhFrS right?
> ;-) The benefit of the watch is typically that it minimizes load on the
> service - notification vs polling.
>
>
>> Are watches triggered as soon as the primary controller applies a change to
>> an object - or are they delivered whenever the client's local zk instance
>> replicates the change at some later time?
>>
>>
> They are not synchronous in the sense you mean. You are guaranteed that all
> clients see all changes in the same order, but not
> synchronously/instantaneously.
>
> This stackoverflow page has some good detail, see Ben's comment here:
> http://bit.ly/aaMzHY
>
>
>> Is there a feature to introduce deliberate lag between the primary and its
>> replicas in the ensemble - for development purposes?  That could be useful
>> for exposing latency assumptions.
>>
>>
> No feature but it does sound interesting. Are there any tools that allow
> one to set up "slow pipes" a la stunnel, but for latency rather than
> encryption? I believe FreeBSD has this feature at the OS (firewall?) level, I
> don't know if Linux does.
>
> Patrick
>
>
>
>> On 24 February 2010 06:05, Ted Dunning <ted.dunning@gmail.com> wrote:
>>
>>> You have to be careful there of race conditions.  ZK's slightly surprising
>>> API makes it pretty easy to get this right, however.
>>>
>>> The correct way to do what you suggest is to read the list of children in
>>> the locks directory and put a watch on the directory at the same time.  If
>>> the number of locks equals the number of resources, you wait.  If it is
>>> less, you can pick one of the apparently unlocked resources at random.  If
>>> you fail, start again by checking the number of resources.
>>>
>>> On Tue, Feb 23, 2010 at 9:09 PM, Martin Waite <waite.134@googlemail.com>
>>> wrote:
>>>
>>>> I guess another optimisation might be to count the number of locks held
>>>> first:  if the count equals the number of resources, try again later.  But
>>>> I suppose that might require a sync call first to ensure that the zk
>>>> instance my client is connected to is up to date.
>>>>
>>>>
>>>
>>> --
>>> Ted Dunning, CTO
>>> DeepDyve
>>>
>>>
>>

--0016e65aed2cdeaf1604805db350--