zookeeper-user mailing list archives

From Mahadev Konar <maha...@apache.org>
Subject Re: Question about the Barrier Java example on the ZooKeeper documentation
Date Wed, 09 Mar 2011 16:11:17 GMT
I just added you to the contributors list and assigned the jira to you.

thanks
mahadev

On Wed, Mar 9, 2011 at 1:55 AM, Semih Salihoglu <semih@stanford.edu> wrote:

> I created a bug but I don't see a way to assign it to myself (or anyone
> actually). Here's the link:
> https://issues.apache.org/jira/browse/ZOOKEEPER-1011.
>
> semih
>
>
> On Wed, Mar 9, 2011 at 1:30 AM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
>
>> Hi Semih, Jira is the system we use to report and discuss zookeeper
>> issues:
>>
>> https://issues.apache.org/jira/browse/ZOOKEEPER
>>
>> Once you have an account, you can create a new issue, describe it, and
>> propose a fix to the problem at hand.
>>
>> -Flavio
>>
>> On Mar 8, 2011, at 10:13 PM, Semih Salihoglu wrote:
>>
>> Sure, I'll get to it this weekend probably.
>>
>> I don't know what jira is, so some information on how to do this would be
>> very helpful.
>>
>> Thank you,
>>
>> semih
>>
>> On Tue, Mar 8, 2011 at 8:31 AM, Patrick Hunt <phunt@apache.org> wrote:
>>
>>> On Tue, Mar 8, 2011 at 5:59 AM, Flavio Junqueira <fpj@yahoo-inc.com> wrote:
>>>
>>>> I believe the goal of the examples was never to be a complete solution
>>>> to barriers or queues, but just to give a quick bootstrap to beginners. It
>>>> is true, though, that the documentation page does not make that claim, and
>>>> can be misleading.
>>>>
>>>> I see two possible action points out of this discussion:
>>>> 1- State clearly in the beginning that the example discussed is not
>>>> correct under the assumption that a process may finish the computation
>>>> before another has started, and the example is there for illustration
>>>> purposes;
>>>> 2- Have another example following the current one that discusses the
>>>> problem and shows how to fix it. This is an interesting option that
>>>> illustrates how one could reason about a solution when developing with
>>>> zookeeper.
>>>>
>>>>
>>> This (2) sounds much better to me. Semih, would you like to give that a
>>> try? (updating the docs I mean)
>>>
>>> Patrick
>>>
>>>
>>>> If you are interested in helping us fix it, Semih, then you could
>>>> perhaps create a jira and assign yourself to fix it. I can help you out.
>>>>
>>>> -Flavio
>>>>
>>>> On Mar 7, 2011, at 11:23 AM, Semih Salihoglu wrote:
>>>>
>>>> Hi Mahadev,
>>>>
>>>> Sorry for the late response. I agree. Actually, in this other
>>>> documentation, http://hadoop.apache.org/zookeeper/docs/r3.0.0/recipes.html,
>>>> where there is only the pseudo-code, I think this situation is avoided.
>>>> There, all nodes have a watch on another znode, /ready. After each node
>>>> writes its own ephemeral child, it doesn't wait. It reads how many
>>>> children have been written, and the last one writes the /ready znode,
>>>> which wakes everyone up. The only race condition in this one is that
>>>> two nodes can try to write /ready and only one of them will succeed,
>>>> but this is ok.
>>>>
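The /ready scheme described above can be sketched without a ZooKeeper client. The class below is a hypothetical in-memory model: ReadyBarrierSketch and all of its method names are assumptions made for illustration, not part of ZooKeeper or of the recipes page. It replays, in a single thread, the slow-node interleaving raised at the start of this thread and checks that neither node passes before /ready is written.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical in-memory model of a barrier that is released by a /ready flag. */
class ReadyBarrierSketch {
    private final Set<String> children = new HashSet<>(); // stands in for the ephemeral children
    private final int size;                               // number of participants expected
    private boolean ready = false;                        // stands in for the /ready znode

    ReadyBarrierSketch(int size) { this.size = size; }

    // Models a node writing its own ephemeral child.
    void createChild(String name) { children.add(name); }

    // Models a node counting the children and writing /ready if it sees all of them.
    // Writing /ready when it already exists is harmless in this model.
    void countAndMaybeSignal() {
        if (children.size() >= size) {
            ready = true;
        }
    }

    // Models the condition every node waits on: the existence of /ready.
    boolean readyExists() { return ready; }

    // Models a node leaving and its ephemeral child disappearing.
    void deleteChild(String name) { children.remove(name); }

    /** Replays the slow-node interleaving; true if node1 waits and then both nodes pass. */
    static boolean bothPass() {
        ReadyBarrierSketch b = new ReadyBarrierSketch(2);
        b.createChild("node1");
        b.countAndMaybeSignal();                  // node1 counts 1 child, so no /ready yet
        b.createChild("node2");                   // node2 creates its child but is slow to count
        boolean node1Waiting = !b.readyExists();  // child changes alone do not release node1
        b.countAndMaybeSignal();                  // node2 counts 2 children and writes /ready
        boolean node1Passed = b.readyExists();    // node1 is released
        b.deleteChild("node1");                   // node1 finishes its work and leaves
        boolean node2Passed = b.readyExists();    // /ready still exists, so node2 passes too
        return node1Waiting && node1Passed && node2Passed;
    }

    public static void main(String[] args) {
        System.out.println("both nodes pass: " + bothPass());
    }
}
```

The point of the model is that the release condition no longer depends on the child count at the moment a slow node happens to read it. When /ready should be removed again is outside the scope of this sketch.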
>>>> Thank you again,
>>>>
>>>> semih
>>>>
>>>> On Sat, Mar 5, 2011 at 6:41 PM, Mahadev Konar <mahadev@apache.org>
>>>> wrote:
>>>>
>>>> Semih,
>>>>
>>>> You pointed it out right. It is possible to enter into a situation
>>>> like that. The recipe does have a bug. It can be fixed with the last
>>>> client creating a special znode and every node in the list watching
>>>> for that (so it'll be an indication for entering the barrier). no?
>>>>
>>>> thanks
>>>> mahadev
>>>>
>>>> On Sat, Mar 5, 2011 at 5:06 PM, Semih Salihoglu <semih@stanford.edu>
>>>> wrote:
>>>> Hi All,
>>>>
>>>> I am new to this group and to ZooKeeper. I was reading the Barrier
>>>> tutorial in one of the ZooKeeper documents,
>>>> http://hadoop.apache.org/zookeeper/docs/current/zookeeperTutorial.html.
>>>> A barrier primitive is exactly how I want to use ZooKeeper. I have a
>>>> question about this example. It's not really a ZooKeeper question, it's
>>>> more a question about the Barrier primitive, I think. Here it is: in the
>>>> enter method of this Barrier implementation below
>>>>
>>>> boolean enter() throws KeeperException, InterruptedException {
>>>>     zk.create(root + "/" + name, new byte[0], Ids.OPEN_ACL_UNSAFE,
>>>>               CreateMode.EPHEMERAL_SEQUENTIAL);
>>>>     while (true) {
>>>>         synchronized (mutex) {
>>>>             List<String> list = zk.getChildren(root, true);
>>>>
>>>>             if (list.size() < size) {
>>>>                 mutex.wait();
>>>>             } else {
>>>>                 return true;
>>>>             }
>>>>         }
>>>>     }
>>>> }
>>>>
>>>>
>>>> could there be a race condition? Let's say there are two
>>>> machines/nodes, node1 and node2, that will use this code to synchronize
>>>> over ZK. Let's say the following steps take place:
>>>>
>>>>  1. node1 calls the zk.create method and then reads the number of
>>>>     children, sees that it's 1, and starts waiting.
>>>>  2. node2 calls the zk.create method (doesn't call the
>>>>     zk.getChildren method yet, let's say it's very slow).
>>>>  3. node1 is notified that the number of children on the znode
>>>>     changed, it checks that the size is 2 so it returns from enter, it
>>>>     does its work and then leaves the barrier, deleting its node.
>>>>  4. node2 calls zk.getChildren and, because node1 has already left,
>>>>     it sees that the number of children is equal to 1. Since node1 will
>>>>     never enter the barrier again, node2 will keep waiting.
>>>>
>>>>
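The four steps above can be replayed deterministically. The sketch below is a hypothetical in-memory stand-in for the barrier's child list: BarrierRaceReplay and its method names are assumptions made for illustration, and no ZooKeeper client is involved. It runs the steps in the stated order in a single thread and reports whether node2 ever sees enough children to pass.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical in-memory stand-in for the children of the barrier root znode. */
class BarrierRaceReplay {
    private final List<String> children = new ArrayList<>();
    private final int size; // number of participants the barrier waits for

    BarrierRaceReplay(int size) { this.size = size; }

    // Models zk.create(root + "/" + name, ...): the participant registers itself.
    void create(String name) { children.add(name); }

    // Models the check in enter(): true means "return true", false means "mutex.wait()".
    boolean enoughChildren() { return children.size() >= size; }

    // Models leaving the barrier: the participant deletes its own znode.
    void delete(String name) { children.remove(name); }

    /** Replays steps 1-4 in order and returns whether node2 passes the barrier. */
    static boolean node2PassesBarrier() {
        BarrierRaceReplay barrier = new BarrierRaceReplay(2);
        barrier.create("node1");                         // step 1: node1 creates its child
        boolean node1Passed = barrier.enoughChildren();  // step 1: 1 child, node1 waits
        barrier.create("node2");                         // step 2: node2 creates, has not read yet
        node1Passed = barrier.enoughChildren();          // step 3: node1 is notified, sees 2 children
        if (node1Passed) {
            barrier.delete("node1");                     // step 3: node1 works, leaves, deletes its node
        }
        return barrier.enoughChildren();                 // step 4: node2 finally reads and sees 1 child
    }

    public static void main(String[] args) {
        System.out.println("node2 passes barrier: " + node2PassesBarrier());
    }
}
```

Under this interleaving the replay reports that node2 does not pass, which matches the stuck state described in step 4.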
>>>> Could this scenario happen? If not, what is preventing this? I haven't
>>>> copied the code piece that enters barrier-does work-leaves barrier.
>>>> But in the link I pasted above, it's the barrierTest(String args[])
>>>> method.
>>>>
>>>> Thank you very much in advance,
>>>>
>>>> semih
>>>>
>>>>
>>>>
>>>>
>>>>   *flavio*
>>>> *junqueira*
>>>>
>>>> research scientist
>>>>
>>>> fpj@yahoo-inc.com
>>>> direct +34 93-183-8828
>>>>
>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>>>> phone (408) 349 3300    fax (408) 349 3301
>>>>
>>>>
>>>>
>>>
>>
>
