zookeeper-dev mailing list archives

From "Kathryn Hogg (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (ZOOKEEPER-645) Bug in WriteLock recipe implementation?
Date Tue, 13 Feb 2018 22:47:00 GMT

    [ https://issues.apache.org/jira/browse/ZOOKEEPER-645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16363174#comment-16363174 ]

Kathryn Hogg edited comment on ZOOKEEPER-645 at 2/13/18 10:46 PM:
------------------------------------------------------------------

I'm seeing WriteLocks that are never granted on 3.4.11 and initially brought it up on the user
email list.  I'm working with ZookeeperNetEx on C# but have verified the code behaves the
same on Java.

I've encountered two issues:
 # When setting the watch on the predecessor, it's possible that the predecessor has been deleted
between the time we acquired the children and the time we set the watch.  If this happens, there is
no watch and we exit the loop.  We should set id to null in this case to ensure the loop
doesn't terminate.
 # We still need the change that ensures the dir name is prepended to the name returned from the
getChildren call in findPrefixInChildren.
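
To illustrate issue 2: getChildren returns bare child names, not full paths, so the matched name has to be joined with the lock directory before it can be used as the node id. The following is only a sketch under stated assumptions. The class PrefixLookupSketch, the three-argument method signature, the "/locks/demo" directory and the "x-<session>-<sequence>" names are all made up for illustration and are not the recipe's actual code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for the recipe's findPrefixInChildren; not the real WriteLock code.
final class PrefixLookupSketch {

    /**
     * Returns the full path of the first child whose name starts with prefix,
     * or null if none matches. The child names are bare (as getChildren
     * returns them), so dir is prepended to the match.
     */
    static String findPrefixInChildren(String prefix, List<String> childNames, String dir) {
        for (String name : childNames) {
            if (name.startsWith(prefix)) {
                return dir + "/" + name; // without dir this would not be a usable path
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Illustrative child names only.
        List<String> children = Arrays.asList("x-111-0000000001", "x-222-0000000002");
        System.out.println(findPrefixInChildren("x-222-", children, "/locks/demo"));
    }
}
```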

#1 change
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
}
{code}
to
{code:java}
Stat stat = zookeeper.exists(lastChildId, new LockWatcher());
if (stat != null) {
    return Boolean.FALSE;
} else {
    LOG.warn("Could not find the" +
            " stats for less than me: " + lastChildName.getName());
    id = null;
}
{code}
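
The effect of the change can be seen in a small in-memory simulation. This is a sketch only: it assumes the surrounding loop has the shape "do { ... } while (id == null)", and RetryLoopSketch, existsAndWatch, tryLock and the resetIdOnMiss flag are hypothetical names, with a sorted set standing in for the lock directory. No ZooKeeper client is involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// In-memory simulation only; every name here is hypothetical.
final class RetryLoopSketch {
    private final TreeSet<String> nodes; // sorted lock nodes
    private String vanishing;            // node deleted just before the exists check
    boolean watchSet = false;            // true once a watch was actually registered

    RetryLoopSketch(List<String> initial, String vanishing) {
        this.nodes = new TreeSet<>(initial);
        this.vanishing = vanishing;
    }

    // Mimics exists(path, watcher): true means the node is present and is now watched.
    private boolean existsAndWatch(String node) {
        if (node.equals(vanishing)) {
            nodes.remove(node); // the predecessor disappears between the two calls
            vanishing = null;
            return false;
        }
        watchSet = nodes.contains(node);
        return watchSet;
    }

    /** Returns true if we own the lock, false if we stopped without owning it. */
    boolean tryLock(String myNode, boolean resetIdOnMiss) {
        String id = null;
        do {
            if (id == null) {
                id = myNode; // stands in for re-finding or creating our node
            }
            String predecessor = nodes.lower(id);
            if (predecessor == null) {
                return true; // lowest node: lock acquired
            }
            if (existsAndWatch(predecessor)) {
                return false; // watch registered, wait for the callback
            }
            if (resetIdOnMiss) {
                id = null; // proposed change: forces another pass through the loop
            }
        } while (id == null);
        return false; // loop ended with neither a lock nor a watch
    }

    public static void main(String[] args) {
        List<String> start = Arrays.asList("n-1", "n-2");
        RetryLoopSketch unfixed = new RetryLoopSketch(start, "n-1");
        System.out.println(unfixed.tryLock("n-2", false) + " watch=" + unfixed.watchSet);
        RetryLoopSketch fixed = new RetryLoopSketch(start, "n-1");
        System.out.println(fixed.tryLock("n-2", true) + " watch=" + fixed.watchSet);
    }
}
```

Without the reset, the simulated caller ends up holding neither the lock nor a watch, which matches the hang described above. With the reset it goes around again and finds itself at the head of the queue.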
 

I've been running with these changes on 3.4.11 with two processes contending for 3 different
locks, and so far I have seen none of the hangs that occurred consistently without them.


> Bug in WriteLock recipe implementation?
> ---------------------------------------
>
>                 Key: ZOOKEEPER-645
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-645
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: recipes
>    Affects Versions: 3.2.2
>         Environment: 3.2.2 java 1.6.0_12
>            Reporter: Jaakko Laine
>            Assignee: Mahadev konar
>            Priority: Minor
>             Fix For: 3.6.0
>
>         Attachments: 645-fix-findPrefixInChildren.patch, ZOOKEEPER-645-compareTo.patch, ZOOKEEPER-645.3.patch.txt
>
>
> Not sure, but there seem to be two issues in the example WriteLock:
> (1) ZNodeName is sorted according to session ID first, and then according to znode sequence
> number. This might cause starvation as lower session IDs always get priority. WriteLock is
> not thread-safe in the first place, so having session ID involved in compare operation does
> not seem to make sense.
> (2) if findPrefixInChildren finds previous ID, it should add dir in front of the ID



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
