curator-dev mailing list archives

From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (CURATOR-79) InterProcessMutex doesn't clean up after interrupt
Date Mon, 11 Aug 2014 13:59:11 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14092777#comment-14092777 ]

ASF GitHub Bot commented on CURATOR-79:
---------------------------------------

Github user madrob commented on a diff in the pull request:

    https://github.com/apache/curator/pull/35#discussion_r16053962
  
    --- Diff: curator-recipes/src/test/java/org/apache/curator/framework/recipes/locks/TestInterProcessMutex.java ---
    @@ -107,4 +113,70 @@ public Void call() throws Exception
                 client.close();
             }
         }
    +    
    +    /**
    +     * See CURATOR-79. If the mutex is interrupted while attempting to acquire a lock it is
    +     * possible for the zNode to be created in ZooKeeper, but for Curator to think that it
    +     * hasn't been. This causes the next call to acquire() to fail because an orphaned
    +     * zNode has been left behind from the previous call.
    +     */
    +    @Test
    +    public void testInterruptedDuringAcquire() throws Exception
    +    {
    +        Timing timing = new Timing();
    +        final CuratorFramework        client = CuratorFrameworkFactory.newClient(server.getConnectString(), new RetryOneTime(1));
    +        client.start();
    +        final InterProcessMutex       lock = new InterProcessMutex(client, LOCK_PATH);
    +        
    +        final AtomicBoolean interruptOnError = new AtomicBoolean(true);
    +        
    +        ((CuratorFrameworkImpl)client).debugUnhandledErrorListener = new UnhandledErrorListener()
    +        {
    +            
    +            @Override
    +            public void unhandledError(String message, Throwable e)
    +            {
    +                if(interruptOnError.compareAndSet(true, false))
    +                {
    +                    Thread.currentThread().interrupt();
    +                }
    +            }
    +        };
    +        
    +        //The lock path needs to exist for the deadlock to occur.
    +        try {
    +            client.create().creatingParentsIfNeeded().forPath(LOCK_PATH);
    +        } catch(NodeExistsException e) {            
    +        }
    +        
    +        try
    +        {
    +            //Interrupt the current thread. This will cause ensurePath() to fail.
    +            //We need to reinterrupt in the debugUnhandledErrorListener above.
    +            Thread.currentThread().interrupt();
    +            lock.acquire();
    +            Assert.fail();
    +        }
    +        catch(InterruptedException e)
    +        {
    +            //Expected lock to have failed.
    +            Assert.assertTrue(!lock.isOwnedByCurrentThread());
    --- End diff --
    
    nit: use Assert.assertFalse(lock.isOwnedByCurrentThread()) here instead of assertTrue(!...)


> InterProcessMutex doesn't clean up after interrupt
> --------------------------------------------------
>
>                 Key: CURATOR-79
>                 URL: https://issues.apache.org/jira/browse/CURATOR-79
>             Project: Apache Curator
>          Issue Type: Bug
>    Affects Versions: 2.0.0-incubating, 2.1.0-incubating, 2.2.0-incubating, 2.3.0
>            Reporter: Orcun Simsek
>            Assignee: Jordan Zimmerman
>
> InterProcessMutex can deadlock if a thread is interrupted during acquire(). Specifically,
> CreateBuilderImpl.pathInForeground submits a create request to ZooKeeper, and an InterruptedException
> is thrown after the node is created in ZK but before ZK.create returns. ZK.create propagates
> a non-KeeperException, so Curator assumes the create has failed, but does not retry, and the
> node is now orphaned. At some point in the future, the node becomes the next in the acquisition
> sequence, but is not reclaimed as the ZK session has not expired.
> <stack trace attached in comments below>
> Curator should catch the InterruptedException and other non-KeeperExceptions, and delete
> the created node before propagating these exceptions.
> (as originally discussed on https://groups.google.com/forum/#!topic/curator-users/9ii5of8SbdQ)
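
The cleanup proposed in the issue description can be sketched roughly as follows. This is only an illustration under stated assumptions: FakeStore is a hypothetical in-memory stand-in for ZooKeeper, createWithCleanup and deleteByPrefix are made-up names, and embedding a random id in the node name is just one way to find the node again after a failed create. It is not necessarily how the fix in the pull request works.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.UUID;

public class OrphanCleanupSketch {

    /** Hypothetical in-memory stand-in for ZooKeeper; none of this is the real API. */
    static class FakeStore {
        final Set<String> nodes = new TreeSet<>();
        private int sequence = 0;

        // The node is created first; a pending interrupt is only noticed afterwards.
        // This mimics "created on the server, but the client call throws".
        String createSequential(String prefix) throws InterruptedException {
            String path = prefix + String.format("%010d", sequence++);
            nodes.add(path);
            if (Thread.interrupted()) {
                throw new InterruptedException("interrupted after node creation");
            }
            return path;
        }

        void deleteByPrefix(String prefix) {
            nodes.removeIf(path -> path.startsWith(prefix));
        }
    }

    // Assumption: a random id in the node name lets the caller locate the node
    // even though the create call never returned its full path.
    static String createWithCleanup(FakeStore store, String parent) throws InterruptedException {
        String prefix = parent + "/" + UUID.randomUUID() + "-lock-";
        try {
            return store.createSequential(prefix);
        } catch (InterruptedException | RuntimeException e) {
            // Delete whatever may have been created, then propagate the original exception.
            store.deleteByPrefix(prefix);
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        Thread.currentThread().interrupt();
        try {
            createWithCleanup(store, "/locks");
        } catch (InterruptedException expected) {
            // From the caller's point of view the create failed.
        }
        System.out.println("orphaned nodes: " + store.nodes.size()); // prints "orphaned nodes: 0"
    }
}
```

Without the catch block in createWithCleanup, the same run would leave one node behind in the store, which is the orphan described above.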



--
This message was sent by Atlassian JIRA
(v6.2#6252)
