curator-dev mailing list archives

From "Orcun Simsek (JIRA)" <j...@apache.org>
Subject [jira] [Comment Edited] (CURATOR-79) InterProcessMutex doesn't clean up after interrupt
Date Tue, 05 Aug 2014 19:02:11 GMT

    [ https://issues.apache.org/jira/browse/CURATOR-79?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13833414#comment-13833414 ]

Orcun Simsek edited comment on CURATOR-79 at 8/5/14 7:01 PM:
-------------------------------------------------------------

Also adding a test that fails. (slight modification of the test attached in the original thread)
{code:title=Test.java|borderStyle=solid}
    private static final int SESSION_TIMEOUT_MS = 180 * 1000;
    private static final int CONNECTION_TIMEOUT_MS = 16 * 1000;

    private static final int BASE_SLEEP_TIME_MS = 1000;
    private static final int MAX_SLEEP_TIME_MS = 16 * 1000;
    private static final int MAX_RETRIES = 10;

    @Test
    public void testInterruptDeadlock() throws Exception {
        CuratorFramework client = createClientWithNamespace("testCluster", "127.0.0.1:2181");
        client.start();

        final InterProcessMutex lock = new InterProcessMutex(client, "/testInterruption");
        try {

            try {
                lock.acquire();
                Thread.currentThread().interrupt();
                lock.release();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                if (lock.isAcquiredInThisProcess()) {
                    lock.release();
                }
            }
            assertTrue(lock.acquire(10, TimeUnit.MILLISECONDS));
        } finally {
            if (lock.isAcquiredInThisProcess()) {
                lock.release();
            }
            // Close the client so the test does not leak the ZooKeeper session.
            client.close();
        }
    }

    private static CuratorFramework createClientWithNamespace(String clusterName, String connectString) {
        RetryPolicy retryPolicy = new BoundedExponentialBackoffRetry(BASE_SLEEP_TIME_MS, MAX_SLEEP_TIME_MS, MAX_RETRIES);
        return CuratorFrameworkFactory.builder()
            .sessionTimeoutMs(SESSION_TIMEOUT_MS)
            .connectionTimeoutMs(CONNECTION_TIMEOUT_MS)
            .namespace(clusterName)
            .retryPolicy(retryPolicy)
            .connectString(connectString)
            .build();
    }
{code}



was (Author: ortschun):
Also adding a test that fails. (slight modification of the test attached in the original thread)
{code:title=Test.java|borderStyle=solid}
    @Test
    public void testInterruptDeadlock() throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.builder()
            .connectString("127.0.0.1:2181")
            .retryPolicy(new RetryNTimes(10, 1000))
            .build();
        client.start();

        Thread.currentThread().interrupt();
        final InterProcessMutex lock = new InterProcessMutex(client, "/testInterruption4");
        try {
            try {
                lock.acquire();
                lock.release();
            } catch (InterruptedException e) {
                if (lock.isAcquiredInThisProcess()) {
                    lock.release();
                }
            }
            assertTrue(lock.acquire(10, TimeUnit.MILLISECONDS));
        } finally {
            if (lock.isAcquiredInThisProcess()) {
                System.out.println("Lock released successfully.");
                lock.release();
            }
        }
    }
{code}


> InterProcessMutex doesn't clean up after interrupt
> --------------------------------------------------
>
>                 Key: CURATOR-79
>                 URL: https://issues.apache.org/jira/browse/CURATOR-79
>             Project: Apache Curator
>          Issue Type: Bug
>    Affects Versions: 2.0.0-incubating, 2.1.0-incubating, 2.2.0-incubating, 2.3.0
>            Reporter: Orcun Simsek
>            Assignee: Jordan Zimmerman
>
> InterProcessMutex can deadlock if a thread is interrupted during acquire(). Specifically,
> CreateBuilderImpl.pathInForeground submits a create request to ZooKeeper, and an InterruptedException
> is thrown after the node is created in ZK but before ZK.create returns. ZK.create propagates
> a non-KeeperException, so Curator assumes the create has failed, but does not retry, and the
> node is now orphaned. At some point in the future, the node becomes the next in the acquisition
> sequence, but is not reclaimed as the ZK session has not expired.
> <stack trace attached in comments below>
> Curator should catch the InterruptedException and other non-KeeperExceptions, and delete
> the created node before propagating these exceptions.
> (as originally discussed on https://groups.google.com/forum/#!topic/curator-users/9ii5of8SbdQ)
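
The fix proposed in the description (delete the created node before propagating a non-KeeperException) can be illustrated with a small, stdlib-only simulation. This is only a sketch under stated assumptions, not Curator's or ZooKeeper's code: `FakeStore`, `createSequential`, `deleteWithPrefix`, and `createOrCleanUp` are hypothetical names, and the per-attempt UUID is an assumed way of finding the node again when the caller never learned its sequence suffix.

```java
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical in-memory stand-in for the ZooKeeper server; not a real API.
class FakeStore {
    final Set<String> nodes = new ConcurrentSkipListSet<>();
    private int seq = 0;

    // Simulates the race from the report: the node is created on the "server",
    // but the caller gets an InterruptedException instead of the node path.
    synchronized String createSequential(String prefix) throws InterruptedException {
        String path = prefix + String.format("%010d", seq++);
        nodes.add(path);
        if (Thread.interrupted()) { // clears the flag, as a blocking call would
            throw new InterruptedException("interrupted after server-side create");
        }
        return path;
    }

    synchronized void deleteWithPrefix(String prefix) {
        nodes.removeIf(n -> n.startsWith(prefix));
    }
}

class OrphanCleanupSketch {
    // Creates a lock node; on interruption or any unchecked failure, deletes
    // whatever this attempt may have created before rethrowing.
    static String createOrCleanUp(FakeStore store, String lockPath) throws InterruptedException {
        // Assumed per-attempt marker so the node can be found without its suffix.
        String prefix = lockPath + "/" + UUID.randomUUID() + "-lock-";
        try {
            return store.createSequential(prefix);
        } catch (InterruptedException | RuntimeException e) {
            store.deleteWithPrefix(prefix);
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt(); // restore the flag for callers
            }
            throw e;
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        Thread.currentThread().interrupt(); // mimic the interrupt in the test
        try {
            createOrCleanUp(store, "/testInterruption");
        } catch (InterruptedException e) {
            Thread.interrupted(); // clear the flag so the demo can exit cleanly
        }
        System.out.println("orphaned nodes: " + store.nodes.size());
    }
}
```

Without the `catch` block, the simulated node would stay in `nodes` after the exception, which corresponds to the orphaned node described above. Against a real ZooKeeper ensemble the cleanup delete is itself a blocking call, so it would also need to cope with the thread's interrupt status; the sketch sidesteps that by deleting before restoring the flag.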



--
This message was sent by Atlassian JIRA
(v6.2#6252)
