Date: Mon, 7 Oct 2013 16:02:44 +0000 (UTC)
From: "Jordan Zimmerman (JIRA)"
To: dev@curator.incubator.apache.org
Subject: [jira] [Updated] (CURATOR-62) Leader Election Deadlock

[ https://issues.apache.org/jira/browse/CURATOR-62?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jordan Zimmerman updated CURATOR-62:
------------------------------------
    Component/s: Recipes

> Leader Election Deadlock
> ------------------------
>
>                 Key: CURATOR-62
>                 URL: https://issues.apache.org/jira/browse/CURATOR-62
>             Project: Apache Curator
>          Issue Type: Bug
>          Components: Recipes
>    Affects Versions: 2.2.0-incubating
>            Reporter: Doug Jones
>            Assignee: Jordan Zimmerman
>
> I've noticed that it is possible for a leader election to deadlock if a thread is interrupted while it is trying to acquire the mutex for the election.
> I've created a forced example of this here: https://github.com/dfjones/curator/commit/544220b1e6b51c2718a7d3511a74962ff1c5ff48
> You can see the deadlock by using my modified code and running the LeaderSelectorExample. Some leaders may execute, but on my system I eventually see a deadlock. Note that I only see the deadlock when running against a remote ZK server rather than the embedded test server. I'm using ZooKeeper 3.4.5 on Mac OS X 10.8.4.
> From what I can tell by inspecting the ZK state and watching in the debugger, the thread that is interrupted is able to successfully create the lock object in ZK. However, due to the interrupt, an exception is generated and LockInternals#internalLockLoop never runs. Later, in LeaderSelector#doWork, when mutex.release() is called, this fails at the check for lockData.
> Once this occurs, the lock object left behind in ZK is the oldest one, so no later participant can acquire the mutex, and the election deadlocks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)
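
The sequence described in the report can be sketched with a small in-memory model. This is a hypothetical toy, not Curator's code: the class OrphanedLockNodeDemo, its methods, and the TreeMap standing in for the ZooKeeper lock directory are all assumptions made for illustration, and the model ignores session expiry (which would eventually remove an ephemeral node). It only shows why a node that is created on the server but never recorded locally blocks every later contender.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy in-memory model of a sequential-node lock. Hypothetical; not Curator's real code. */
class OrphanedLockNodeDemo {
    // Stand-in for the ZooKeeper lock directory: sequence number -> owner name.
    private final TreeMap<Integer, String> nodes = new TreeMap<>();
    // Stand-in for the client-side bookkeeping (the report's "lockData"): owner -> sequence number.
    private final Map<String, Integer> lockData = new HashMap<>();
    private int nextSequence = 0;

    /** Creates a node on the "server", then records it locally unless interrupted first. */
    void acquire(String owner, boolean interruptedAfterCreate) throws InterruptedException {
        int seq = nextSequence++;
        nodes.put(seq, owner); // the node now exists on the "server"
        if (interruptedAfterCreate) {
            // Models the interrupt: the wait loop never runs and nothing is recorded locally.
            throw new InterruptedException("interrupted after node creation");
        }
        lockData.put(owner, seq);
    }

    /** True when the owner's recorded node is the oldest node in the directory. */
    boolean holdsLock(String owner) {
        Integer seq = lockData.get(owner);
        return seq != null && !nodes.isEmpty() && nodes.firstKey().equals(seq);
    }

    /** Deletes the owner's node, but only if local bookkeeping knows about it. */
    void release(String owner) {
        Integer seq = lockData.remove(owner);
        if (seq == null) {
            throw new IllegalMonitorStateException("no lock data for " + owner);
        }
        nodes.remove(seq);
    }

    /** Runs the scenario and reports whether the second contender, B, is stuck. */
    static boolean secondContenderStuck(boolean interruptFirst) {
        OrphanedLockNodeDemo lock = new OrphanedLockNodeDemo();
        try {
            lock.acquire("A", interruptFirst);
        } catch (InterruptedException e) {
            // Fall through to release, as a finally block would.
        }
        try {
            lock.release("A");
        } catch (IllegalMonitorStateException e) {
            // Release fails, so A's node (if any) stays behind.
        }
        try {
            lock.acquire("B", false);
        } catch (InterruptedException e) {
            throw new AssertionError(e); // cannot happen: B is never interrupted here
        }
        return !lock.holdsLock("B");
    }

    public static void main(String[] args) {
        System.out.println("normal run, B stuck: " + secondContenderStuck(false));
        System.out.println("interrupted run, B stuck: " + secondContenderStuck(true));
    }
}
```

In the interrupted run, A's node keeps the lowest sequence number and nothing ever deletes it, so B's node is never the oldest. In this model, a fix would have to either record the node before the point where the interrupt can surface, or delete the node when the acquire fails.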