From: Mike Drob
Date: Fri, 8 Aug 2014 08:05:20 -0500
Subject: Re: CURATOR-79
To: dev@curator.apache.org

Explicitly coding for a possible InterruptedException sounds good to me,
but we already know that I prefer (more) diverse Exception types.

I'm not sure I understand what the proposed alternative is?

Mike

On Fri, Aug 8, 2014 at 1:02 AM, Cameron McKenzie wrote:

> Guys,
> I've been looking into a fix for CURATOR-79
> (https://issues.apache.org/jira/browse/CURATOR-79) and have found it to be
> slightly more complicated than initially expected.
>
> The locking recipes use protected zNodes (i.e. the zNode name contains
> a random UUID that is tied to a particular builder instance) for locks,
> which is sensible, but there seems to be an issue with this.
>
> The protected logic basically looks at the cause of failure on a create,
> and if it's connection loss, it does an ensured delete on the path it
> was trying to create, so that the node is removed if it did get created.
>
> For CURATOR-79, an InterruptedException is causing this call to fail while
> waiting for the response from ZK. This means that the protected logic does
> not fire and we end up with an orphaned node.
>
> It's possible, with some ugliness, to handle this in the InterProcessMutex,
> but I think that maybe it's better fixed in the protected logic. Maybe the
> protected logic could be modified so that it will occur on ConnectionLoss
> or on any non-KeeperException (e.g. InterruptedException). This would cause
> the zNode to be removed if it was created, and would fix this deadlock
> issue.
>
> I would welcome anyone's opinion on the way forward.
> cheers
> Cam
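
For readers following the thread, the difference between the behaviour described in the quoted message and the proposed change can be sketched roughly as below. This is only an illustration, not Curator's actual code: `FakeZk`, `FakeKeeperException`, `cleanupCurrent`, `cleanupProposed` and the node path format are all hypothetical stand-ins, and the fake `create` simply assumes the node was created on the server before the client saw the failure.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class ProtectedCreateSketch {

    /** Hypothetical stand-in for KeeperException; only the connection-loss flag matters here. */
    static class FakeKeeperException extends Exception {
        final boolean connectionLoss;

        FakeKeeperException(boolean connectionLoss) {
            super(connectionLoss ? "ConnectionLoss" : "other KeeperException");
            this.connectionLoss = connectionLoss;
        }
    }

    /** Hypothetical in-memory stand-in for ZooKeeper. */
    static class FakeZk {
        final Set<String> nodes = new HashSet<>();

        /** Creates the node, then fails as if the reply never reached the client. */
        void create(String path, Exception failAfterCreate) throws Exception {
            nodes.add(path);
            if (failAfterCreate != null) {
                throw failAfterCreate;
            }
        }

        void delete(String path) {
            nodes.remove(path);
        }
    }

    /** Behaviour as described in the thread: clean up only on connection loss. */
    static boolean cleanupCurrent(Exception e) {
        return e instanceof FakeKeeperException && ((FakeKeeperException) e).connectionLoss;
    }

    /** Proposed behaviour: also clean up on any non-KeeperException. */
    static boolean cleanupProposed(Exception e) {
        return cleanupCurrent(e) || !(e instanceof FakeKeeperException);
    }

    /** Runs one failed protected create and returns how many nodes are left behind. */
    static int orphansAfterFailedCreate(Exception failure, boolean proposed) {
        FakeZk zk = new FakeZk();
        // Illustrative path only: a random UUID embedded in the node name.
        String path = "/locks/" + UUID.randomUUID() + "-lock-";
        try {
            zk.create(path, failure);
        } catch (Exception e) {
            boolean cleanup = proposed ? cleanupProposed(e) : cleanupCurrent(e);
            if (cleanup) {
                zk.delete(path);
            }
        }
        return zk.nodes.size();
    }

    public static void main(String[] args) {
        // Connection loss is cleaned up under both policies.
        System.out.println(orphansAfterFailedCreate(new FakeKeeperException(true), false));
        // An interrupt leaves an orphan under the current policy...
        System.out.println(orphansAfterFailedCreate(new InterruptedException(), false));
        // ...but not under the proposed one.
        System.out.println(orphansAfterFailedCreate(new InterruptedException(), true));
    }
}
```

In this sketch the only change is the predicate that decides whether the cleanup delete runs, which is the sense in which the fix would live in the protected logic rather than in the lock recipe. A real implementation would also need to decide how to propagate the interrupt to the caller, which the sketch leaves out.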