Date: Fri, 1 Aug 2014 15:50:54 +0200
Subject: Stack trace when using a PathChildrenCache or NodeCache for a short time
From: Gabriel Reid
To: user@curator.apache.org

Hi,

I'm running into an unexpected issue when starting a PathChildrenCache,
reading from it, and then closing it right away. This is a pattern that
we're using in a few places in our code base in order to share code between
long-running server processes and CLI tools.

Basically, the code that is causing the issue looks like this:

    curatorFramework = CuratorFrameworkFactory.newClient(...);
    curatorFramework.start();

    PathChildrenCache pathChildrenCache = new PathChildrenCache(curatorFramework, "/myrootnode", true);
    pathChildrenCache.start(PathChildrenCache.StartMode.BUILD_INITIAL_CACHE);
    pathChildrenCache.getCurrentData();
    pathChildrenCache.close();

    curatorFramework.close();

Running the above code works, but it typically also ends with the following
stack trace being logged:

ERROR org.apache.curator.framework.imps.CuratorFrameworkImpl Background exception was not retry-able or retry gave up [main-EventThread]
java.lang.IllegalStateException: instance must be started before calling this method
    at com.google.common.base.Preconditions.checkState(Preconditions.java:176)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.getData(CuratorFrameworkImpl.java:373)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.getDataAndStat(PathChildrenCache.java:547)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.processChildren(PathChildrenCache.java:670)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache.access$100(PathChildrenCache.java:68)
    at org.apache.curator.framework.recipes.cache.PathChildrenCache$4.processResult(PathChildrenCache.java:492)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.sendToBackgroundCallback(CuratorFrameworkImpl.java:734)
    at org.apache.curator.framework.imps.CuratorFrameworkImpl.processBackgroundOperation(CuratorFrameworkImpl.java:515)
    at org.apache.curator.framework.imps.GetChildrenBuilderImpl$2.processResult(GetChildrenBuilderImpl.java:166)
    at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:590)
    at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)

Tracing through the code, this appears to be due to a race condition: an
async read response comes back through ZK and is only processed after the
underlying ZK client has been closed. The same kind of issue occurs with
both PathChildrenCache and NodeCache.

I realize that this is a somewhat non-standard use of a cache. However, I'm
wondering if there is a different way of working with PathChildrenCache and
NodeCache that prevents this situation (other than using a Thread.sleep
before calling the close methods). Otherwise, I'd be happy to attempt to put
together a patch for this if someone can point me in the right direction.

Thanks,

Gabriel
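
To make the suspected race easier to reason about, here is a minimal JDK-only sketch of it. It does not use Curator or ZooKeeper: FakeClient, FakeCache, onChildrenResponse and the guard flag are hypothetical stand-ins invented for illustration, and the latch simply forces the "response arrives after close" ordering that happens by chance in the real code. The guarded variant shows one possible direction for a fix (dropping late responses once the cache is closed); whether that matches how the Curator caches should be patched is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

public class CloseRaceSketch {

    /** Hypothetical stand-in for a client that rejects calls once it is closed. */
    static final class FakeClient {
        private final AtomicBoolean started = new AtomicBoolean(true);

        String getData(String path) {
            if (!started.get()) {
                throw new IllegalStateException("instance must be started before calling this method");
            }
            return "data:" + path;
        }

        void close() {
            started.set(false);
        }
    }

    /** Hypothetical stand-in for a cache whose async callback issues a follow-up read. */
    static final class FakeCache {
        private final FakeClient client;
        private final boolean guard;
        private final AtomicBoolean closed = new AtomicBoolean(false);

        FakeCache(FakeClient client, boolean guard) {
            this.client = client;
            this.guard = guard;
        }

        /** Invoked on the event thread when the async "get children" response arrives. */
        void onChildrenResponse(String childPath) {
            if (guard && closed.get()) {
                return; // cache already closed: drop the late response
            }
            client.getData(childPath); // follow-up read; fails if the client is closed
        }

        void close() {
            closed.set(true);
        }
    }

    /** Runs one start/read/close cycle and returns whatever the event thread threw, or null. */
    static Throwable run(boolean guard) throws InterruptedException {
        FakeClient client = new FakeClient();
        FakeCache cache = new FakeCache(client, guard);
        ExecutorService eventThread = Executors.newSingleThreadExecutor();
        CountDownLatch afterClose = new CountDownLatch(1);
        AtomicReference<Throwable> failure = new AtomicReference<>();

        // The async response is already in flight when the caller starts closing things.
        eventThread.execute(() -> {
            try {
                afterClose.await(); // deliver the response only after close() has run
                cache.onChildrenResponse("/myrootnode/child");
            } catch (Throwable t) {
                failure.set(t);
            }
        });

        cache.close();
        client.close();
        afterClose.countDown();

        eventThread.shutdown();
        eventThread.awaitTermination(5, TimeUnit.SECONDS);
        return failure.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("unguarded: " + run(false));
        System.out.println("guarded:   " + run(true));
    }
}
```

In this sketch the unguarded run reports the IllegalStateException from the event thread, while the guarded run reports nothing. Note that a plain closed-flag check still leaves a small window between the check and the follow-up read, so a real fix would probably also need to tolerate that exception for a closed cache.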