From: Chris Barlock
To: user@zookeeper.apache.org
Date: Fri, 16 Jan 2015 20:13:58 -0500
Subject: Re: ConnectionLossException

I implemented a retry of every ZK method call I make and the exceptions stopped. I recognize that we are currently using an older ZK version (3.3.4) -- but is the current version better in this respect?

I got the email today about the new release of Curator.
I wonder why this is not part of the ZK offering if it makes ZK that much easier to use. (I don't know that this is a true statement, but the Curator doc claims so.)

Chris

IBM Tivoli Systems
Research Triangle Park, NC
(919) 224-2240
Internet: barlock@us.ibm.com


From: Chris Barlock/Raleigh/IBM@IBMUS
To: user@zookeeper.apache.org
Date: 01/15/2015 09:37 PM
Subject: Re: ConnectionLossException

My ensemble is a single ZK node running on the same computer as the rest of my application. I think it is in a "good state" because my configuration data does get loaded into ZK. Does ZK throw a ConnectionLossException if the session timed out and I tried to use it after this happened? If so, it would be related to the timeout. If not, what causes ConnectionLossException?

Chris

IBM Tivoli Systems
Research Triangle Park, NC
(919) 224-2240
Internet: barlock@us.ibm.com


From: "bit1129@163.com"
To: "user@zookeeper.apache.org"
Date: 01/15/2015 09:18 PM
Subject: Re: ConnectionLossException

Connection loss is not related to the session timeout. If it occurs frequently, it indicates that the ZooKeeper ensemble is not in a good state.

bit1129@163.com


From: Chris Barlock
Date: 2015-01-16 10:04
To: user
Subject: ConnectionLossException

We are currently using ZK 3.3.4, which is included in the version of Kafka we are using.
I'm seeing a number of exceptions like:

    org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /com
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:90)
        at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:815)
        at org.apache.zookeeper.ZooKeeper.exists(ZooKeeper.java:843)
        at com.ibm.tivoli.ccm.config.rest.ConfigClient.setValueAtNode(ConfigClient.java:630)

My method setValueAtNode includes a call to this method before I make any zk (ZooKeeper) calls:

    private void connectZooKeeper() {
        final String methodName = "connectZooKeeper";
        trace.entry(CLASS_NAME, methodName);
        if (zk == null || zk.getState() != States.CONNECTED) {
            if (zk != null) {
                close();
            }
            try {
                zk = new ZooKeeper(connectString, sessionTimeout, this);
                int connectAttempts = 0;
                while (zk.getState() != States.CONNECTED
                        && connectAttempts < MAX_ZK_CONNECT_ATTEMPTS) {
                    try {
                        Thread.sleep(ZK_CONNECT_WAIT);
                    } catch (InterruptedException e) {
                        // Ignore
                    }
                    connectAttempts++;
                }
            } catch (IOException e) {
                trace.exception(CLASS_NAME, methodName, e);
            }
        }
        trace.exit(CLASS_NAME, methodName);
    }

I'm totally guessing that the connection is timing out between the time this method is called and when I make the following zk method calls. Is there a best practise for ensuring one is connected to ZooKeeper? My session timeout is 3000 ms.

Chris
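
For readers of this thread: the retry approach mentioned in the follow-up at the top could look roughly like the sketch below. It is only an illustration, not the code that was actually written. RetrySketch, withRetry, and TransientConnectionException are hypothetical names. The exception class is a stand-in for KeeperException.ConnectionLossException so that the example compiles without the ZooKeeper jar. In real code the catch clause would name the ZooKeeper exception, and the operation would be something like zk.exists(path, false).

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for KeeperException.ConnectionLossException, used
// only so this sketch compiles without ZooKeeper on the classpath.
class TransientConnectionException extends Exception {
    TransientConnectionException(String message) {
        super(message);
    }
}

final class RetrySketch {
    private RetrySketch() {
    }

    /**
     * Runs op and retries it only when it fails with
     * TransientConnectionException, for at most maxAttempts attempts in
     * total, sleeping delayMillis between attempts. Any other exception is
     * propagated immediately.
     */
    static <T> T withRetry(Callable<T> op, int maxAttempts, long delayMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        TransientConnectionException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (TransientConnectionException e) {
                last = e;
                if (attempt < maxAttempts) {
                    try {
                        Thread.sleep(delayMillis);
                    } catch (InterruptedException ie) {
                        // Restore the interrupt flag instead of swallowing it.
                        Thread.currentThread().interrupt();
                        throw ie;
                    }
                }
            }
        }
        // Every attempt failed with a connection loss; report the last one.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated operation: fails twice, then succeeds.
        String result = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new TransientConnectionException("simulated connection loss");
            }
            return "ok";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

One caveat on the design: a connection loss does not tell the client whether the request reached the server, so blind retries are only safe for idempotent calls such as exists or getData. A retried create or a conditional setData may need to check whether the first attempt actually took effect.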