Date: Wed, 23 Jul 2014 14:54:11 -0400
Subject: Re: ZK session expiration and recovery
From: Lahiru Gunathilake
To: "user@zookeeper.apache.org"
Cc: Jordan Zimmerman

Hi Ahmed,

Yes, you can. When you create the ZooKeeper object, register a watcher,
listen for the Expired event, and reconnect when it arrives. The following
code might be useful. I haven't tried this exact case, but I have listened
for the SyncConnected event in the same way and it worked fine.

    zk = new ZooKeeper(zkhostPort, 6000, this);

    public synchronized void process(WatchedEvent watchedEvent) {
        synchronized (mutex) {
            Event.KeeperState state = watchedEvent.getState();
            switch (state) {
                case Expired:
                    // reconnect the zk client here
                    break;
            }
        }
    }

If the session expires, this watcher is triggered and can reconnect. When
you reconnect, remember to handle the SyncConnected event too.

Lahiru

On Wed, Jul 23, 2014 at 2:44 PM, Ahmed H. wrote:

> Is there another approach? Not sure adding more components is an option for
> me right now, but it could be in the future. I looked into it briefly, and
> it seems like it might work.
>
> Is there a way for a Zookeeper client to get notified when the connection
> drops or when the session expires?
>
>
> On Fri, Jul 18, 2014 at 11:07 AM, Jordan Zimmerman <
> jordan@jordanzimmerman.com> wrote:
>
> > You might consider using Curator (http://curator.apache.org). One of its
> > main features is ZooKeeper connection management.
> >
> > -JZ
> >
> >
> > On July 18, 2014 at 9:59:56 AM, Ahmed H. (ahmed.hammad@gmail.com) wrote:
> >
> > Hello,
> >
> >
> > I am having some issues where the Zookeeper connection loss occurs. This
> > affects various things in my application, namely watchers, which result in
> > errors like the one below:
> >
> > 23:13:01,593 ERROR [org.apache.zookeeper.ClientCnxn]
> > (pool-5-thread-1-EventThread) Error while calling watcher :
> > org.apache.zookeeper.KeeperException$SessionExpiredException:
> > KeeperErrorCode = Session expired for /controller/resync
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:118) [zookeeper-3.3.4.jar:3.3.3-1203054]
> >   at org.apache.zookeeper.KeeperException.create(KeeperException.java:42) [zookeeper-3.3.4.jar:3.3.3-1203054]
> >   at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1249) [zookeeper-3.3.4.jar:3.3.3-1203054]
> >   at sun.reflect.GeneratedMethodAccessor56.invoke(Unknown Source) [:1.7.0_51]
> >   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) [rt.jar:1.7.0_51]
> >   at java.lang.reflect.Method.invoke(Method.java:606) [rt.jar:1.7.0_51]
> >   at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93) [clojure-1.5.1.jar:]
> >   at clojure.lang.Reflector.invokeInstanceMethod(Reflector.java:28) [clojure-1.5.1.jar:]
> >   at zookeeper$children.doInvoke(zookeeper.clj:230)
> >   at clojure.lang.RestFn.invoke(RestFn.java:464) [clojure-1.5.1.jar:]
> >   at resync$resync_group_watcher.invoke(resync.clj:26)
> >   at zookeeper.internal$make_watcher$reify__10446.process(internal.clj:56)
> >   at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:531) [zookeeper-3.3.4.jar:3.3.3-1203054]
> >   at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:507) [zookeeper-3.3.4.jar:3.3.3-1203054]
> >
> >
> > I guess I have a few questions that might help me mitigate this issue.
> > I could try to fix whatever is causing the session expiration. This issue
> > occurs when we have a lot of activity on the machine, which leads me to
> > believe that it might be caused by GC activity (based on the ZK guide).
> > This might work, but it seems to me like we would just be masking the
> > issue and eventually, it might happen again.
> >
> >
> > The other issue is that our client never recovers. It's completely dead.
> > Is there a way to make it auto reconnect after it dies? Does Zookeeper
> > support such functionality?
> >
> >
> > Are there any other things I should be aware of or any recommendations you
> > have for setting up a Zookeeper environment? For the record, we are
> > running version 3.4.5 in a single node setup.
> >
> > Thanks
> >
>

--
System Analyst Programmer
PTI Lab
Indiana University
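
For reference, the recovery policy discussed in this thread (open a new
client on Expired, set watches again on SyncConnected, wait on Disconnected)
can be sketched without the ZooKeeper jar on the classpath. This is only a
model under stated assumptions: SessionState, Handle and SessionRecovery are
hypothetical names, the enum just mirrors the state names used in the thread,
and the Supplier stands in for a call such as
new ZooKeeper(zkhostPort, 6000, watcher). Closing the old handle on Expired
is a defensive choice made here, not something the thread prescribes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical stand-in for the connection states named in the thread. */
enum SessionState { Disconnected, SyncConnected, Expired }

/** Hypothetical stand-in for a client handle; only close() is modelled. */
interface Handle {
    void close();
}

/** Sketch of the recovery policy, independent of any real client library. */
class SessionRecovery {
    // Assumption: stands in for "new ZooKeeper(zkhostPort, 6000, watcher)".
    private final Supplier<Handle> connect;
    // Assumption: stands in for whatever calls set the application's watches.
    private final Runnable registerWatches;
    private final List<String> log = new ArrayList<>();
    private Handle handle;

    SessionRecovery(Supplier<Handle> connect, Runnable registerWatches) {
        this.connect = connect;
        this.registerWatches = registerWatches;
        this.handle = connect.get();
    }

    /** Call once per connection-state event, as a watcher's process() would. */
    synchronized void onState(SessionState state) {
        switch (state) {
            case Expired:
                // The old session cannot be resumed: drop the handle, open a new one.
                handle.close();
                handle = connect.get();
                log.add("reconnected");
                break;
            case SyncConnected:
                // Watches tied to an expired session are gone, so set them again.
                registerWatches.run();
                log.add("watches-registered");
                break;
            case Disconnected:
                // The session may still be valid on the server; just wait.
                log.add("waiting");
                break;
        }
    }

    /** Returns a copy of the actions taken so far, oldest first. */
    synchronized List<String> log() {
        return new ArrayList<>(log);
    }
}
```

In a real client, the watcher's process(WatchedEvent) method would translate
watchedEvent.getState() into a call to onState, and the Runnable would repeat
calls such as the getChildren watch on /controller/resync from the stack
trace above.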