Date: Mon, 4 Apr 2011 10:48:08 -0700
Subject: Hung up zookeeper client close?
From: Stack
To: zookeeper-user

Have you lot ever seen a hang on client close? A user over in hbaselandia is reportedly seeing such a thing (see below -- I'm trying to get more info). I thought I'd come ask the experts for their take. Should we be running the close of the zk connection inside a timer that interrupts it if it gets stuck?

Good stuff,
St.Ack

If I dump the JVM threads in a console, it looks like the region server wants to close all zookeeper connections and blocks until this is done.
HRegionServer.java:672 --> HConnectionManager.deleteConnection(conf, true);

"regionserver60020-EventThread" daemon prio=10 tid=0x000000005d0dc000 nid=0x7e1d waiting on condition [0x0000000042941000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000781c9ce00> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

"regionserver60020" prio=10 tid=0x00002aaab023e000 nid=0x7e1b in Object.wait() [0x000000004273f000]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x00000007be9ad330> (a org.apache.zookeeper.ClientCnxn$Packet)
        at java.lang.Object.wait(Object.java:485)
        at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1317)
        - locked <0x00000007be9ad330> (a org.apache.zookeeper.ClientCnxn$Packet)
        at org.apache.zookeeper.ClientCnxn.close(ClientCnxn.java:1295)
        at org.apache.zookeeper.ZooKeeper.close(ZooKeeper.java:531)
        - locked <0x0000000781fae170> (a org.apache.zookeeper.ZooKeeper)
        at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.close(ZooKeeperWatcher.java:399)
        at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.close(HConnectionManager.java:1050)
        at org.apache.hadoop.hbase.client.HConnectionManager.deleteConnection(HConnectionManager.java:175)
        - locked <0x00000007801b6800> (a org.apache.hadoop.hbase.client.HConnectionManager$1)
        at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:672)
        at java.lang.Thread.run(Thread.java:662)

"main-EventThread" daemon prio=10 tid=0x00002aaab0540000 nid=0x7e10 waiting on condition [0x0000000042139000]
   java.lang.Thread.State: WAITING (parking)
        at sun.misc.Unsafe.park(Native Method)
        - parking to wait for <0x0000000781faf3e0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:158)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1987)
        at java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:399)
        at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:502)

Unfortunately, some connections are never closed so the server does not shut down.
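
For what it's worth, here is a rough sketch of what "close inside a timer" might look like. It is only an illustration, not anything HBase or ZooKeeper actually does: TimedClose, BlockingCloseable and closeWithTimeout are made-up names, and the interface stands in for any client whose close() can block. The blocked thread in the dump is parked in Object.wait(), which responds to interrupts, so an interrupt should in principle wake it; whether the session is left in a sane state afterwards is a separate question this sketch does not answer.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: not part of ZooKeeper or HBase.
final class TimedClose {

    /** Hypothetical stand-in for a client whose close() may block. */
    public interface BlockingCloseable {
        void close() throws InterruptedException;
    }

    /**
     * Runs client.close() on a helper thread and waits at most timeoutMillis.
     * Returns true if close() finished in time, false if the helper thread
     * was interrupted because the timeout expired.
     */
    public static boolean closeWithTimeout(BlockingCloseable client, long timeoutMillis)
            throws InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "timed-close");
            t.setDaemon(true); // a stuck close must not keep the JVM alive
            return t;
        });
        Future<Void> result = executor.submit(() -> {
            client.close();
            return null;
        });
        try {
            result.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            result.cancel(true); // interrupts the helper thread running close()
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException("close() failed", e.getCause());
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Since ZooKeeper.close() is declared to throw InterruptedException, a call along the lines of TimedClose.closeWithTimeout(zk::close, 5000) should fit, with zk being the ZooKeeper handle and 5000 ms an arbitrary timeout picked for the example.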