Date: Mon, 12 May 2014 00:31:16 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: dev@hbase.apache.org
Reply-To: dev@hbase.apache.org
Subject: [jira] [Resolved] (HBASE-1316) ZooKeeper: use native threads to avoid GC stalls (JNI integration)

     [ https://issues.apache.org/jira/browse/HBASE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Purtell resolved HBASE-1316.
-----------------------------------
    Resolution: Later
      Assignee:     (was: Joey Echeverria)

Not going to happen

> ZooKeeper: use native threads to avoid GC stalls (JNI integration)
> ------------------------------------------------------------------
>
>                 Key: HBASE-1316
>                 URL: https://issues.apache.org/jira/browse/HBASE-1316
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Andrew Purtell
>         Attachments: HBASE-1316-1.patch, HBASE-1316-2.patch, zk_wrapper.tar.gz, zookeeper-native-Linux-amd64-64.tgz, zookeeper-native-headers.tgz
>
>
> From Joey Echeverria up on hbase-users@:
> We've used zookeeper in a write-heavy project we've been working on and experienced issues similar to what you described. After several days of debugging, we discovered that our issue was garbage collection. There was no way to guarantee we wouldn't have long pauses especially since our environment was the worst case for garbage collection, millions of tiny, short lived objects. I suspect HBase sees similar work loads frequently, if it's not constantly. With anything shorter than a 30 second session time out, we got session expiration events extremely frequently. We needed to use 60 seconds for any real confidence that an ephemeral node disappearing meant something was unavailable.
> We really wanted quick recovery so we ended up writing a light-weight wrapper around the C API and used swig to auto-generate a JNI interface. It's not perfect, but since we switched to this method we've never seen a session expiration event and ephemeral nodes only disappear when there are network issues or a machine/process goes down.
> I don't know if it's worth doing the same kind of thing for HBase as it adds some "unnecessary" native code, but it's a solution that I found works.
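
The session-timeout reasoning in the quoted message can be illustrated with a small Java sketch. It is not HBase or ZooKeeper code and calls no ZooKeeper API. The class name, method names, and stall durations are all hypothetical, and the model is deliberately simplified: it assumes a session expires whenever a single client stall (for example a GC pause) lasts at least the full session timeout, so that no heartbeat can reach the server in time.

```java
import java.util.List;

// Hypothetical illustration only; not part of HBase or ZooKeeper.
final class SessionTimeoutSketch {

    // Simplified assumption: a stall at least as long as the session timeout
    // prevents any heartbeat from arriving in time, so the session expires.
    static boolean pauseExpiresSession(long pauseMillis, long sessionTimeoutMillis) {
        return pauseMillis >= sessionTimeoutMillis;
    }

    // Counts how many of the observed stalls would expire the session
    // under the given timeout.
    static long countExpirations(List<Long> pausesMillis, long sessionTimeoutMillis) {
        return pausesMillis.stream()
                .filter(p -> pauseExpiresSession(p, sessionTimeoutMillis))
                .count();
    }

    public static void main(String[] args) {
        // Invented stall lengths in milliseconds, for illustration only.
        List<Long> pauses = List.of(2_000L, 35_000L, 12_000L, 45_000L);

        System.out.println("30 s timeout: "
                + countExpirations(pauses, 30_000L) + " expirations");
        System.out.println("60 s timeout: "
                + countExpirations(pauses, 60_000L) + " expirations");
    }
}
```

With these invented stalls, a 30 second timeout loses the session on the two longest pauses, while a 60 second timeout survives all of them. That mirrors the trade-off in the quoted message: a longer timeout tolerates GC pauses but delays detection of a genuinely failed process, which is why heartbeating from native threads outside the JVM heap was attractive.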