hbase-dev mailing list archives

From "Joey Echeverria (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1316) ZooKeeper: use native threads to avoid GC stalls (JNI integration)
Date Fri, 03 Jul 2009 08:47:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12726832#action_12726832 ]

Joey Echeverria commented on HBASE-1316:
----------------------------------------

We specifically avoided having any callbacks cross the C/Java boundary. This was simple in
our use case, where the only thing we needed to monitor after creating an ephemeral node was
whether our ZK session had expired. We also had a very simple recovery mechanism: we immediately
kill the process that got disconnected, and the shell script that launched it relaunches it.
This proved far easier than trying to re-establish a connection to ZK in the running process.
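
For illustration only, the Java side of such a setup might look roughly like the sketch below. This is not the code from our wrapper. The class name SessionWatchdog, the injected expiry check, and the poll interval are all assumptions made for the example. The idea is that Java polls a flag that the native layer exposes (for example through a SWIG-generated getter) instead of the native code calling back into Java, and the process is killed as soon as the session is reported expired.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical sketch: polls a "session expired" check and runs a kill
 * action once it reports true. No native callback ever enters Java.
 */
final class SessionWatchdog implements Runnable {
    // Assumed to wrap a cheap native call, e.g. a SWIG-generated getter.
    private final BooleanSupplier sessionExpired;
    // Action to take on expiry; in production this would end the process.
    private final Runnable onExpired;
    private final long pollMillis;

    SessionWatchdog(BooleanSupplier sessionExpired, Runnable onExpired, long pollMillis) {
        this.sessionExpired = sessionExpired;
        this.onExpired = onExpired;
        this.pollMillis = pollMillis;
    }

    @Override
    public void run() {
        try {
            // Java pulls the state; the native threads keep heartbeating on
            // their own and are not affected by GC pauses in this loop.
            while (!sessionExpired.getAsBoolean()) {
                Thread.sleep(pollMillis);
            }
        } catch (InterruptedException e) {
            // Treat interruption as a request to stop watching.
            Thread.currentThread().interrupt();
            return;
        }
        onExpired.run();
    }
}
```

A caller could start it with something like new Thread(new SessionWatchdog(check, () -> Runtime.getRuntime().halt(1), 1000)).start(), where check is whatever wraps the native session-state query, and leave the relaunch to the launching shell script. Using halt rather than a clean shutdown is one possible choice for "kill immediately"; it skips shutdown hooks.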

> ZooKeeper: use native threads to avoid GC stalls (JNI integration)
> ------------------------------------------------------------------
>
>                 Key: HBASE-1316
>                 URL: https://issues.apache.org/jira/browse/HBASE-1316
>             Project: Hadoop HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Andrew Purtell
>            Assignee: Nitay Joffe
>         Attachments: zk_wrapper.tar.gz
>
>
> From Joey Echeverria up on hbase-users@:
> We've used ZooKeeper in a write-heavy project we've been working on and experienced issues
> similar to what you described. After several days of debugging, we discovered that our issue
> was garbage collection. There was no way to guarantee we wouldn't have long pauses, especially
> since our environment was the worst case for garbage collection: millions of tiny, short-lived
> objects. I suspect HBase sees similar workloads frequently, if not constantly. With
> anything shorter than a 30 second session timeout, we got session expiration events extremely
> frequently. We needed to use 60 seconds for any real confidence that an ephemeral node disappearing
> meant something was unavailable.
> We really wanted quick recovery, so we ended up writing a lightweight wrapper around
> the C API and used SWIG to auto-generate a JNI interface. It's not perfect, but since we switched
> to this method we've never seen a session expiration event, and ephemeral nodes only disappear
> when there are network issues or a machine/process goes down.
> I don't know if it's worth doing the same kind of thing for HBase, as it adds some "unnecessary"
> native code, but it's a solution that I found works.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

