hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] Commented: (HBASE-1316) ZooKeeper: use native threads to avoid GC stalls (JNI integration)
Date Thu, 29 Jul 2010 17:20:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-1316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12893714#action_12893714 ]

Todd Lipcon commented on HBASE-1316:

phunt, jgray, and I talked about this on IRC this morning for a little while. We sketched
out a design that looks something like this:

1) We tune up the ZK session timeout for region servers to be higher than the longest expected
GC pause (e.g. 5 minutes)
2) We add a *second* ZK session on the same machine - either this is a second JVM running
next to the first, or it's a JNI thread. Either way, it's its own session with its own ephemeral
node - e.g. /rs-watchdogs/<regionserver name>. This second session has a tuned *down*
session timeout (e.g. 5 seconds)
3) In the HMaster, we watch /rs-watchdogs/*, and if we notice one of the ephemeral nodes disappears,
then we have to forcibly expire the matching regionserver ZK session. We will need some ZK
support here to add the ability to expire someone else's session in a reliable manner.
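
As a rough illustration of step 3, here is a minimal Python sketch of the master-side logic, assuming the master receives the list of children under /rs-watchdogs before and after a change. The function names and the expire_session callback are hypothetical; as noted above, ZK does not yet offer a way to expire someone else's session, so the callback only stands in for that missing support.

```python
from typing import Callable, Iterable, Set

# Path taken from the design above; each child is named after a regionserver.
WATCHDOG_ROOT = "/rs-watchdogs"


def vanished_watchdogs(before: Iterable[str], after: Iterable[str]) -> Set[str]:
    """Return regionserver names whose watchdog znode existed before but is gone now."""
    return set(before) - set(after)


def on_watchdog_children_changed(
    before: Iterable[str],
    after: Iterable[str],
    expire_session: Callable[[str], None],  # hypothetical hook, not a real ZK call
) -> Set[str]:
    """Forcibly expire the "long" session of every regionserver whose watchdog vanished."""
    gone = vanished_watchdogs(before, after)
    for rs_name in sorted(gone):
        expire_session(rs_name)
    return gone


if __name__ == "__main__":
    expired = []
    on_watchdog_children_changed(["rs1", "rs2", "rs3"], ["rs1", "rs3"], expired.append)
    print(expired)
```

Newly appearing watchdog nodes are ignored here on purpose: only a disappearance should trigger the forced expiration.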

This has the following effects:
A) If there's a long garbage collection pause in the JVM, the "fast" ZK session stays up,
and so long as the GC pause is under the "long" timeout, nothing will expire. This is good.
B) If there's a network or machine outage, the "fast" ZK session goes down, in which case
we detect the outage quickly. This is also good.
C) By adding the forcible expiration of the RS ZK session when the "fast" session expires,
we keep the same fencing guarantees as we've got now.
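
Effects A and B can be checked with a small timing model. This is only a sketch under simplifying assumptions: it uses the example timeouts from above (5 minutes and 5 seconds), it ignores ZK tick granularity and heartbeat phase, and it assumes the forced expiration happens immediately once the "fast" session lapses. The function name and the event labels are made up for illustration.

```python
from typing import Optional


def seconds_until_declared_dead(
    kind: str,
    duration_s: float,
    long_timeout_s: float = 300.0,  # "long" RS session timeout (example value: 5 minutes)
    fast_timeout_s: float = 5.0,    # "fast" watchdog session timeout (example value: 5 seconds)
) -> Optional[float]:
    """Return how long after the event starts the RS is declared dead, or None if never.

    kind == "gc_pause": only the RS JVM stalls; the watchdog keeps heartbeating.
    kind == "outage":   network or machine failure; both sessions stop heartbeating.
    """
    if kind == "gc_pause":
        # Only the "long" session is silent, so only its timeout matters.
        return long_timeout_s if duration_s >= long_timeout_s else None
    if kind == "outage":
        # The "fast" session lapses first and the master then expires the RS session.
        first = min(fast_timeout_s, long_timeout_s)
        return first if duration_s >= first else None
    raise ValueError(f"unknown event kind: {kind}")


if __name__ == "__main__":
    print(seconds_until_declared_dead("gc_pause", 60))  # effect A: pause under the long timeout
    print(seconds_until_declared_dead("outage", 60))    # effect B: outage caught by the fast session
```

A GC pause longer than the "long" timeout still expires the RS in this model, which matches the "so long as the GC pause is under the long timeout" condition in A.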

The other nice thing about this design is that it doesn't change the current RS or master
at all - the master still watches the normal RS znodes, it's just that we have a second system
that's doing a fast-path expiration on them when a machine goes down. We could also choose
to implement this second system based on other kinds of machine health checks, etc.

> ZooKeeper: use native threads to avoid GC stalls (JNI integration)
> ------------------------------------------------------------------
>                 Key: HBASE-1316
>                 URL: https://issues.apache.org/jira/browse/HBASE-1316
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.20.0
>            Reporter: Andrew Purtell
>            Assignee: Berk D. Demir
>         Attachments: zk_wrapper.tar.gz
> From Joey Echeverria up on hbase-users@:
> We've used zookeeper in a write-heavy project we've been working on and experienced issues
> similar to what you described. After several days of debugging, we discovered that our issue
> was garbage collection. There was no way to guarantee we wouldn't have long pauses especially
> since our environment was the worst case for garbage collection, millions of tiny, short lived
> objects. I suspect HBase sees similar work loads frequently, if it's not constantly. With
> anything shorter than a 30 second session time out, we got session expiration events extremely
> frequently. We needed to use 60 seconds for any real confidence that an ephemeral node disappearing
> meant something was unavailable.
> We really wanted quick recovery so we ended up writing a light-weight wrapper around
> the C API and used swig to auto-generate a JNI interface. It's not perfect, but since we switched
> to this method we've never seen a session expiration event and ephemeral nodes only disappear
> when there are network issues or a machine/process goes down.
> I don't know if it's worth doing the same kind of thing for HBase as it adds some "unnecessary"
> native code, but it's a solution that I found works.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
