hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3759) Eliminate use of ThreadLocals for CoprocessorEnvironment bypass() and complete()
Date Wed, 13 Apr 2011 04:00:12 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3759?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019185#comment-13019185

jiraposter@reviews.apache.org commented on HBASE-3759:

bq.  On 2011-04-13 03:23:28, Michael Stack wrote:
bq.  > +1 Looks good Gary.  Agree do it now before 0.92.  By chance did you see if it made
a difference profiling?

Yeah if you grab https://issues.apache.org/jira/secure/attachment/12475852/cp_bypass.tar.gz
you can see the call trees I grabbed from profiling with the context object (Call_Tree_context_xxx)
vs. ThreadLocals (Call_Tree_tl_xxx).  If you look at Call_Tree_tl_run.html vs. Call_Tree_context_run.html,
you'll see ~20% of the runnable thread time spent in ThreadLocal.get() (under shouldComplete()
and shouldBypass()).  This is completely eliminated in the context version, though with small
overhead for the object instatiation -- 0.4% in CallContext.createAndPrepare().  (This was
before a rename of CallContext -> ObserverContext).

Granted this is only runnable thread time, so it's skewed in terms of overall impact.  But
at the macro level, the MR put-based import that generated this ran in ~2h15m with the ThreadLocal
version, but only ~1h40m with the context version.  So seems a pretty substantial improvement.

- Gary

This is an automatically generated e-mail. To reply, visit:

On 2011-04-13 01:08:50, Gary Helmling wrote:
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/588/
bq.  -----------------------------------------------------------
bq.  (Updated 2011-04-13 01:08:50)
bq.  Review request for hbase.
bq.  Summary
bq.  -------
bq.  Profiling the HRegionServer process with a RegionObserver coprocessor loaded shows a
fair amount of runnable thread CPU time spent getting the bypass and complete flag ThreadLocal
values by RegionCoprocessorHost.  See the HBASE-3759 JIRA for some attached graphs.
bq.  With the caveat that this is runnable CPU time and not threads in all states, this still
seems like a significant processing bottleneck on a hot call path.  The workload profiled
was a put-based bulk load, so for each multi-put request, RegionCoprocessorHost.prePut() could
be called many times.
bq.  Instead of using ThreadLocal variable for bypass/complete, which will incur contention
on the underlying map of values, I think we can eliminate the bottleneck by using locally
scoped variables for each preXXX/putXXX method called in the RegionCoprocessorHost, MasterCoprocessorHost
and WALCoprocessorHost classes.
bq.  The attached patch refactors the current RegionObserver, MasterObserver and WALObserver
APIs to provide a locally scoped ObserverContext object for storing and checking the bypass
and complete values.
bq.  Summary of changes:
bq.  * adds a new ObserverContext<T extends CoprocessorEnvironment> class, containing
references for bypass, complete and the environment instance
bq.  * in each pre/post method in RegionObserver, the RegionCoprocessorEnvironment parameter
is replaced by ObserverContext<RegionCoprocessorEnvironment>
bq.  * in each pre/post method in MasterObserver, the MasterCoprocessorEnvironment parameter
is replaced by ObserverContext<MasterCoprocessorEnvironment>
bq.  * in each pre/post method in WALObserver, the WALCoprocessorEnvironment parameter is
replace by ObserverContext<WALCoprocesorEnvironment>
bq.  This is obviously a large bulk change to the existing API.  I could avoid the API change
with hacky modification underneath the *CoprocessorEnvironment interfaces.  But since we do
not yet have a public release with coprocessors, I would prefer to take the time to make the
initial API the best we can before we push it out.
bq.  Please let me know your thoughts on this approach.
bq.  This addresses bug HBASE-3759.
bq.      https://issues.apache.org/jira/browse/HBASE-3759
bq.  Diffs
bq.  -----
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java 9576c48 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/BaseRegionObserverCoprocessor.java
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java d45b950

bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java a82f62b 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java db0870b 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/ObserverContext.java PRE-CREATION

bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/RegionObserver.java 3501958 
bq.    src/main/java/org/apache/hadoop/hbase/coprocessor/WALObserver.java 7a34d18 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java 019bbde 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 60efa12

bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALCoprocessorHost.java a3f3b31

bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/SampleRegionWALObserver.java 834283f

bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/SimpleRegionObserver.java 0ce2147

bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 0db5001

bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java a15d53a 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverStacking.java 2c1e4a0

bq.  Diff: https://reviews.apache.org/r/588/diff
bq.  Testing
bq.  -------
bq.  Thanks,
bq.  Gary

> Eliminate use of ThreadLocals for CoprocessorEnvironment bypass() and complete()
> --------------------------------------------------------------------------------
>                 Key: HBASE-3759
>                 URL: https://issues.apache.org/jira/browse/HBASE-3759
>             Project: HBase
>          Issue Type: Improvement
>          Components: coprocessors
>            Reporter: Gary Helmling
>            Assignee: Gary Helmling
>         Attachments: cp_bypass.tar.gz
> In the current coprocessor framework, ThreadLocal objects are used for the bypass and
complete booleans in CoprocessorEnvironment.  This allows the *CoprocessorHost implementations
to identify when to short-circuit processing the the preXXX and postXXX hook methods.
> Profiling the region server, however, shows that these ThreadLocals can become a contention
point when on a hot code path (such as prePut()).  We should refactor the CoprocessorHost
pre/post implementations to remove usage of the ThreadLocal variables and replace them with
locally scoped variables to eliminate contention between handler threads.

This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

View raw message