hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "jiraposter@reviews.apache.org (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4192) Optimize HLog for Throughput Using Delayed RPCs
Date Thu, 11 Aug 2011 20:12:29 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13083461#comment-13083461
] 

jiraposter@reviews.apache.org commented on HBASE-4192:
------------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1463/#review1410
-----------------------------------------------------------


Great stuff!  Mostly documentation stuff at this point but I think HLog will be much clearer
by the end of this.


src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
<https://reviews.apache.org/r/1463/#comment3221>

    isn't there an HRegion.createHRegion that doesn't take the rpc server and handles passing
the null?



src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
<https://reviews.apache.org/r/1463/#comment3220>

    make this proper javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3222>

    i think now might be a good time to fill in javadoc for every single variable in this
class (let's use this as an opportunity to clean up HLog a bit)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3223>

    We shouldn't hard-code anything like this.  It should probably be marked final and set
in the constructor.  You can always have a DEFAULT_DELAYED_RESPONSE_MAX_SIZE or something.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3224>

    final?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3225>

    is this the best description for what this does?  my understanding is that there is actually
always going to be batching, and it's not that this is really about making them any more aggressive
(in all cases it loops around and immediately syncs anything pending).  this is really about...
a non-blocking append, freeing of the handler thread, and the sync being called and response
returned from a separate thread.  (a mouthful but there must be some concise way to describe
this)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3226>

    This is all static so there might be issues running multiple HRegionServers in a single
JVM during unit tests?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3227>

    since you've kept the old constructors, why did we need to update anything elsewhere passing
in null?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3228>

    what does this mean?  we should also add some documentation to the default xml and/or
book about the new config params and this feature



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3229>

    yeah this is fine, so just don't set it when you declare it and make it final



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3230>

    should this throw an exception?  this seems like a bad configuration (user error) and
we should halt instead of continue.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3232>

    this this while() { yield() } the best pattern to use here?  Any reason not to use the
typical wait/notify pattern?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3231>

    looks like there are double spaces on each line wrap



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3234>

    What's the contract of this method?  When should this be called?  Does it only append
or does it also sync?  Is it blocking or non-blocking and what does it do with the RWCC if
i pass it in?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3233>

    javadoc still needs to be updated



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3235>

    there's a bunch of javadoc on this method.  it's unclear when to use this method vs. the
other above it.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3236>

    should be info?  and need to add rwcc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3237>

    You can get fancy with javadoc notation by referencing the classes/interfaces:  {@link
Delayable} for example.



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3238>

    final



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3239>

    Any updates to the description of this thread?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3240>

    why no longer final?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3241>

    comment



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3244>

    these comments can all be changed to be the form:
    
    /** Comment about what this variable is */
    
    (then it will be loaded up as javadoc in eclipse and such)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3242>

    and is used when?  when the server is shutting down?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3243>

    what happens if you overrun max?  (seems like this feature could use a doc describing
what it does, the trade-offs, and configuration parameters)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3245>

    I don't think we want to just do the printStackTrace() here which will go to stdout...
You'll want to LOG.error("Something", e)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3246>

    it doesn't always wait.  this method deserves a full javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3247>

    maybe maxDelayedResponses would be a better variable name



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3248>

    double space



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3249>

    convert this to javadoc and maybe add more description... it's not really the case that
"All RPC threads that had delayed Responses are now allowed"... it's actually the case that
it will send the responses for anything that was delayed and is now synced (some stuff could
be delayed but not yet synced, thus not responded to)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3250>

    blocking?  non-blocking?  this just queues and doesn't actually write to HLog?



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3251>

    When is this called



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3252>

    same here.  can you add some comments about when this is used



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3253>

    javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3254>

    need to update javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3255>

    javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3257>

    don't you want to request a log roll if logRollRequested == true || sizeExceeded ?  (why
do we need sizeExceeded at all?  couldn't we set logRollRequested=true?)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3258>

    javadoc



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3259>

    javadoc... let's be sure to clearly state the contract of these public methods



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3261>

    this method could also use some doc (i know you didn't add this but would be very helpful)



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3260>

    this looks like a change in behavior?  previously the postWALWrite would happen even if
this method returned true... now if it returns true we leave immediately



src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
<https://reviews.apache.org/r/1463/#comment3262>

    so looks like this option is not compatible with coprocessors?  or does this postWALWrite
move elsewhere?
    
    this looks like the preWALWrite will get called if batchEntries is true but the postWALWrite
never will



src/main/java/org/apache/hadoop/hbase/util/HMerge.java
<https://reviews.apache.org/r/1463/#comment3263>

    necessary change?  (here and remaining changes)



src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
<https://reviews.apache.org/r/1463/#comment3264>

    2011



src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
<https://reviews.apache.org/r/1463/#comment3265>

    there are some ways to run all the unit tests with a varying set of initial parameters...
look at TestCacheOnWrite on the new HFile v2 patch, it uses @RunWith(Parameterized.class)
    


- Jonathan


On 2011-08-11 06:59:28, Vlad Dogaru wrote:
bq.  
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1463/
bq.  -----------------------------------------------------------
bq.  
bq.  (Updated 2011-08-11 06:59:28)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  -------
bq.  
bq.  Changes:
bq.  
bq.  1.  Add hbase.region.wal.batchentries configuration parameter.  If this is
bq.  enabled, batch entries to the HLog in a queue.
bq.  2.  Use delayed RPCs for sync requests when aggresive batching is enabled.
bq.  This frees up RPC handler threads for the duration of the sync.
bq.  3.  Pass the RPC server instance all the way to down to HLog.  This is needed
bq.  to find out the current remote call, mark it as delayed, and finally complete
bq.  it when the sync is done.
bq.  4.  Use the region read-write consistency control to avoid exposing to
bq.  RegionScanners edits which have not yet been synced.
bq.  5.  Change a few tests which directly create HRegions or HLogs.  The
bq.  rpcServers passed in are null, HLog falls back to classic RPCs when it has no
bq.  knowledge of the RPC server.
bq.  6.  Add TestBatchEntriesLogRolling, which is identical to TestLogRolling,
bq.  except that it uses aggressive batching.  I'm not sure how to add tests that
bq.  verify the same functionality but don't duplicate code, suggestion are
bq.  welcome.
bq.  
bq.  The new parameter is disabled by default.
bq.  
bq.  
bq.  This addresses bug HBASE-4192.
bq.      https://issues.apache.org/jira/browse/HBASE-4192
bq.  
bq.  
bq.  Diffs
bq.  -----
bq.  
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 6fb1da7 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java a00b93d 
bq.    src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 2f86f04 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java 83ff7b2 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java f22fb6e 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/ReadWriteConsistencyControl.java
8ec53d3 
bq.    src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java 887f736 
bq.    src/main/java/org/apache/hadoop/hbase/util/HMerge.java 9f0499e 
bq.    src/main/java/org/apache/hadoop/hbase/util/MetaUtils.java af8d734 
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestCase.java bf8004b 
bq.    src/test/java/org/apache/hadoop/hbase/TestScanMultipleVersions.java d7baba4 
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestCoprocessorInterface.java 36816e8

bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestBatchEntriesLogRolling.java
PRE-CREATION 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLogSplit.java de28418 
bq.    src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALActionsListener.java
dc43eb2 
bq.    src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java
381ac90 
bq.  
bq.  Diff: https://reviews.apache.org/r/1463/diff
bq.  
bq.  
bq.  Testing
bq.  -------
bq.  
bq.  All unit tests run with aggressive batching turned on and off.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



> Optimize HLog for Throughput Using Delayed RPCs
> -----------------------------------------------
>
>                 Key: HBASE-4192
>                 URL: https://issues.apache.org/jira/browse/HBASE-4192
>             Project: HBase
>          Issue Type: New Feature
>          Components: wal
>    Affects Versions: 0.92.0
>            Reporter: Vlad Dogaru
>            Priority: Minor
>
> Introduce a new HLog configuration parameter (batchEntries) for more aggressive batching
of appends.  If this is enabled, HLog appends are not written to the HLog writer immediately,
but batched and written either periodically or when a sync is requested.  Because sync times
become larger, they use delayed RPCs to free up RPC handler threads.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message