hbase-issues mailing list archives

From "Nicolas Spiegelberg (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3877) Determine Proper Defaults for Compaction ThreadPools
Date Wed, 11 May 2011 21:27:47 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3877?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13032086#comment-13032086 ]

Nicolas Spiegelberg commented on HBASE-3877:
--------------------------------------------

As a data point: in our cluster, we do not automatically split regions, and we keep our region
count low.  As a result, we have StoreFiles that reach the 10GB range.  Obviously, if all
the compaction threads were processing 10GB compactions, the queue would get backed up.
We put the throttle point at 500MB.  Compactions are network-bound: we have 1Gbps network
links and are seeing roughly 40MBps compaction speed (3x that is about 1Gbps), so that is
about 12 sec per compaction max on the small threadpool.  Our use case therefore doesn't
directly correspond to the common auto-split use case.
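
The arithmetic behind those figures can be checked with a short sketch.  The class and method
names below are illustrative only, not HBase APIs, and the sketch assumes that "3x" refers to
each written byte crossing the network three times (for example, 3-way replication), which is
one reading of the comment rather than something it states outright.

```java
// Back-of-the-envelope check of the figures quoted above.
// Names are hypothetical; nothing here is an HBase API.
final class CompactionTimeEstimate {

    /** Seconds needed to compact compactionMB at a sustained rate of throughputMBps. */
    static double secondsPerCompaction(double compactionMB, double throughputMBps) {
        return compactionMB / throughputMBps;
    }

    /**
     * Network load in Mbit/s, assuming every written byte is sent `copies` times.
     * The factor of 8 converts megabytes to megabits.
     */
    static double linkLoadMbps(double throughputMBps, int copies) {
        return throughputMBps * copies * 8;
    }

    public static void main(String[] args) {
        // Largest compaction the small pool accepts (500MB throttle) at 40MBps.
        System.out.println(secondsPerCompaction(500, 40)); // prints 12.5
        // 40MBps sent three times, expressed in Mbit/s, against a 1000 Mbit/s link.
        System.out.println(linkLoadMbps(40, 3));           // prints 960.0
    }
}
```

Under that assumption, 40MBps roughly saturates a 1Gbps link, and a compaction at the 500MB
throttle point takes about 12.5 sec.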

My original thought is to default the throttle to:
{code}
min("hbase.hregion.memstore.flush.size" * 2, "hbase.hregion.max.filesize" / 2)
{code}
Note that the default split/flush ratio is 4, so this number should be in the middle.  Since
most users do compression, the actual flushed file size should be ~20% of the MemStore size
(so flushSize*2 is really more like 10x the size of a flushed file).  I will submit a patch
with this default.  Please feel free to chime in with your experience using it and we'll see
if we can improve this default.
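
As a rough illustration of the proposed default, the sketch below evaluates the formula for
one pair of settings.  The 64MB flush size and 256MB max file size are assumed values chosen
to match the split/flush ratio of 4 mentioned above; the class and method names are
hypothetical and are not taken from the patch.

```java
// Sketch of the proposed default throttle:
//   min(flush.size * 2, max.filesize / 2)
// Names and sample values are assumptions for illustration, not the actual patch.
final class DefaultThrottleSketch {

    /** Proposed default throttle point, in bytes. */
    static long defaultThrottle(long flushSizeBytes, long maxFileSizeBytes) {
        return Math.min(flushSizeBytes * 2, maxFileSizeBytes / 2);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long flushSize = 64 * mb;    // assumed hbase.hregion.memstore.flush.size
        long maxFileSize = 256 * mb; // assumed hbase.hregion.max.filesize (4x the flush size)

        long throttle = defaultThrottle(flushSize, maxFileSize);
        System.out.println(throttle / mb + " MB"); // prints 128 MB

        // With ~20% compression, a flushed file is about one fifth of the MemStore size,
        // so the throttle point corresponds to roughly ten flushed files.
        long flushedFile = flushSize / 5;
        System.out.println(throttle / flushedFile); // prints 10
    }
}
```

With a ratio of exactly 4, both arguments to min() come out equal, which is why the result
lands midway between the flush size and the max file size.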

> Determine Proper Defaults for Compaction ThreadPools
> ----------------------------------------------------
>
>                 Key: HBASE-3877
>                 URL: https://issues.apache.org/jira/browse/HBASE-3877
>             Project: HBase
>          Issue Type: Improvement
>          Components: regionserver
>    Affects Versions: 0.92.0
>            Reporter: Nicolas Spiegelberg
>            Assignee: Nicolas Spiegelberg
>            Priority: Trivial
>              Labels: compaction
>
> With the introduction of HBASE-1476, we now have multithreaded compactions + 2 different
> ThreadPools for large and small compactions.  However, this is disabled by default until we
> can determine a proper default throttle point.  Opening this JIRA to log all discussion on
> how to select a good default for this case.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
