hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-10501) Make IncreasingToUpperBoundRegionSplitPolicy configurable
Date Tue, 11 Feb 2014 18:26:19 GMT
Lars Hofhansl created HBASE-10501:

             Summary: Make IncreasingToUpperBoundRegionSplitPolicy configurable
                 Key: HBASE-10501
                 URL: https://issues.apache.org/jira/browse/HBASE-10501
             Project: HBase
          Issue Type: Bug
            Reporter: Lars Hofhansl

During some (admittedly) artificial load testing we found a large amount of split activity, which
we tracked down to the IncreasingToUpperBoundRegionSplitPolicy.

The current logic is this (from the code comment):
"regions that are on this server that all are of the same table, squared, times the region
flush size OR the maximum region split size, whichever is smaller"

So with a flush size of 128mb and a max file size of 20gb, we'd need 13 regions of the same table
on an RS to reach the max size.
With a 10gb max file size it is still 9 regions of the same table.
Considering that the number of regions an RS can carry is limited, and that those regions might
belong to multiple tables, this should be more configurable.
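
The region counts above can be checked with a short sketch. It assumes the policy is exactly the rule quoted from the comment (region count squared, times the flush size, capped at the max file size). The function names are hypothetical and are not HBase API.

```python
MB = 1024 * 1024
GB = 1024 * MB


def split_size(region_count, flush_size, max_file_size):
    """Split threshold per the quoted rule: count^2 * flush size, capped at max size."""
    return min(max_file_size, region_count ** 2 * flush_size)


def regions_to_reach_max(flush_size, max_file_size):
    """Smallest number of same-table regions at which the threshold hits the cap."""
    n = 1
    while split_size(n, flush_size, max_file_size) < max_file_size:
        n += 1
    return n


print(regions_to_reach_max(128 * MB, 20 * GB))  # 13
print(regions_to_reach_max(128 * MB, 10 * GB))  # 9
```

With 12 regions the threshold is 144 * 128mb = 18gb, still under the 20gb cap, which is why the 13th region is needed.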

I think the squaring is smart and we do not need to change it.

We could:
* Make the start size configurable and default it to the flush size
* Add a multiplier for the initial size, i.e. start with n * flushSize

Of course one can override the default split policy, but these seem like simple tweaks.
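
As a rough illustration of the two tweaks, the sketch below adds a configurable start size and a multiplier to the quoted rule. The parameter names `initial_size` and `multiplier` are made up for this example and do not correspond to any existing HBase setting; the two options are shown in one function only for brevity.

```python
MB = 1024 * 1024
GB = 1024 * MB


def split_size(region_count, flush_size, max_file_size,
               initial_size=None, multiplier=1):
    """Quoted rule with a tunable start size (hypothetical parameters).

    initial_size defaults to the flush size; multiplier scales the start size.
    """
    start = flush_size if initial_size is None else initial_size
    return min(max_file_size, region_count ** 2 * start * multiplier)


def regions_to_reach_max(flush_size, max_file_size, **tuning):
    """Smallest number of same-table regions at which the threshold hits the cap."""
    n = 1
    while split_size(n, flush_size, max_file_size, **tuning) < max_file_size:
        n += 1
    return n


# Defaults reproduce the current behaviour.
print(regions_to_reach_max(128 * MB, 20 * GB))                       # 13
# Starting at 2 * flushSize.
print(regions_to_reach_max(128 * MB, 20 * GB, multiplier=2))         # 9
# Starting at an explicit 1gb.
print(regions_to_reach_max(128 * MB, 20 * GB, initial_size=1 * GB))  # 5
```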

Or we could instead set a goal for how many regions of the same table would need to be present
in order to reach the max size. In that case we'd start with maxSize/goal^2. So if the max size
is 20gb and the goal is three, we'd start with 20gb/9 = ~2.2gb for the initial region size.
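
The goal-based variant can be sketched the same way. This is only an illustration of maxSize/goal^2 with hypothetical function names. It rounds the initial size up, which is a choice made here so that the cap is reached at exactly `goal` regions rather than falling a few bytes short.

```python
MB = 1024 * 1024
GB = 1024 * MB


def initial_size_for_goal(max_file_size, goal):
    """Initial size = maxSize / goal^2, rounded up to a whole number of bytes."""
    return -(-max_file_size // goal ** 2)  # ceiling division


def split_size(region_count, initial_size, max_file_size):
    """Quoted rule with the flush size replaced by the derived initial size."""
    return min(max_file_size, region_count ** 2 * initial_size)


max_size = 20 * GB
initial = initial_size_for_goal(max_size, 3)
print(round(initial / GB, 1))                    # 2.2
print(split_size(2, initial, max_size) < max_size)   # True
print(split_size(3, initial, max_size) == max_size)  # True
```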

[~stack], I'm interested in your opinion.

