hbase-issues mailing list archives

From "Andrew Purtell (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13052) Explain each region split policy
Date Tue, 17 Feb 2015 21:36:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14324933#comment-14324933 ]

Andrew Purtell commented on HBASE-13052:

+.Choosing a Split Policy
+To choose a split policy globally or for a given table, it is important to consider the characteristics of your data, the pattern of the row keys, and the patterns you use to access the data. The following questions may be helpful when deciding on a region split policy.
+* Are your row keys "chunked" by common prefixes that are useful when scanning?
+* Are your row keys delimited by specific patterns that are useful when scanning?
+* Is it more important to control the size of your regions, the number of rows in a region, or the overall size of your store files?
+* Is it important to try to keep multiple regions for the same table on the same RegionServer?
+* For a given table, do different columns hold cells of radically different sizes?
+* Do your needs fall outside the scope of any of the existing region split policies? In this case, consider implementing a <<region.split.policies.custom,custom split policy>>.

This is an OK start. It would be better if each rhetorical question were answered with a pointer
to one of the policies. Even pointers for only a couple of the questions would help.
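
To illustrate what "a pointer to one of the policies" could look like, here is a rough sketch that pairs each kind of question with one of the five policy classes named in the issue description. The question keys, the `suggest_policies` helper, and the pairings themselves are assumptions made for illustration only; they are not taken from the patch and have not been checked against the reference guide.

```python
# Hypothetical pairing of checklist questions with split policy classes.
# The class names come from the issue description; which question points
# to which policy is an assumption, not something stated in the patch.
PACKAGE = "org.apache.hadoop.hbase.regionserver"

SUGGESTED_POLICY = {
    "keys_share_scan_prefix": "KeyPrefixRegionSplitPolicy",
    "keys_use_delimiter": "DelimitedKeyPrefixRegionSplitPolicy",
    "fixed_region_size": "ConstantSizeRegionSplitPolicy",
    "regions_per_server_matter": "IncreasingToUpperBoundRegionSplitPolicy",
    # Not one of the checklist questions; included so all five policies appear.
    "splits_managed_manually": "DisabledRegionSplitPolicy",
}


def suggest_policies(answers):
    """Return fully qualified class names for every question answered yes."""
    return [
        f"{PACKAGE}.{policy}"
        for question, policy in SUGGESTED_POLICY.items()
        if answers.get(question)
    ]


if __name__ == "__main__":
    for name in suggest_policies({"keys_use_delimiter": True}):
        print(name)
```

A table in the documentation would serve the same purpose; the point is only that each question should lead the reader to a concrete class name.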

> Explain each region split policy
> --------------------------------
>                 Key: HBASE-13052
>                 URL: https://issues.apache.org/jira/browse/HBASE-13052
>             Project: HBase
>          Issue Type: Bug
>          Components: documentation
>            Reporter: Misty Stanley-Jones
>            Assignee: Misty Stanley-Jones
>         Attachments: HBASE-13052.patch
> {quote}
> There are five region split policies today, so let's add a section that explains:
> 1. How each policy works. We can start from the current Javadoc:
> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/KeyPrefixRegionSplitPolicy.html
> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/DelimitedKeyPrefixRegionSplitPolicy.html
> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/DisabledRegionSplitPolicy.html
> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/ConstantSizeRegionSplitPolicy.html
> http://hbase.apache.org/devapidocs/org/apache/hadoop/hbase/regionserver/IncreasingToUpperBoundRegionSplitPolicy.html
> 2. How users can choose a good policy for their scenario
> 3. Pros and cons of each policy
> {quote}
> from [~daisuke.kobayashi]

This message was sent by Atlassian JIRA
