hbase-issues mailing list archives

From "Dave Revell (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4489) Better key splitting in RegionSplitter
Date Tue, 11 Oct 2011 17:07:11 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13125196#comment-13125196 ]

Dave Revell commented on HBASE-4489:
------------------------------------

@Ted: will add license in patch -v3.

@Jonathan: 
 * We did agree to leave MD5StringSplit as the default in 0.90; I'll fix that.
 * I have no objection to turning rollingsplit into a different utility. That seems out of
   scope here though; my intent in this ticket was to make RegionSplitter do something sane.
 * Unit tests are a good idea; I'll add some.
 * Using positive hex with (byte) casting is a good idea; I'll change that.
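
For context on the last point: Java's byte type is signed, so a value of 0x80 or above cannot appear as a bare literal in a byte array. The sketch below contrasts the two spellings. It is only an illustration of the idea as I read it; the class and method names are hypothetical and nothing here is taken from the patch.

```java
import java.util.Arrays;

// Hypothetical illustration; not code from the HBASE-4489 patch.
final class ByteLiteralSketch {
    // Signed decimal form: valid, but the bit pattern is hard to read.
    static final byte[] SIGNED_DECIMAL = { -52, -52 };

    // Positive hex with an explicit narrowing cast: the same two bytes,
    // written the way they would be printed (\xCC\xCC).
    static final byte[] POSITIVE_HEX = { (byte) 0xCC, (byte) 0xCC };

    // Recover the unsigned value (0..255) of a byte for printing or comparison.
    static int unsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.equals(SIGNED_DECIMAL, POSITIVE_HEX));
        System.out.println(unsigned(POSITIVE_HEX[0]));
    }
}
```

Both arrays hold identical bytes; the hex form simply keeps the source readable next to keys such as \xCC\xCC.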



                
> Better key splitting in RegionSplitter
> --------------------------------------
>
>                 Key: HBASE-4489
>                 URL: https://issues.apache.org/jira/browse/HBASE-4489
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Dave Revell
>            Assignee: Dave Revell
>         Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch
>
>
> The RegionSplitter utility allows users to create a pre-split table from the command
> line or do a rolling split on an existing table. It supports pluggable split algorithms that
> implement the SplitAlgorithm interface. The only SplitAlgorithm, which is also the default,
> assumes keys fall in the range from ASCII string "00000000" to ASCII string "7FFFFFFF". This
> is not a sane default, and seems useless to most users. Users are likely to be surprised by
> the fact that all the region splits occur in the byte range of ASCII characters.
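
To make that complaint concrete, the sketch below divides the range "00000000" to "7FFFFFFF" evenly and emits each boundary as an eight-character hex string. It is an assumption about how such an algorithm behaves, not the actual MD5StringSplit code, and the class and method names are hypothetical. Every byte of every boundary is an ASCII digit or letter (0x30-0x39 or 0x41-0x46), so no split can land anywhere else in the byte space.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not the real MD5StringSplit implementation.
final class HexStringSplitSketch {
    // Upper end of the assumed key range, "7FFFFFFF" read as a number.
    static final long MAX = 0x7FFFFFFFL;

    // Returns numRegions - 1 boundaries, each an 8-character ASCII hex string.
    static List<byte[]> splitKeys(int numRegions) {
        if (numRegions < 1) {
            throw new IllegalArgumentException("numRegions must be >= 1");
        }
        List<byte[]> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            long point = MAX * i / numRegions;        // even division of the numeric range
            String hex = String.format("%08X", point); // zero-padded, upper-case hex
            keys.add(hex.getBytes(StandardCharsets.US_ASCII));
        }
        return keys;
    }

    // True if every byte is one of '0'..'9' or 'A'..'F'.
    static boolean isAsciiHex(byte[] key) {
        for (byte b : key) {
            boolean digit = b >= '0' && b <= '9';
            boolean letter = b >= 'A' && b <= 'F';
            if (!digit && !letter) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        for (byte[] key : splitKeys(5)) {
            System.out.println(new String(key, StandardCharsets.US_ASCII)
                    + " asciiHexOnly=" + isAsciiHex(key));
        }
    }
}
```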
> A better default split algorithm would be one that evenly divides the space of all bytes,
> which is what this patch does. Making a table with five regions would split at \x33\x33...,
> \x66\x66..., \x99\x99..., \xCC\xCC..., and \xFF\xFF...
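
The arithmetic behind those boundaries can be sketched as follows. This is a minimal illustration under stated assumptions, not the patch itself: keys are treated as fixed-width unsigned integers, boundary i is computed as i * max / n where max is the all-0xFF key, and the names UniformSplitSketch, splitKeys, toFixedWidth and escape are hypothetical. With five regions and two-byte keys it yields \x33\x33, \x66\x66, \x99\x99 and \xCC\xCC; the same formula at i = n gives \xFF\xFF, the top of the key space.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration; not code from the HBASE-4489 patch.
final class UniformSplitSketch {
    // Returns numRegions - 1 boundaries of keyWidth bytes each, spaced evenly
    // over the unsigned range 0 .. (all bytes 0xFF).
    static List<byte[]> splitKeys(int numRegions, int keyWidth) {
        if (numRegions < 1 || keyWidth < 1) {
            throw new IllegalArgumentException("numRegions and keyWidth must be >= 1");
        }
        byte[] maxBytes = new byte[keyWidth];
        Arrays.fill(maxBytes, (byte) 0xFF);
        BigInteger max = new BigInteger(1, maxBytes); // signum 1: read as unsigned
        BigInteger n = BigInteger.valueOf(numRegions);

        List<byte[]> keys = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            BigInteger point = max.multiply(BigInteger.valueOf(i)).divide(n);
            keys.add(toFixedWidth(point, keyWidth));
        }
        return keys;
    }

    // Left-pads with zero bytes, or drops BigInteger's leading sign byte,
    // so the result is exactly width bytes long.
    static byte[] toFixedWidth(BigInteger value, int width) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[width];
        int copy = Math.min(raw.length, width);
        System.arraycopy(raw, raw.length - copy, out, width - copy, copy);
        return out;
    }

    // Formats a key in the \xNN notation used in the issue description.
    static String escape(byte[] key) {
        StringBuilder sb = new StringBuilder();
        for (byte b : key) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (byte[] key : splitKeys(5, 2)) {
            System.out.println(escape(key));
        }
    }
}
```

BigInteger is used here so the same arithmetic works for any key width; a real implementation may well choose a different representation.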

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
