Date: Thu, 13 Oct 2011 01:13:11 +0000 (UTC)
From: "Jonathan Gray (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <1804889637.7560.1318468391795.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1774716834.337.1317074472798.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4489) Better key splitting in RegionSplitter

    [ https://issues.apache.org/jira/browse/HBASE-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13126282#comment-13126282 ]

Jonathan Gray commented on HBASE-4489:
--------------------------------------

Historically, ASCII has proven a bad choice in key design. If the key is always fixed length, it's less of a big deal and really does come down to space savings vs. readability.

In many applications, row keys are composite keys made up of many different things. Often the key is preceded by a fixed-length random hash of some sort. I almost always want to build these composite keys from fixed-length binary ints/longs and such, rather than from fixed-length ASCII characters.

If we are talking about a straightforward key-value situation with a string-like key, then the usability of ASCII would make sense.

> Better key splitting in RegionSplitter
> --------------------------------------
>
>                 Key: HBASE-4489
>                 URL: https://issues.apache.org/jira/browse/HBASE-4489
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.4
>            Reporter: Dave Revell
>            Assignee: Dave Revell
>         Attachments: HBASE-4489-branch0.90-v1.patch, HBASE-4489-branch0.90-v2.patch, HBASE-4489-branch0.90-v3.patch, HBASE-4489-trunk-v1.patch, HBASE-4489-trunk-v2.patch, HBASE-4489-trunk-v3.patch
>
>
> The RegionSplitter utility allows users to create a pre-split table from the command line or do a rolling split on an existing table. It supports pluggable split algorithms that implement the SplitAlgorithm interface. The only (and default) SplitAlgorithm assumes that keys fall in the range from ASCII string "00000000" to ASCII string "7FFFFFFF". This is not a sane default, and seems useless to most users. Users are likely to be surprised by the fact that all the region splits occur in the byte range of ASCII characters.
> A better default split algorithm would be one that evenly divides the space of all bytes, which is what this patch does. Making a table with five regions would split at \x33\x33..., \x66\x66...., \x99\x99..., \xCC\xCC..., and \xFF\xFF.

--
This message is automatically generated by JIRA.
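
The even division of the byte space described in the issue can be sketched as follows. This is only an illustration and not the code from the attached patches: the class name UniformByteSplit, the helper names, and the fixed key length of 8 bytes are assumptions made for the example. It treats a key as an unsigned big-endian integer and places region boundaries at equal fractions of the all-0xFF maximum.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of HBase: computes evenly spaced split
// points over the space of all byte strings of a fixed length.
class UniformByteSplit {

    // Returns numRegions - 1 boundaries, each keyLength bytes long.
    static List<byte[]> splitPoints(int numRegions, int keyLength) {
        if (numRegions < 2 || keyLength < 1) {
            throw new IllegalArgumentException("need at least 2 regions and 1 byte");
        }
        // Largest key of this length: every byte is 0xFF.
        BigInteger max = BigInteger.ONE.shiftLeft(8 * keyLength).subtract(BigInteger.ONE);
        BigInteger n = BigInteger.valueOf(numRegions);
        List<byte[]> splits = new ArrayList<>();
        for (int i = 1; i < numRegions; i++) {
            // Boundary i sits at the fraction i / numRegions of the key space.
            BigInteger point = max.multiply(BigInteger.valueOf(i)).divide(n);
            splits.add(toFixedLength(point, keyLength));
        }
        return splits;
    }

    // BigInteger.toByteArray() may prepend a 0x00 sign byte or return fewer
    // bytes than needed, so copy the low-order bytes into a fixed-size array.
    static byte[] toFixedLength(BigInteger value, int length) {
        byte[] raw = value.toByteArray();
        byte[] out = new byte[length];
        int copy = Math.min(raw.length, length);
        System.arraycopy(raw, raw.length - copy, out, length - copy, copy);
        return out;
    }

    // Renders bytes in the \xNN notation used in the issue description.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            sb.append(String.format("\\x%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        for (byte[] split : splitPoints(5, 8)) {
            System.out.println(toHex(split));
        }
    }
}
```

With five regions this yields four boundaries made of repeated \x33, \x66, \x99, and \xCC bytes, matching the first four values listed in the issue; the \xFF\xFF value there is the upper end of the key space rather than a boundary between regions.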