From: "Nick Dimiduk (JIRA)"
To: issues@hbase.apache.org
Date: Thu, 11 Jul 2013 23:09:49 +0000 (UTC)
Subject: [jira] [Commented] (HBASE-8865) HBase shell split command acts incorrectly with hex split keys.

    [ https://issues.apache.org/jira/browse/HBASE-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706409#comment-13706409 ]

Nick Dimiduk commented on HBASE-8865:
-------------------------------------

That's a little clunky but should expose the feature [~dinghaifeng] is looking for. I wonder if the shell could use a heuristic: look for strings that "look like" binary strings and act accordingly. I hesitate to put too much effort into making the shell "smart" though. I think your listed workaround is actually the most correct solution, because neither you nor the shell is confused regarding your intention to use a {{byte[]}}.

What if we patch the help message on the shell's {{split}} command to explicitly advise the user about this scenario?

> HBase shell split command acts incorrectly with hex split keys.
> ---------------------------------------------------------------
>
>                 Key: HBASE-8865
>                 URL: https://issues.apache.org/jira/browse/HBASE-8865
>             Project: HBase
>          Issue Type: Bug
>          Components: shell, Usability
>    Affects Versions: 0.94.8
>         Environment: Linux
>            Reporter: Ding Haifeng
>         Attachments: 8865.txt
>
>
> When I tried to do a manual region split from the HBase shell, I found that the split command acts incorrectly with hex split keys.
> Here is an example.
> I execute hbase(main):003:0> split 'tsdb', "\x00\x00\xC3" .
> While I expect it to split at the 3-byte key "\x00\x00\xC3", it actually splits at the 5-byte key "\x00\x00\xEF\xBF\xBD".
> I tested more split keys and found some patterns:
> * If all the bytes in the split key, represented in hexadecimal, are between "\x00" and "\x7F", it works as expected and splits at exactly the key specified.
> * If there are any bytes between "\x80" and "\xFF", it works incorrectly. No matter what the byte is, it is interpreted as "\xEF\xBF\xBD". Here is another example. Specifying split key "\x00\xA0\x00\xB0" actually splits at "\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".
> I'm running HBase 0.94.8, r1485407, both server-side and client-side.
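
The pattern in the report matches what happens when raw bytes are treated as UTF-8 text. Bytes from "\x00" to "\x7F" are valid single-byte characters, while a lone byte from "\x80" to "\xFF" is invalid and is commonly replaced by U+FFFD, whose UTF-8 encoding is "\xEF\xBF\xBD". The sketch below reproduces that behavior in plain Ruby. It assumes this is the mechanism involved, which the thread does not confirm, and it does not call any HBase code. The helper names {{utf8_round_trip}} and {{hex}} are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helper (not part of HBase): interpret raw bytes as UTF-8 text,
# replace every invalid byte with U+FFFD, and return the resulting bytes.
def utf8_round_trip(raw)
  raw.dup.force_encoding(Encoding::UTF_8).scrub.bytes
end

# Hypothetical helper: render a byte array in the \xNN notation used in the report.
def hex(bytes)
  bytes.map { |b| format('\\x%02X', b) }.join
end

# The two split keys from the report, plus an all-ASCII key for comparison.
keys = [
  "\x00\x00\xC3".b,
  "\x00\xA0\x00\xB0".b,
  "\x00\x7F".b
]

keys.each do |key|
  puts "#{hex(key.bytes)} -> #{hex(utf8_round_trip(key))}"
end
```

Under this assumption the 3-byte key grows to 5 bytes and the 4-byte key to 8 bytes, matching the keys the reporter observed, while the all-ASCII key is unchanged. Passing the key as an explicit {{byte[]}}, as discussed in the comment, avoids the text interpretation altogether.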