hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8865) HBase shell split command acts incorrectly with hex split keys.
Date Thu, 11 Jul 2013 23:09:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13706409#comment-13706409 ]

Nick Dimiduk commented on HBASE-8865:
-------------------------------------

That's a little clunky but should expose the feature [~dinghaifeng] is looking for. I wonder
if the shell could use a heuristic: look for strings that "look like" binary strings and act
accordingly. I hesitate to put too much effort into making the shell "smart", though. I think
your listed workaround is actually the most correct solution, because neither you nor the
shell is confused about your intention to use a {{byte[]}}.

What if we patch the help message on shell's {{split}} command to explicitly advise the user
about this scenario?
                
> HBase shell split command acts incorrectly with hex split keys.
> ---------------------------------------------------------------
>
>                 Key: HBASE-8865
>                 URL: https://issues.apache.org/jira/browse/HBASE-8865
>             Project: HBase
>          Issue Type: Bug
>          Components: shell, Usability
>    Affects Versions: 0.94.8
>         Environment: Linux
>            Reporter: Ding Haifeng
>         Attachments: 8865.txt
>
>
> When I tried to do a manual region split from the HBase shell, I found that the split command
> acts incorrectly with hex split keys.
> Here is an example.
> I execute hbase(main):003:0> split 'tsdb', "\x00\x00\xC3" .
> While I expected it to split at the 3-byte key "\x00\x00\xC3", it actually split at the
> 5-byte key "\x00\x00\xEF\xBF\xBD".
> I tested with more split keys and found some patterns:
> * If all the bytes in the split key, represented in hexadecimal, are between "\x00" and
> "\x7F", it works as expected and splits at exactly the key specified.
> * If there are any bytes between "\x80" and "\xFF", it works incorrectly. No matter what the
> byte is, it is interpreted as "\xEF\xBF\xBD". Here is another example: specifying the split key
> "\x00\xA0\x00\xB0" actually splits at "\x00\xEF\xBF\xBD\x00\xEF\xBF\xBD".
> I'm running HBase 0.94.8, r1485407, both server-side and client-side.
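
The reported pattern matches what happens when raw bytes are decoded as UTF-8 with malformed input replaced by U+FFFD (whose UTF-8 encoding is EF BF BD) and then re-encoded. The thread does not confirm that this is the cause inside the shell, so the following Python sketch is only an illustration of that mechanism under this assumption. The function name utf8_round_trip is hypothetical and is not part of HBase.

```python
def utf8_round_trip(raw: bytes) -> bytes:
    """Decode raw bytes as UTF-8, replacing malformed input with U+FFFD,
    then encode the resulting text back to UTF-8.

    Assumption: this mimics a lossy bytes -> string -> bytes conversion;
    it is not taken from the HBase shell source.
    """
    return raw.decode("utf-8", errors="replace").encode("utf-8")


# Keys from the report: one pure-ASCII key and the two failing keys.
for key in (b"\x00\x00\x7f", b"\x00\x00\xc3", b"\x00\xa0\x00\xb0"):
    print(key, "->", utf8_round_trip(key))
```

Under this assumption, a key whose bytes are all at most \x7F survives unchanged, while the two keys from the report come back as the 5-byte and 6-byte keys the reporter observed.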

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
