hbase-issues mailing list archives

From "Josh Elser (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-18075) Support namespaces and tables with non-latin alphabetical characters
Date Sat, 20 May 2017 19:03:04 GMT

    [ https://issues.apache.org/jira/browse/HBASE-18075?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16018592#comment-16018592 ]

Josh Elser commented on HBASE-18075:
------------------------------------

bq.  I suppose I should look at patch...

Haha :)

bq. Is IsAlphabetic in regex same as

I would assume that isAlphabetic *should* preclude all of these, but I haven't dug into the
implementation. It's definitely not straightforward to interpret from the code itself (there's
some fun bitshifting going on). Most of these code points are just garbage unprintable characters,
but ZooKeeper does say that some (7F-9F) "don't display well or are confusing".

I'll whip up a quick test and remove the explicit checks if they turn out to be unnecessary.
That's the easiest way forward.
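
A minimal sketch of what such a quick test might look like, assuming the check is done against the {{\p\{IsAlphabetic\}}} regex property directly. The class and method names are hypothetical and are not taken from the attached patches; only the 7F-9F range mentioned above is probed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, not part of HBase: probes a code point range for
// characters that the \p{IsAlphabetic} regex property would accept.
class AlphabeticRangeCheck {
    private static final Pattern ALPHABETIC = Pattern.compile("\\p{IsAlphabetic}");

    /** Returns the code points in [start, end] that the regex treats as alphabetic. */
    static List<Integer> alphabeticIn(int start, int end) {
        List<Integer> hits = new ArrayList<>();
        for (int cp = start; cp <= end; cp++) {
            String s = new String(Character.toChars(cp));
            if (ALPHABETIC.matcher(s).matches()) {
                hits.add(cp);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // 7F-9F is the range ZooKeeper flags as displaying poorly.
        List<Integer> hits = alphabeticIn(0x7F, 0x9F);
        System.out.println("Alphabetic code points in 7F-9F: " + hits);
    }
}
```

If the returned list is empty for every range of concern, the explicit checks for those ranges would be redundant with the regex.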

> Support namespaces and tables with non-latin alphabetical characters
> --------------------------------------------------------------------
>
>                 Key: HBASE-18075
>                 URL: https://issues.apache.org/jira/browse/HBASE-18075
>             Project: HBase
>          Issue Type: Improvement
>          Components: Client
>            Reporter: Josh Elser
>            Assignee: Josh Elser
>             Fix For: 2.0.0
>
>         Attachments: HBASE-18075.001.patch, HBASE-18075.002.patch
>
>
> On the heels of HBASE-18067, it would be nice to support namespaces and tables with names
> that fall outside of Latin alphabetical characters and numbers.
> Our current regex for allowable characters is approximately {{\[a-zA-Z0-9\]+}}.
> It would be nice to replace {{a-zA-Z}} with Java's {{\p\{IsAlphabetic\}}}, which will
> naturally restrict the Unicode character space down to just those that are part of the alphabet
> for each script (e.g. Latin, Cyrillic, Greek).
> Technically, our possible scope of allowable characters is, best as I can tell, only
> limited by the limitations of ZooKeeper itself https://zookeeper.apache.org/doc/r3.4.10/zookeeperProgrammers.html#ch_zkDataModel
> (as both table and namespace are created as znodes).
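
For illustration, the difference between the two character classes described in the issue can be sketched as follows. This mirrors only the approximate regex quoted above, not the full HBase name validation; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical sketch comparing the approximate current character class
// with the proposed one; it does not reproduce HBase's real validation.
class NameCharsetSketch {
    // Approximate current rule quoted in the issue description.
    static final Pattern CURRENT = Pattern.compile("[a-zA-Z0-9]+");
    // Proposed rule: a-zA-Z replaced by Java's IsAlphabetic property.
    static final Pattern PROPOSED = Pattern.compile("[\\p{IsAlphabetic}0-9]+");

    static boolean allowedNow(String name) {
        return CURRENT.matcher(name).matches();
    }

    static boolean allowedProposed(String name) {
        return PROPOSED.matcher(name).matches();
    }

    public static void main(String[] args) {
        // Cyrillic letters written as Unicode escapes to avoid source-encoding issues.
        String cyrillic = "\u0442\u0430\u0431\u043b\u0438\u0446\u0430";
        System.out.println("current accepts Cyrillic name:  " + allowedNow(cyrillic));
        System.out.println("proposed accepts Cyrillic name: " + allowedProposed(cyrillic));
    }
}
```

Names made of ASCII letters and digits pass both patterns, so the proposed class is a superset of the current one for the characters considered here.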



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
