accumulo-dev mailing list archives

From "Keith Turner (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-241) Visibility labels should blacklist non-ASCII characters instead of whitelisting select ASCII characters
Date Tue, 04 Sep 2012 19:15:07 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-241?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447960#comment-13447960
] 

Keith Turner commented on ACCUMULO-241:
---------------------------------------

FYI

I read up on UTF-8 [1] to see whether it would work with the quoting changes I made.  It looks
like UTF-8 within quotes in a visibility expression will work just fine, so in theory Accumulo
visibility labels should now support non-ASCII character sets.  I was worried that a multi-byte
character might contain a quote byte, but this cannot happen with UTF-8.  The MSB [2] is
always set to 1 in every byte of a multi-byte UTF-8 encoded char, while the ASCII quote char
(0x22) has its MSB set to 0.  Therefore a multi-byte character will never contain a quote byte.
When a quote byte occurs in UTF-8, it can only be the ASCII quote char.

[1]: http://en.wikipedia.org/wiki/UTF-8
[2]: http://en.wikipedia.org/wiki/Most_significant_bit
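
The MSB argument above can be checked by brute force.  The following is only a minimal sketch
and not part of the attached patches: the class name Utf8QuoteCheck and its methods are
hypothetical, and it relies solely on the JDK's own UTF-8 encoder.  It encodes every non-ASCII
code point (skipping surrogates, which have no UTF-8 encoding of their own) and confirms that
each resulting byte has its MSB set, so none of them can equal the quote byte 0x22.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; not Accumulo code.
class Utf8QuoteCheck {
    static final byte QUOTE = 0x22; // ASCII '"'

    // True if every byte in the UTF-8 encoding of codePoint has its MSB set.
    static boolean allBytesHaveMsbSet(int codePoint) {
        String s = new String(Character.toChars(codePoint));
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if ((b & 0x80) == 0) {
                return false;
            }
        }
        return true;
    }

    // Counts the bytes equal to the ASCII quote in the UTF-8 encoding of s.
    static int countQuoteBytes(String s) {
        int count = 0;
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (b == QUOTE) {
                count++;
            }
        }
        return count;
    }

    // Checks every non-ASCII code point, skipping the surrogate range.
    static boolean verifyAllNonAscii() {
        for (int cp = 0x80; cp <= Character.MAX_CODE_POINT; cp++) {
            if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
                continue;
            }
            if (!allBytesHaveMsbSet(cp)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(verifyAllNonAscii());
        // Only the two literal quotes are counted; the bytes of the
        // multi-byte character in between never match 0x22.
        System.out.println(countQuoteBytes("\"caf\u00e9\""));
    }
}
```

A byte-oriented scanner that looks for 0x22 therefore finds only real quote characters, which
is why the quoting logic does not need to decode UTF-8 before searching for the closing quote.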

                
> Visibility labels should blacklist non-ASCII characters instead of whitelisting select ASCII characters
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ACCUMULO-241
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-241
>             Project: Accumulo
>          Issue Type: Improvement
>    Affects Versions: 1.3.5
>            Reporter: John Vines
>              Labels: visibility
>             Fix For: 1.3.6
>
>         Attachments: ACCUMULO-241-quoting-2.txt, ACCUMULO-241-quoting.txt
>
>
> We currently whitelist our visibility labels to only allow alphanumerics and a few select
> delimiting characters. While we strive for human-readable labels, we should instead utilize
> a blacklist approach where we disallow parentheses, ampersands, pipes, and any non-ASCII characters.
> This will provide users with more flexibility in labeling, while still sticking to human readability.

