accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <>
Subject [jira] [Commented] (ACCUMULO-1005) Authorizations and ColumnVisibility API should not accept Charset param
Date Wed, 10 Apr 2013 17:10:15 GMT


Christopher Tubbs commented on ACCUMULO-1005:

To clarify this issue: the ticket assumes the design goal that visibility labels should be
human-readable, and that the API should therefore accept only human-readable strings (in
whatever language).

Since that has been a design goal of visibility expressions and authorizations for a long time
now, we don't need to support methods that control how Accumulo internally stores these strings
as bytes. To reduce human error around charsets, encodings, and serialization, we should simply
be consistent in our internal serialization, rather than allowing one user to instruct Accumulo
to store a label one way while another user writes code that instructs Accumulo to read it a
different way.
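
As a rough illustration of the failure mode described above, the following standalone Java
sketch uses only the standard library and no Accumulo classes. The class name
`CharsetMismatchDemo` and the label "café" are made up for the example. It shows that the same
label serializes to different bytes under different charsets, so a byte-wise comparison fails,
and that decoding with the wrong charset does not recover the original string.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetMismatchDemo {
    // Hypothetical label containing one non-ASCII character ("café").
    static final String LABEL = "caf\u00e9";

    // Two writers serialize the same label with different charsets;
    // returns true only if the resulting bytes are identical.
    static boolean sameBytes() {
        byte[] writerA = LABEL.getBytes(StandardCharsets.UTF_8);
        byte[] writerB = LABEL.getBytes(StandardCharsets.ISO_8859_1);
        return Arrays.equals(writerA, writerB);
    }

    // One user stores the label as UTF-8, another reads it back as ISO-8859-1.
    static String writeUtf8ReadLatin1() {
        byte[] stored = LABEL.getBytes(StandardCharsets.UTF_8);
        return new String(stored, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        System.out.println("bytes equal: " + sameBytes());
        System.out.println("round trip ok: " + LABEL.equals(writeUtf8ReadLatin1()));
    }
}
```

Both checks come out false for this label: the UTF-8 form is five bytes and the ISO-8859-1 form
is four, which is the kind of silent mismatch a caller-supplied Charset makes possible.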

It may, however, make sense to allow convenience methods that accept bytes, ByteBuffer, Text,
or similar, but we should treat their input as bytes that have already been serialized in our
internal serialization (probably UTF-8), and those convenience methods (if they exist) should
document that the bytes they accept must be in this form. (Personally, though, I think those
should go away as well: all human-readable characters that can be represented in Java can be
represented in UTF-8, so I see no reason to do anything other than accept String or CharSequence,
deprecate the rest, and internally serialize/deserialize using UTF-8 bytes.)
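
A minimal sketch of what "accept only String or CharSequence and serialize internally as UTF-8"
could look like. This is not the Accumulo API and not the attached patch: `LabelCodec`,
`serialize`, and `deserialize` are hypothetical names used only for illustration. The point is
that the charset is a private constant, so no caller can pick a different one.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of Accumulo.
public final class LabelCodec {
    // The single internal encoding. Callers never see or choose it.
    private static final Charset INTERNAL = StandardCharsets.UTF_8;

    private LabelCodec() {}

    // Public entry point takes text only, never a Charset.
    public static byte[] serialize(CharSequence label) {
        return label.toString().getBytes(INTERNAL);
    }

    // Stored bytes are always decoded with the same internal encoding.
    public static String deserialize(byte[] stored) {
        return new String(stored, INTERNAL);
    }

    public static void main(String[] args) {
        // Made-up label mixing Latin and CJK characters.
        String label = "secr\u00e8te\u79d8\u5bc6";
        System.out.println(label.equals(deserialize(serialize(label))));
    }
}
```

Because both directions use the same fixed encoding, any well-formed Java string round-trips
unchanged, whatever language the label is written in.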

> Authorizations and ColumnVisibility API should not accept Charset param
> -----------------------------------------------------------------------
>                 Key: ACCUMULO-1005
>                 URL:
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Christopher Tubbs
>            Assignee: Tim Reardon
>             Fix For: 1.5.0
>         Attachments: ACCUMULO-1005.patch
> The Charset parameter was added to the public API for ACCUMULO-241. However, this intermingles
> the internal serialization/comparison implementation with the semantics of the public API.
> The Charset parameter effectively instructs Accumulo how to serialize the object. This
> can break the comparison with what is stored in the table and is an unnecessary breakage.
> In the public API, we should only accept Strings, and allow any valid Java String. Internally,
> the serialization of these should consistently be UTF-8.
