accumulo-notifications mailing list archives

From "Christopher Tubbs (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ACCUMULO-1738) ArrayByteSequence string conversion use platform default encoding
Date Wed, 25 Sep 2013 21:12:03 GMT

    [ https://issues.apache.org/jira/browse/ACCUMULO-1738?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13778087#comment-13778087 ]

Christopher Tubbs commented on ACCUMULO-1738:
---------------------------------------------

Sure, toString() could work like Arrays.toString()... as long as we don't rely on it for deserialization
anywhere (which we shouldn't). Another option is a Base64 representation (with the same restrictions).
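
To illustrate the two display-only renderings mentioned above, here is a minimal sketch. The class and method names (ByteRendering, asList, asBase64) are hypothetical and are not part of Accumulo; java.util.Base64 is assumed to be available (Java 8 or later).

```java
import java.util.Arrays;
import java.util.Base64;

// Hypothetical helper, not Accumulo code: two charset-free ways to display bytes.
class ByteRendering {
    // Renders each byte as a signed decimal value, e.g. "[104, 105, -1]".
    static String asList(byte[] bytes) {
        return Arrays.toString(bytes);
    }

    // Renders the bytes as Base64 text, which is more compact than the list form.
    static String asBase64(byte[] bytes) {
        return Base64.getEncoder().encodeToString(bytes);
    }

    public static void main(String[] args) {
        byte[] data = {104, 105, (byte) 0xFF};
        System.out.println(asList(data));   // [104, 105, -1]
        System.out.println(asBase64(data)); // aGn/
    }
}
```

Neither form depends on a character encoding, so the output is the same on every platform. Both are meant for display only, in line with the restriction above about not relying on toString() for deserialization.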

However, the transformation is inevitable. The convenience constructor does the transformation
when the String is converted to bytes. These bytes need to be *some* encoding of the string
characters. I would argue that it should just use UTF-8 internally, unless we want to expose
a Charset argument... which adds API complexity for a fringe use case.
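
As a rough sketch of why the encoding has to be pinned down, the following compares explicit UTF-8 conversion with another charset. It uses the JDK's StandardCharsets.UTF_8 as a stand-in for the project's own constant, and the class name ExplicitCharset is hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration, not Accumulo code.
class ExplicitCharset {
    // Always encodes with UTF-8, regardless of the platform default.
    static byte[] encode(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Always decodes with UTF-8, so encode/decode round-trips on any platform.
    static String decode(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String s = "\u00e9"; // LATIN SMALL LETTER E WITH ACUTE
        // The same character yields different bytes under different charsets.
        System.out.println(Arrays.toString(encode(s)));                                // [-61, -87]
        System.out.println(Arrays.toString(s.getBytes(StandardCharsets.ISO_8859_1))); // [-23]
        // s.getBytes() with no argument would use whatever this prints.
        System.out.println(Charset.defaultCharset());
    }
}
```

With the no-argument getBytes(), the resulting bytes would depend on the JVM's default charset, which is the inconsistency the issue describes.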

Alternatively, it might be better to deprecate the "convenience" constructor... as Strings
are outside the scope of this class, and consumers of the class can decide for themselves
where their bytes come from. It looks like this constructor is only used in tests in our code
right now.
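
A deprecation along these lines might look like the sketch below. ByteSeq is a hypothetical stand-in, not the real ArrayByteSequence, and the use of UTF-8 inside the deprecated constructor is an assumption taken from the suggestion above.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for a byte-sequence class, not the Accumulo implementation.
class ByteSeq {
    private final byte[] data;

    // Preferred constructor: the caller decides where the bytes come from.
    ByteSeq(byte[] data) {
        this.data = data.clone();
    }

    /**
     * @deprecated encode the String yourself and use {@link #ByteSeq(byte[])}.
     */
    @Deprecated
    ByteSeq(String s) {
        // Assumption: a fixed UTF-8 encoding until the constructor is removed.
        this(s.getBytes(StandardCharsets.UTF_8));
    }

    int length() {
        return data.length;
    }
}
```

Callers would then write something like new ByteSeq(text.getBytes(StandardCharsets.UTF_8)), which makes the encoding choice explicit at the call site.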

                
> ArrayByteSequence string conversion use platform default encoding
> -----------------------------------------------------------------
>
>                 Key: ACCUMULO-1738
>                 URL: https://issues.apache.org/jira/browse/ACCUMULO-1738
>             Project: Accumulo
>          Issue Type: Bug
>            Reporter: Bill Havanki
>            Priority: Trivial
>              Labels: newbie
>
> The class {{org.apache.accumulo.core.data.ArrayByteSequence}} has a constructor that
> accepts a {{String}}, as well as a {{toString()}} method. Both use the platform default encoding
> to convert from characters to bytes and back, which can cause problems. They should explicitly
> use {{Constants.UTF8}} for conversion to ensure consistency across platforms.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
