accumulo-dev mailing list archives

From Ed Kohlwey <ekohl...@gmail.com>
Subject Re: Setting Charset in getBytes() call.
Date Sun, 28 Oct 2012 22:18:24 GMT
If you use a private static field in each class for the charset, it will
basically be a singleton, because charsets are cached in Charset.forName().
IMHO this is a somewhat cleaner approach than having lots of static imports
from utility classes with lots of constants in them.
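
A minimal sketch of the per-class field approach, for illustration only. The
class name CharsetFieldSketch and the encode() helper are hypothetical and not
taken from the Accumulo code base.

```java
import java.nio.charset.Charset;

// Hypothetical class, used only to illustrate the per-class field idea.
class CharsetFieldSketch {
    // Looked up once, when the class is initialized, and reused afterwards.
    private static final Charset UTF8 = Charset.forName("UTF-8");

    // Encodes with an explicit charset instead of the platform default.
    static byte[] encode(String s) {
        return s.getBytes(UTF8);
    }

    public static void main(String[] args) {
        // U+00E9 takes two bytes in UTF-8, whatever the platform default is.
        System.out.println(encode("\u00e9").length);
        // "UTF8" is an alias for the same charset, so the two compare equal.
        System.out.println(UTF8.equals(Charset.forName("UTF8")));
    }
}
```

Each class that needs the charset would declare its own field this way, so no
shared constants class or static import is required.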
On Oct 28, 2012 5:50 PM, "David Medinets" <david.medinets@gmail.com> wrote:

>
> https://issues.apache.org/jira/browse/ACCUMULO-241?focusedCommentId=13449680&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13449680
>
> In this comment, John mentioned that all getBytes() method calls
> should be changed to use UTF8. There are about 1,800 getBytes() calls
> and not all of them involve String objects. I am working on ways to
> identify a subset of these calls to change.
>
> I have created https://issues.apache.org/jira/browse/ACCUMULO-836 to
> track this issue.
>
> Should we create one static Charset object?
>
>   class AccumuloDefaultCharset {
>     public static final Charset UTF8 = Charset.forName("UTF8");
>   }
>
> Should we use a static constant?
>
>   public static final String UTF8 = "UTF8";
>
> I have found one instance of getBytes() in InputFormatBase:
>
>   protected static byte[] getPassword(Configuration conf) {
>     return Base64.decodeBase64(conf.get(PASSWORD, "").getBytes());
>   }
>
> Are there any reasons why I can't start specifying the charset? Is
> UTF8 the right Charset to use? I am not an expert in non-English
> charsets, so guidance would be welcome.
>
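
For the getPassword() case quoted above, a hedged sketch of what specifying the
charset could look like. To keep it self-contained, this assumes a plain Map in
place of the Hadoop Configuration and java.util.Base64 (Java 8+) in place of
the Base64 class used in InputFormatBase. The class name and the "password" key
are hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Base64;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the getPassword() method in InputFormatBase.
class PasswordDecodeSketch {
    private static final Charset UTF8 = Charset.forName("UTF-8");
    // Assumed configuration key; the real PASSWORD constant is not shown in the thread.
    private static final String PASSWORD = "password";

    // A Map replaces the Hadoop Configuration here (assumption for the sketch).
    static byte[] getPassword(Map<String, String> conf) {
        String encoded = conf.getOrDefault(PASSWORD, "");
        // Explicit charset: the result no longer depends on the platform default.
        return Base64.getDecoder().decode(encoded.getBytes(UTF8));
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(PASSWORD, Base64.getEncoder().encodeToString("secret".getBytes(UTF8)));
        System.out.println(new String(getPassword(conf), UTF8)); // prints "secret"
    }
}
```

Base64 text is plain ASCII, so any ASCII-compatible charset yields the same
bytes here. Naming the charset mainly removes the dependence on the platform
default.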
