hadoop-hdfs-issues mailing list archives

From "Akira Ajisaka (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HDFS-11115) Remove bytes2Array and string2Bytes
Date Tue, 06 Dec 2016 04:23:59 GMT

    [ https://issues.apache.org/jira/browse/HDFS-11115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15724288#comment-15724288
] 

Akira Ajisaka commented on HDFS-11115:
--------------------------------------

I doubt that passing the charset name as the String "UTF-8" is an optimization. I ran a microbenchmark, and the result
is that {{new String(bytes, StandardCharsets.UTF_8)}} is faster than {{DFSUtilClient.bytes2String(bytes)}},
and {{str.getBytes(StandardCharsets.UTF_8)}} is almost as fast as {{DFSUtilClient.string2Bytes(str)}}.
* https://github.com/aajisaka/hadoop-tools/commit/62c5ea6f459084d5042fe83e9c465e14683f4d18
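
For anyone who wants to try a comparison like this locally, here is a rough timing sketch. It is not the benchmark linked above and it is not a JMH harness, so the numbers are only indicative. The class name {{Utf8ConversionTiming}}, the helper names, and the sample path are hypothetical; {{decodeByName}} only mirrors the charset-name style of the helpers quoted in the issue below and does not call {{DFSUtilClient}} itself.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Rough timing sketch (hypothetical names); prefer JMH for real measurements.
public class Utf8ConversionTiming {

  // Charset looked up by name, in the style of the helpers quoted in the issue.
  static String decodeByName(byte[] bytes) {
    try {
      return new String(bytes, "UTF-8");
    } catch (UnsupportedEncodingException e) {
      throw new IllegalArgumentException("UTF-8 is not supported", e);
    }
  }

  // Charset passed as a constant; no checked exception to handle.
  static String decodeByCharset(byte[] bytes) {
    return new String(bytes, StandardCharsets.UTF_8);
  }

  // Returns the elapsed nanoseconds for the given number of decodes.
  static long timeDecode(byte[] bytes, int iterations, boolean byName) {
    long sink = 0;
    long start = System.nanoTime();
    for (int i = 0; i < iterations; i++) {
      String s = byName ? decodeByName(bytes) : decodeByCharset(bytes);
      sink += s.length(); // use the result so the loop body is not dead code
    }
    long elapsed = System.nanoTime() - start;
    if (sink < 0) {
      System.out.println(sink); // never true in practice; keeps sink live
    }
    return elapsed;
  }

  public static void main(String[] args) {
    // Sample input is an assumption; real path components may differ in length.
    byte[] sample = "/user/hadoop/some/path/part-00000".getBytes(StandardCharsets.UTF_8);
    int iterations = 1_000_000;

    // Warm-up passes so both variants are JIT-compiled before timing.
    timeDecode(sample, iterations, true);
    timeDecode(sample, iterations, false);

    long byNameNanos = timeDecode(sample, iterations, true);
    long byCharsetNanos = timeDecode(sample, iterations, false);
    System.out.println("charset name:     " + byNameNanos / 1_000_000 + " ms");
    System.out.println("StandardCharsets: " + byCharsetNanos / 1_000_000 + " ms");
  }
}
```

Results from a loop like this vary with the JVM version and the input length, so treat them as a sanity check rather than a measurement.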

> Remove bytes2Array and string2Bytes
> -----------------------------------
>
>                 Key: HDFS-11115
>                 URL: https://issues.apache.org/jira/browse/HDFS-11115
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs, hdfs-client
>            Reporter: Sahil Kang
>            Priority: Minor
>
> In DFSUtilClient.java we have something like:
> {code: language=java}
> public static byte[] string2Bytes(String str) {
>   try {
>     return str.getBytes("UTF-8");
>   } catch (UnsupportedEncodingException e) {
>     throw new IllegalArgumentException("UTF8 decoding is not supported", e);
>   }
> }
> static String bytes2String(byte[] bytes, int offset, int length) {
>   try {
>     return new String(bytes, offset, length, "UTF-8");
>   } catch (UnsupportedEncodingException e) {
>     throw new IllegalArgumentException("UTF8 encoding is not supported", e);
>   }
> }
> {code}
> Using StandardCharsets, these methods become trivial:
> {code: language=java}
> public static byte[] string2Bytes(String str) {
>   return str.getBytes(StandardCharsets.UTF_8);
> }
> static String bytes2String(byte[] bytes, int offset, int length) {
>   return new String(bytes, offset, length, StandardCharsets.UTF_8);
> }
> {code}
> I think we should remove these methods and use StandardCharsets whenever we need to convert
> between bytes and strings.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


