hadoop-common-issues mailing list archives

From "Vikas Vishwakarma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14313) Replace/improve Hadoop's byte[] comparator
Date Tue, 18 Apr 2017 15:22:42 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972904#comment-15972904
] 

Vikas Vishwakarma commented on HADOOP-14313:
--------------------------------------------

I have updated the microbenchmark result table in the update above. Broadly, for array sizes
up to 200 bytes we see 10-20% higher throughput (ops/ms). For larger byte arrays it shows
almost 100% higher throughput, and the trend indicates that the performance gain increases
with larger byte array sizes.

I completely agree that it is up to the Hadoop community to choose the preferred implementation,
either through Guava or a local copy in the project. HBase has its own copy in commons (Bytes.java
& ByteBufferUtils.java), so that is independent of the Hadoop implementation.


> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>         Attachments: HADOOP-14313.master.001.patch
>
>
> Hi,
> Recently we were looking at the lexicographic byte array comparison in HBase. We
> microbenchmarked the byte array comparators of Hadoop
> ( https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java#L161 )
> and HBase against the latest byte array comparator from Guava
> ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 )
> and observed that the Guava main branch version is much faster.
> Specifically, we see a very good improvement when byteArraySize % 8 != 0 and also for
> large byte arrays. I will update the benchmark results using JMH for Hadoop vs Guava. For
> the corresponding JIRA on HBase, please refer to HBASE-17877.
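
For readers unfamiliar with the technique under discussion, the following is a minimal
sketch of a lexicographic unsigned byte[] comparison that reads 8 bytes per step and then
finishes the remaining tail (the byteArraySize % 8 != 0 case) one byte at a time. It is
not the code from either linked file: the class and method names (LexByteCompare,
compareBytewise, compareWordwise) are hypothetical, and it uses java.nio.ByteBuffer purely
for illustration, so it says nothing about the relative speed of the real implementations.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration class; not part of Hadoop, HBase, or Guava.
class LexByteCompare {

    // Baseline: compare one unsigned byte at a time.
    static int compareBytewise(byte[] a, byte[] b) {
        int min = Math.min(a.length, b.length);
        for (int i = 0; i < min; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }

    // Compare 8 bytes per step, then finish the tail byte by byte.
    static int compareWordwise(byte[] a, byte[] b) {
        int min = Math.min(a.length, b.length);
        int strideLimit = min & ~7; // largest multiple of 8 that is <= min

        // Big-endian reads put the first byte in the most significant position,
        // so unsigned long order matches lexicographic byte order.
        ByteBuffer ba = ByteBuffer.wrap(a).order(ByteOrder.BIG_ENDIAN);
        ByteBuffer bb = ByteBuffer.wrap(b).order(ByteOrder.BIG_ENDIAN);

        int i = 0;
        for (; i < strideLimit; i += 8) {
            long la = ba.getLong(i);
            long lb = bb.getLong(i);
            if (la != lb) {
                return Long.compareUnsigned(la, lb);
            }
        }
        // Tail: at most 7 bytes remain when min is not a multiple of 8.
        for (; i < min; i++) {
            int diff = (a[i] & 0xff) - (b[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return a.length - b.length;
    }

    public static void main(String[] args) {
        byte[] x = "row-0001-a".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        byte[] y = "row-0001-b".getBytes(java.nio.charset.StandardCharsets.US_ASCII);
        // Both methods should agree on the sign of the result.
        System.out.println(Integer.signum(compareBytewise(x, y)));
        System.out.println(Integer.signum(compareWordwise(x, y)));
    }
}
```

Only the sign of the return value is meaningful here; the two methods may return different
magnitudes for the same inputs.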



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: common-issues-help@hadoop.apache.org

