hadoop-common-issues mailing list archives

From "Vikas Vishwakarma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14313) Replace/improve Hadoop's byte[] comparator
Date Fri, 21 Apr 2017 04:23:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978048#comment-15978048 ]

Vikas Vishwakarma commented on HADOOP-14313:
--------------------------------------------

[~busbey] I was looking into changing the code to call the Guava library directly, but I
have one concern. Guava's compare method accepts only two arguments, the left and right byte
arrays, whereas Hadoop's compareTo takes, for each side, the byte array, the offset at which
to start comparing, and the length. If we want to call Guava's compare directly, we will have
to change the method call everywhere in the Hadoop code, and we will also need separate
handling for the array offsets.

Guava 
{code}
public int compare(byte[] left, byte[] right) { }
{code}

Hadoop 
{code}
public int compareTo(byte[] buffer1, int offset1, int length1,
                     byte[] buffer2, int offset2, int length2) { }
{code}
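
To make the offset handling concrete, here is a minimal sketch, not the code from the attached
patch and not Guava's implementation, of an unsigned lexicographic comparison over array
ranges with the same parameter shape as the Hadoop method. The class name RangeCompareSketch
and the two-argument overload are hypothetical and exist only for illustration; it uses a
plain byte-by-byte loop rather than the word-at-a-time technique of the faster comparators.

```java
public final class RangeCompareSketch {
    private RangeCompareSketch() {}

    /** Hypothetical ranged comparison with the same parameter shape as Hadoop's compareTo. */
    public static int compareTo(byte[] buffer1, int offset1, int length1,
                                byte[] buffer2, int offset2, int length2) {
        // Same array and same region: the ranges are trivially equal.
        if (buffer1 == buffer2 && offset1 == offset2 && length1 == length2) {
            return 0;
        }
        int end1 = offset1 + length1;
        int end2 = offset2 + length2;
        for (int i = offset1, j = offset2; i < end1 && j < end2; i++, j++) {
            // Mask to 0..255 so bytes are compared as unsigned values.
            int a = buffer1[i] & 0xff;
            int b = buffer2[j] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // One range is a prefix of the other: the shorter range sorts first.
        return length1 - length2;
    }

    /** Hypothetical two-argument form, mirroring the shape of Guava's compare(left, right). */
    public static int compare(byte[] left, byte[] right) {
        return compareTo(left, 0, left.length, right, 0, right.length);
    }

    public static void main(String[] args) {
        byte[] a = {9, 1, 2, 3};
        byte[] b = {1, 2, 4};
        // Compares a[1..3] with b[0..2], skipping the leading 9 in a.
        System.out.println(compareTo(a, 1, 3, b, 0, 3));
    }
}
```

The point of the sketch is that a whole-array compare can be expressed through the ranged
form with no copying, while going the other way (calling a two-argument compare from ranged
call sites) would require either copying each range or changing every caller.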

> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>         Attachments: HADOOP-14313.master.001.patch
>
>
> Hi,
> Recently we were looking at lexicographic byte array comparison in HBase. We ran a
> microbenchmark comparing the byte array comparators of Hadoop
> ( https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java#L161 )
> and HBase against the latest byte array comparator from Guava
> ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 )
> and observed that the version on Guava's main branch is much faster.
> Specifically, we see a very good improvement when byteArraySize % 8 != 0, and also for
> large byte arrays. I will update the benchmark results using JMH for Hadoop vs. Guava. For
> the corresponding HBase JIRA, please refer to HBASE-17877.




