Mailing-List: contact common-issues-help@hadoop.apache.org; run by ezmlm
Precedence: bulk
Date: Fri, 21 Apr 2017 04:23:04 +0000 (UTC)
From: "Vikas Vishwakarma (JIRA)" <jira@apache.org>
To: common-issues@hadoop.apache.org
Message-ID: <JIRA.13064314.1492395688000.14703.1492748584409@Atlassian.JIRA>
In-Reply-To: <JIRA.13064314.1492395688000@Atlassian.JIRA>
References: <JIRA.13064314.1492395688000@Atlassian.JIRA> <JIRA.13064314.1492395688757@jira-lw-us.apache.org>
Subject: [jira] [Commented] (HADOOP-14313) Replace/improve Hadoop's byte[]
 comparator
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
archived-at: Fri, 21 Apr 2017 04:23:09 -0000


    [ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978048#comment-15978048 ] 

Vikas Vishwakarma commented on HADOOP-14313:
--------------------------------------------

[~busbey] I was looking at changing the code to call the Guava library directly, but I have one question. Guava's compare method accepts only two arguments, the left and right byte arrays, whereas Hadoop's compareTo takes each byte array together with the offset at which to start comparing and the number of bytes to compare. To call Guava's compare directly, we would have to change the method call everywhere in the Hadoop code, and we would also have to handle the array offsets separately.

Guava 
{code}
public int compare(byte[] left, byte[] right) { }
{code}

Hadoop 
{code}
public int compareTo(byte[] buffer1, int offset1, int length1,
                     byte[] buffer2, int offset2, int length2) { }
{code}
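
To illustrate the mismatch, here is a minimal sketch of what the separate offset handling could look like if the six-argument shape were kept and the work delegated to a two-argument comparator. The class name OffsetCompareSketch, the TWO_ARG field, and the extra delegate parameter are hypothetical and are not part of Hadoop or Guava. TWO_ARG is a plain-Java stand-in so the sketch runs without dependencies; Guava's UnsignedBytes.lexicographicalComparator() returns a Comparator<byte[]> that could be passed as the delegate instead.

```java
import java.util.Arrays;
import java.util.Comparator;

/** Hypothetical adapter; names are illustrative, not part of Hadoop or Guava. */
final class OffsetCompareSketch {

  /** Plain-Java stand-in for a two-argument unsigned lexicographic comparator. */
  static final Comparator<byte[]> TWO_ARG = (left, right) -> {
    int min = Math.min(left.length, right.length);
    for (int i = 0; i < min; i++) {
      int a = left[i] & 0xff;   // treat bytes as unsigned values 0..255
      int b = right[i] & 0xff;
      if (a != b) {
        return a - b;
      }
    }
    // All shared positions are equal, so the shorter array sorts first.
    return left.length - right.length;
  };

  /**
   * Keeps the six-argument shape and delegates to a two-argument comparator.
   * Assumption: each range is copied into a fresh array, which costs an
   * allocation per call.
   */
  static int compareTo(byte[] buffer1, int offset1, int length1,
                       byte[] buffer2, int offset2, int length2,
                       Comparator<byte[]> delegate) {
    byte[] left = Arrays.copyOfRange(buffer1, offset1, offset1 + length1);
    byte[] right = Arrays.copyOfRange(buffer2, offset2, offset2 + length2);
    return delegate.compare(left, right);
  }
}
```

Copying each range keeps the existing call sites unchanged, but the extra allocation and copy per comparison would likely cancel much of the speedup we are after, so this is probably not a workable way to call Guava directly.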

> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>         Attachments: HADOOP-14313.master.001.patch
>
>
> Hi,
> Recently we were looking at lexicographic byte array comparison in HBase. We ran a microbenchmark comparing the byte array comparators of Hadoop ( https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java#L161 ) and HBase against the latest byte array comparator from Guava ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 ) and observed that the Guava main-branch version is much faster.
> Specifically, we see a large improvement when byteArraySize % 8 != 0 and also for large byte arrays. I will update the benchmark results using JMH for Hadoop vs Guava. For the corresponding HBase jira, please refer to HBASE-17877.

