hadoop-common-issues mailing list archives

From "Vikas Vishwakarma (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HADOOP-14313) Replace/improve Hadoop's byte[] comparator
Date Mon, 24 Apr 2017 11:54:04 GMT

    [ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15981047#comment-15981047 ]

Vikas Vishwakarma commented on HADOOP-14313:
--------------------------------------------

[~busbey] [~stevel@apache.org] I was looking at the compare calls in Hadoop, and the offset
is used in many places, such as MapTask and SecondarySort. The only way to use Guava
directly in those cases would be to create array copies adjusted for offset and length, which
might be costly. I checked whether Guava has any API that supports offset and length for compare
but could not find one. Let me know your thoughts: should we copy the Guava logic into
Hadoop's compare, leave it as it is, or is there a better way to handle this that I am missing?
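
To make the trade-off concrete, here is a minimal sketch in plain Java. It is not the Hadoop or Guava code: the class and method names are hypothetical, and compareWhole only stands in for a whole-array comparator such as the one returned by Guava's UnsignedBytes.lexicographicalComparator(). It contrasts a range-based compare with the copy-then-compare workaround described above.

```java
import java.util.Arrays;

// Hypothetical illustration; names are not taken from Hadoop or Guava.
final class OffsetCompareSketch {

    // Lexicographic comparison of two byte ranges, treating bytes as unsigned.
    // Operates in place on (array, start, length), so no copies are needed.
    static int compareRange(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        int n = Math.min(l1, l2);
        for (int i = 0; i < n; i++) {
            int a = b1[s1 + i] & 0xff;
            int b = b2[s2 + i] & 0xff;
            if (a != b) {
                return a - b;
            }
        }
        // All shared bytes are equal: the shorter range sorts first.
        return l1 - l2;
    }

    // Stand-in for a comparator API that only accepts whole arrays.
    static int compareWhole(byte[] left, byte[] right) {
        return compareRange(left, 0, left.length, right, 0, right.length);
    }

    // Workaround when only a whole-array comparator is available:
    // allocate and fill two temporary arrays on every call.
    static int compareByCopying(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {
        byte[] c1 = Arrays.copyOfRange(b1, s1, s1 + l1);
        byte[] c2 = Arrays.copyOfRange(b2, s2, s2 + l2);
        return compareWhole(c1, c2);
    }

    public static void main(String[] args) {
        // Two records packed into one buffer, as happens in sort buffers.
        byte[] buf = {9, 1, 2, 3, 9, 1, 2, (byte) 0x80};
        System.out.println(compareRange(buf, 1, 3, buf, 5, 3));
        System.out.println(compareByCopying(buf, 1, 3, buf, 5, 3));
    }
}
```

Both methods return the same ordering, but the copying variant allocates two arrays per comparison, which is the cost in question for hot paths like sorting.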

> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>         Attachments: HADOOP-14313.master.001.patch
>
>
> Hi,
> Recently we were looking at the lexicographic byte array comparison in HBase. We ran a
> microbenchmark comparing the byte array comparators of Hadoop
> ( https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java#L161 )
> and HBase against the latest byte array comparator from Guava
> ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 )
> and observed that the Guava main branch version is much faster.
> Specifically, we see a very good improvement when byteArraySize % 8 != 0 and also for
> large byte arrays. I will update the benchmark results using JMH for Hadoop vs Guava. For
> the corresponding HBase JIRA, please refer to HBASE-17877.




