Date: Fri, 21 Apr 2017 04:23:04 +0000 (UTC)
From: "Vikas Vishwakarma (JIRA)"
To: common-issues@hadoop.apache.org
Subject: [jira] [Commented] (HADOOP-14313) Replace/improve Hadoop's byte[] comparator

[ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978048#comment-15978048 ]

Vikas Vishwakarma commented on HADOOP-14313:
--------------------------------------------

[~busbey] I was looking at the changes needed to use the Guava library directly, but I have one question. Guava's compare method accepts only two arguments, the left and right byte arrays, whereas Hadoop's compareTo takes the left and right byte arrays plus, for each one, the offset from which to start comparing and the length to compare. If we want to call Guava's compare directly, we will have to change the method call everywhere in the Hadoop code, and we will also need separate handling for the array offsets.

Guava
{code}
public int compare(byte[] left, byte[] right) { }
{code}

Hadoop
{code}
public int compareTo(byte[] buffer1, int offset1, int length1, byte[] buffer2, int offset2, int length2) { }
{code}

> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>         Attachments: HADOOP-14313.master.001.patch
>
>
> Hi,
> Recently we were looking at the lexicographic byte array comparison in HBase. We ran a microbenchmark of the byte array comparators of Hadoop ( https://github.com/hanborq/hadoop/blob/master/src/core/org/apache/hadoop/io/FastByteComparisons.java#L161 ) and HBase against the latest byte array comparator from Guava ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 ) and observed that the Guava main branch version is much faster.
> Specifically, we see a very good improvement when byteArraySize % 8 != 0, and also for large byte arrays. I will update the benchmark results using JMH for Hadoop vs Guava. For the corresponding jira on HBase, please refer to HBASE-17877.
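
To make the offset handling raised in the comment concrete, here is a minimal sketch of one possible bridge between the two signatures: copy the requested ranges and hand them to a two-argument comparator. The names `RangeCompareSketch` and `WHOLE_ARRAY` are hypothetical and exist only for this illustration. `WHOLE_ARRAY` is a plain-Java stand-in for a whole-array unsigned lexicographic comparator, so the example runs without Guava; it is an assumption, not the code from the attached patch.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical sketch: adapts a two-argument byte[] comparator to the
// six-argument (buffer, offset, length) shape used on the Hadoop side.
final class RangeCompareSketch {

    // Stand-in for a whole-array comparator (assumption for illustration).
    // Compares bytes as unsigned values; a shorter array that is a prefix
    // of the other sorts first.
    static final Comparator<byte[]> WHOLE_ARRAY = (left, right) -> {
        int min = Math.min(left.length, right.length);
        for (int i = 0; i < min; i++) {
            int diff = (left[i] & 0xff) - (right[i] & 0xff);
            if (diff != 0) {
                return diff;
            }
        }
        return left.length - right.length;
    };

    // Offset-aware entry point. Copying each range is the "separate
    // handling for array offsets"; it costs one allocation per argument.
    static int compareTo(byte[] buffer1, int offset1, int length1,
                         byte[] buffer2, int offset2, int length2) {
        byte[] left = Arrays.copyOfRange(buffer1, offset1, offset1 + length1);
        byte[] right = Arrays.copyOfRange(buffer2, offset2, offset2 + length2);
        return WHOLE_ARRAY.compare(left, right);
    }
}
```

With Guava on the classpath, `UnsignedBytes.lexicographicalComparator()` could presumably take the place of `WHOLE_ARRAY`. The two copies per call would likely offset some of the speedup seen in the microbenchmark, which is one reason an offset-aware port of the comparison loop may be preferable to a direct call.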