Date: Sat, 16 Apr 2016 19:46:25 +0000 (UTC)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-15287) mapreduce.RowCounter returns incorrect result with binary row key inputs

     [ https://issues.apache.org/jira/browse/HBASE-15287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15287:
---------------------------
    Summary: mapreduce.RowCounter returns incorrect result with binary row key inputs  (was: org.apache.hadoop.hbase.mapreduce.RowCounter returns incorrect result with binary row key inputs)

> mapreduce.RowCounter returns incorrect result with binary row key inputs
> ------------------------------------------------------------------------
>
>                 Key: HBASE-15287
>                 URL: https://issues.apache.org/jira/browse/HBASE-15287
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce, util
>   Affects Versions: 1.1.1
>            Reporter: Randy Hu
>            Assignee: Matt Warhaftig
>         Attachments: 15287-v2.patch, hbase-15287-v1.patch, hbase-15287-v2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.hbase.mapreduce.RowCounter takes an optional start/end key as input (-range option). This works only when the string representation of the key is identical to the key itself. When the row key is binary, its string representation looks like this: "\x00\x01", which the current implementation incorrectly interprets as an 8-character string:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
> To fix that, we need to change how the value is converted from the command-line input:
> Change
>   scan.setStartRow(Bytes.toBytes(startKey));
> to
>   scan.setStartRow(Bytes.toBytesBinary(startKey));
> Do the same conversion for the end key as well.
> The issue was discovered when the utility was used to calculate the row distribution across regions for a table with binary row keys. hbase:meta contains the start key of each region in the format shown in the example above.
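
The difference between the two conversions in the description can be illustrated with a small standalone sketch. It does not use HBase: `BinaryKeyDemo`, `parsePlain`, and `parseBinary` are hypothetical names, and `parseBinary` is only a simplified stand-in for the escape handling that `Bytes.toBytesBinary` is described as providing, assuming ASCII input and `\xNN` escapes with two hex digits.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BinaryKeyDemo {

    // Stand-in for the plain conversion: every character becomes a byte,
    // so the escape sequence "\x00" stays as four literal characters.
    static byte[] parsePlain(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Simplified stand-in for the binary-aware conversion (an assumption,
    // not HBase's code): each "\xNN" escape is decoded to a single byte.
    static byte[] parseBinary(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 3 < s.length()
                    && s.charAt(i + 1) == 'x'
                    && Character.digit(s.charAt(i + 2), 16) >= 0
                    && Character.digit(s.charAt(i + 3), 16) >= 0;
            if (isEscape) {
                // Decode the two hex digits following "\x" into one byte.
                out.write(Integer.parseInt(s.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                // Any other character is copied as-is (ASCII assumed).
                out.write((byte) c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The Java literal below is the 8-character string \x00\x01,
        // as it would arrive from the command line.
        String startKey = "\\x00\\x01";
        System.out.println(parsePlain(startKey).length);   // 8
        System.out.println(parseBinary(startKey).length);  // 2
    }
}
```

With the plain conversion the scan would start at an 8-byte key that no binary row matches as intended, while the binary-aware conversion yields the 2-byte key 0x00 0x01.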