Date: Sat, 16 Apr 2016 15:46:25 +0000 (UTC)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Updated] (HBASE-15287) org.apache.hadoop.hbase.mapreduce.RowCounter returns incorrect result with binary row key inputs

     [ https://issues.apache.org/jira/browse/HBASE-15287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-15287:
---------------------------
    Attachment: 15287-v2.patch

Reattaching for QA run.

> org.apache.hadoop.hbase.mapreduce.RowCounter returns incorrect result with binary row key inputs
> ------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-15287
>                 URL: https://issues.apache.org/jira/browse/HBASE-15287
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce, util
>    Affects Versions: 1.1.1
>            Reporter: Randy Hu
>            Assignee: Matt Warhaftig
>         Attachments: 15287-v2.patch, hbase-15287-v1.patch, hbase-15287-v2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.hbase.mapreduce.RowCounter takes optional start and end keys as inputs (-range option). This works only when the row key is identical to its printable string representation. When the row key is binary, its string representation looks like "\x00\x01", which the current implementation incorrectly interprets as an 8-character string:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
> To fix that, we need to change how the value is converted from the command-line inputs:
> Change
>   scan.setStartRow(Bytes.toBytes(startKey));
> to
>   scan.setStartRow(Bytes.toBytesBinary(startKey));
> Apply the same conversion to the end key as well.
> The issue was discovered when the utility was used to calculate the row distribution across regions of a table with binary row keys. The hbase:meta table contains the start key of each region in the format of the example above.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
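
The difference between the two conversions described in the issue can be illustrated without an HBase dependency. The sketch below is not the HBase implementation: `BinaryKeyDemo`, `plainBytes`, and `parseBinary` are hypothetical names, and `parseBinary` is only a simplified stand-in for escape-aware parsing of the `\xNN` form (it assumes every other character fits in one byte). It shows why the same command-line argument yields an 8-byte key under a plain conversion and a 2-byte key once the escapes are decoded.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

public class BinaryKeyDemo {

    // Plain conversion: every character of the argument becomes key bytes.
    // This mirrors the behaviour the issue attributes to the current code path.
    public static byte[] plainBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Hypothetical stand-in for escape-aware parsing: each "\xNN" sequence
    // (N = hex digit) becomes one byte; any other character is copied
    // through as a single byte (a simplification for this sketch).
    public static byte[] parseBinary(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 3 < s.length()
                    && s.charAt(i + 1) == 'x'
                    && Character.digit(s.charAt(i + 2), 16) >= 0
                    && Character.digit(s.charAt(i + 3), 16) >= 0;
            if (isEscape) {
                int hi = Character.digit(s.charAt(i + 2), 16);
                int lo = Character.digit(s.charAt(i + 3), 16);
                out.write((hi << 4) | lo);
                i += 4;
            } else {
                out.write((byte) c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The eight characters \x00\x01, as they would arrive from the shell.
        String startKey = "\\x00\\x01";
        System.out.println("plain length:  " + plainBytes(startKey).length);
        System.out.println("binary length: " + parseBinary(startKey).length);
    }
}
```

With the plain conversion the scan would start at a row key made of the literal characters of the argument, which does not match the binary start keys recorded for the regions; decoding the escapes first recovers the intended two bytes.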