hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-15287) mapreduce.RowCounter returns incorrect result with binary row key inputs
Date Sun, 17 Apr 2016 22:15:25 GMT

    [ https://issues.apache.org/jira/browse/HBASE-15287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15244942#comment-15244942 ]

Hudson commented on HBASE-15287:
--------------------------------

SUCCESS: Integrated in HBase-1.3 #654 (See [https://builds.apache.org/job/HBase-1.3/654/])
HBASE-15287 mapreduce.RowCounter returns incorrect result with binary (tedyu: rev 265a4d6958b23848fe661b9a794154b115fcd371)
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestImportExport.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/GroupingTableMapper.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestCopyTable.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/SimpleTotalOrderPartitioner.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormat.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestCellCounter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/CopyTable.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapred/GroupingTableMap.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/Export.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/CellCounter.java


> mapreduce.RowCounter returns incorrect result with binary row key inputs
> ------------------------------------------------------------------------
>
>                 Key: HBASE-15287
>                 URL: https://issues.apache.org/jira/browse/HBASE-15287
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce, util
>    Affects Versions: 1.1.1
>            Reporter: Randy Hu
>            Assignee: Matt Warhaftig
>             Fix For: 2.0.0, 1.3.0, 1.4.0
>
>         Attachments: 15287-v2.patch, hbase-15287-branch-1-v1.patch, hbase-15287-v1.patch, hbase-15287-v2.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> org.apache.hadoop.hbase.mapreduce.RowCounter takes optional start and end keys as input (the -range option). This works only when the bytes of the row key are identical to the literal string typed on the command line. When the row key is binary, its string representation looks like "\x00\x01", which the current implementation incorrectly interprets as an 8-character string instead of 2 bytes:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/RowCounter.java
> To fix that, we need to change how the value is converted from the command-line input:
> Change 
>       scan.setStartRow(Bytes.toBytes(startKey));
> to
>       scan.setStartRow(Bytes.toBytesBinary(startKey));
> Apply the same conversion to the end key as well.
> The issue was discovered when the utility was used to calculate the row distribution across the regions of a table with binary row keys. hbase:meta stores the start key of each region in the format shown in the example above.
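
To illustrate the difference described above, here is a minimal, dependency-free sketch. The class BinaryKeyDemo and its methods plainBytes and binaryBytes are hypothetical stand-ins written for this example; they only approximate what Bytes.toBytes(String) and Bytes.toBytesBinary(String) do and are not the HBase implementation.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class; not part of HBase.
class BinaryKeyDemo {

    // Stand-in for Bytes.toBytes(String): encodes the text exactly as typed.
    static byte[] plainBytes(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Stand-in for Bytes.toBytesBinary(String): turns each \xNN escape into
    // one byte and copies every other (assumed ASCII) character unchanged.
    static byte[] binaryBytes(String s) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            boolean isEscape = c == '\\'
                    && i + 3 < s.length()
                    && s.charAt(i + 1) == 'x'
                    && Character.digit(s.charAt(i + 2), 16) != -1
                    && Character.digit(s.charAt(i + 3), 16) != -1;
            if (isEscape) {
                // Decode the two hex digits into a single byte.
                out.write(Integer.parseInt(s.substring(i + 2, i + 4), 16));
                i += 4;
            } else {
                out.write((byte) c);
                i++;
            }
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // The key as it would be typed on the command line: \x00\x01
        String startKey = "\\x00\\x01";
        System.out.println(plainBytes(startKey).length);             // 8
        System.out.println(Arrays.toString(binaryBytes(startKey)));  // [0, 1]
    }
}
```

With the plain conversion the scan would start at an 8-byte key that matches no binary row, which is consistent with the incorrect counts reported in this issue.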



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
