hbase-issues mailing list archives

From "Nick Dimiduk (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.
Date Thu, 05 Dec 2013 02:15:39 GMT

    [ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13839703#comment-13839703 ]

Nick Dimiduk commented on HBASE-10017:
--------------------------------------

Multiple splits are handled through retrying. Splits are made and the halves rewritten as
independent HFiles with each pass, so this should be okay.
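
As a rough illustration of the retry behaviour described above, here is a toy model, not the
HBase bulk load code. Keys are plain integers, a "file" is an inclusive key range, and the
class and method names (BulkLoadRetrySketch, load) are invented for this sketch. Each pass
either loads a file that fits in one region or splits it at the next region boundary and
queues both halves for the following pass.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.TreeSet;

// Toy model only: integer keys and in-memory ranges stand in for HFiles and regions.
class BulkLoadRetrySketch {

    /** Returns the pieces that end up loaded, each as {firstKey, lastKey} (inclusive). */
    static List<int[]> load(TreeSet<Integer> regionStartKeys, int firstKey, int lastKey) {
        List<int[]> loaded = new ArrayList<>();
        Deque<int[]> queue = new ArrayDeque<>();
        queue.add(new int[] {firstKey, lastKey});

        // One iteration of this loop corresponds to one retry pass.
        while (!queue.isEmpty()) {
            Deque<int[]> retry = new ArrayDeque<>();
            for (int[] file : queue) {
                // Start key of the next region, i.e. the exclusive end of the region
                // that contains the file's first key (null means the last region).
                Integer nextStart = regionStartKeys.higher(file[0]);
                if (nextStart == null || file[1] < nextStart) {
                    loaded.add(file); // fits entirely inside one region
                } else {
                    // Spans a boundary: split into two halves and retry both.
                    retry.add(new int[] {file[0], nextStart - 1});
                    retry.add(new int[] {nextStart, file[1]});
                }
            }
            queue = retry;
        }
        return loaded;
    }
}
```

In this model a file that spans several regions is split once per pass, so it takes several
passes before every piece fits inside a single region.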

[~rn] I'm very concerned about the bulkload data loss issue, but I cannot reproduce it using
our existing unit tests (TestHRegionServerBulkLoad). Are you able to demonstrate the loss
in a test? As [~enis] said, TotalOrderPartitioner (TOP) should be used for generating HFiles.
Bulkload itself isn't performed inside a MapReduce job, so I'm confused about how the
HRegionPartitioner comes into play in this scenario.

> HRegionPartitioner, rows directed to last partition are wrongly mapped.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-10017
>                 URL: https://issues.apache.org/jira/browse/HBASE-10017
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.94.6
>            Reporter: Roman Nikitchenko
>            Priority: Critical
>         Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch, patchSiteOutput.txt
>
>
> Inside the HRegionPartitioner class, the getPartition() method should map the first
> numPartitions regions to the appropriate partitions 1:1. But because of its condition, the
> last region is hashed instead, which can leave the last reducer without any data. This is
> considered a serious issue.
> I reproduced this only starting from 16 regions per table. The original defect was found
> in 0.94.6, but at least today's trunk and the 0.91 branch head have the same
> HRegionPartitioner code in this part, which means the same issue.
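
The mapping described in the issue can be sketched as follows. This is a simplified
illustration under stated assumptions, not the HBase source or the attached patch: the class
and method names are invented, regions are represented only by their index, and the hash
fallback for "more regions than reducers" is an assumed placeholder. The only point is the
boundary condition: comparing against numPartitions - 1 sends the last 1:1 region to the hash
fallback, while comparing against numPartitions keeps it on its own partition.

```java
// Illustrative sketch; names and the hash fallback are assumptions, not HBase code.
class RegionPartitionSketch {

    // Assumed fallback used when a region index has no partition of its own.
    static int hashFallback(int regionIndex, int numPartitions) {
        return (Integer.toString(regionIndex).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Off-by-one condition: region (numPartitions - 1) is hashed even though
    // partition (numPartitions - 1) exists and is reserved for it.
    static int offByOnePartition(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions - 1) {
            return hashFallback(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    // Intended behaviour: the first numPartitions regions map 1:1,
    // and only the regions beyond that are hashed.
    static int oneToOnePartition(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions) {
            return hashFallback(regionIndex, numPartitions);
        }
        return regionIndex;
    }
}
```

With 16 regions and 16 partitions, offByOnePartition(15, 16) goes through the fallback and
returns 4 under this assumed hash, so partition 15 receives no rows, whereas
oneToOnePartition(15, 16) returns 15.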



--
This message was sent by Atlassian JIRA
(v6.1#6144)
