Date: Tue, 3 Dec 2013 18:04:38 +0000 (UTC)
From: "Roman Nikitchenko (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10017) HRegionPartitioner, rows directed to last partition are wrongly mapped.

    [ https://issues.apache.org/jira/browse/HBASE-10017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13837964#comment-13837964 ]

Roman Nikitchenko commented on HBASE-10017:
-------------------------------------------

It looks like this partitioner defect can also cause data loss in incremental HFile loading when the MR job is configured through HFileOutputFormat. Such jobs use this partitioner to distribute records among HFiles, so records might end up in the wrong region. That is a lot of impact from an issue that is so easy to fix.

> HRegionPartitioner, rows directed to last partition are wrongly mapped.
> -----------------------------------------------------------------------
>
>                 Key: HBASE-10017
>                 URL: https://issues.apache.org/jira/browse/HBASE-10017
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.94.6
>            Reporter: Roman Nikitchenko
>            Priority: Critical
>         Attachments: HBASE-10017-r1544633.patch, HBASE-10017-r1544633.patch, patchSiteOutput.txt
>
>
> Inside the HRegionPartitioner class, the getPartition() method should map the first numPartitions regions to their partitions 1:1. But because of the condition it uses, the last region is hashed instead, which can leave the last reducer without any data. This is considered a serious issue.
> I reproduced this only starting from 16 regions per table. The original defect was found in 0.94.6, but at least today's trunk and the 0.91 branch head have the same HRegionPartitioner code in this part, which means they have the same issue.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
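
For illustration, the mapping problem described in the issue can be sketched as a small model. This is not the HBase source: the class PartitionSketch and its methods are hypothetical, and both the off-by-one form of the condition and the string-hash fallback are assumptions made only to show how hashing the last region can leave the last partition empty.

```java
// Hypothetical, simplified model of the mapping described in HBASE-10017.
// It is not the HBase implementation; names and conditions are assumptions.
final class PartitionSketch {

    // Assumed fallback for region indexes that have no partition of their own.
    static int hashed(int regionIndex, int numPartitions) {
        return (Integer.toString(regionIndex).hashCode() & Integer.MAX_VALUE) % numPartitions;
    }

    // Assumed buggy condition: the region that belongs to the last partition
    // is hashed as well, instead of being mapped 1:1.
    static int buggyPartition(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions - 1) {
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    // Assumed fix: hash only when there are more regions than partitions.
    static int fixedPartition(int regionIndex, int numPartitions) {
        if (regionIndex >= numPartitions) {
            return hashed(regionIndex, numPartitions);
        }
        return regionIndex;
    }

    public static void main(String[] args) {
        int numPartitions = 16;
        for (int region = 0; region < numPartitions; region++) {
            System.out.println("region " + region
                    + " buggy=" + buggyPartition(region, numPartitions)
                    + " fixed=" + fixedPartition(region, numPartitions));
        }
    }
}
```

Under these assumptions, with 16 regions and 16 partitions, region 15 is sent to partition 4 by the buggy variant, so partition 15 receives no rows, while the fixed variant keeps the 1:1 mapping for all 16 regions.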