From: "Jonathan Hsieh (Commented) (JIRA)"
To: issues@hbase.apache.org
Date: Thu, 5 Apr 2012 23:58:24 +0000 (UTC)
Subject: [jira] [Commented] (HBASE-5719) Enhance hbck to sideline overlapped mega regions

[
https://issues.apache.org/jira/browse/HBASE-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247865#comment-13247865 ]

Jonathan Hsieh commented on HBASE-5719:
---------------------------------------

More context:

We ran into a corrupted cluster that had encountered HBASE-4238 and had several generations of "grandparent" regions lingering in HDFS. If you looked at a region map, we had overlapping regions that looked like this:

[A-I], [A-E], [E-H], [A-C], [A-B], [B-C] ...

The HBASE-5128 version of hbck would see that all these regions fit inside of A-I and then attempt to merge them all into one mega region. This is technically correct, but it could result in merging all the regions in an overlap group into one region that is significantly larger than all others (worst case, all regions of a table could get combined into one region). HBASE-5128 includes some safeguards to prevent these "mega merges".

In order to fix these situations, we sidelined (close, offline, move to a different dir) the largest grandparent regions, those that overlapped with the most other regions. This leaves us with many small groups of overlapping regions instead of a single large set of overlapping regions. These smaller groups could be safely repaired automatically via merges, and any data from the sidelined grandparent regions could be restored via a bulk load later on.

So in the example above, the [A-I], [A-E], and [E-H] grandparent regions would get sidelined, leaving us with [A-C], [A-B], [B-C]. These smaller regions could get safely merged automatically into a single [A-C]' region. We'd then bulk load the [A-I], [A-E], and [E-H] regions back in afterwards to restore data.

The goal of this patch is to automatically identify and sideline overlapping grandparent regions.
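To make the example concrete, here is a minimal Python sketch of how the overlap groups change once the three grandparent regions are set aside. It is only an illustration, not hbck's code: the function names (overlap_groups, merged_range), the tuple representation of a region, and the assumption that end keys are exclusive are all hypothetical choices made for this sketch.

```python
from typing import List, Tuple

# Hypothetical representation: (start_key, end_key), end key exclusive.
Region = Tuple[str, str]


def overlap_groups(regions: List[Region]) -> List[List[Region]]:
    """Group regions whose key ranges chain together through overlaps."""
    groups: List[List[Region]] = []
    group_end = None  # largest end key seen in the current group
    for region in sorted(regions):
        start, end = region
        if groups and start < group_end:
            # Region starts before the current group ends: same group.
            groups[-1].append(region)
            group_end = max(group_end, end)
        else:
            # No overlap with the current group: start a new one.
            groups.append([region])
            group_end = end
    return groups


def merged_range(group: List[Region]) -> Region:
    """Key range that a merge of the whole group would cover."""
    return (min(s for s, _ in group), max(e for _, e in group))


regions = [("A", "I"), ("A", "E"), ("E", "H"), ("A", "C"), ("A", "B"), ("B", "C")]
sidelined = {("A", "I"), ("A", "E"), ("E", "H")}  # taken from the example above
remaining = [r for r in regions if r not in sidelined]

before = overlap_groups(regions)
after = overlap_groups(remaining)
print(len(before[0]), merged_range(before[0]))
print(len(after[0]), merged_range(after[0]))
```

Under these assumptions, all six regions start out in one group that a merge would stretch across A to I, while the three regions left after sidelining form one group spanning only A to C.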
> Enhance hbck to sideline overlapped mega regions
> ------------------------------------------------
>
> Key: HBASE-5719
> URL: https://issues.apache.org/jira/browse/HBASE-5719
> Project: HBase
> Issue Type: New Feature
> Components: hbck
> Affects Versions: 0.94.0, 0.96.0
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: hbase-5719.patch
>
>
> If there are too many regions in one overlapped group (by default, more than 10), hbck currently doesn't merge them since it takes time.
> In this case, we can sideline some regions in the group and break the overlapping to fix the inconsistency. Later on, sidelined regions can be bulk loaded manually.
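The threshold rule in the issue description can be sketched roughly as follows, in Python. Everything here is an assumption made for illustration: the names plan_group and max_merge, the integer keys, and especially the choice to sideline whichever region currently overlaps the most others (a reading of the comment's wording). The attached patch may select regions differently.

```python
from typing import List, Tuple

# Hypothetical representation: (start_key, end_key), end key exclusive.
Region = Tuple[int, int]


def overlaps(a: Region, b: Region) -> bool:
    """True when two half-open key ranges share at least one key."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_count(index: int, regions: List[Region]) -> int:
    """Number of other regions that regions[index] overlaps."""
    return sum(
        1
        for j, other in enumerate(regions)
        if j != index and overlaps(regions[index], other)
    )


def plan_group(
    group: List[Region], max_merge: int = 10
) -> Tuple[List[Region], List[Region]]:
    """Split one overlap group into (regions to merge, regions to sideline).

    Assumed rule: while the group holds more than max_merge regions,
    sideline the region that overlaps the most of the remaining ones.
    """
    remaining = list(group)
    sidelined: List[Region] = []
    while len(remaining) > max_merge:
        worst = max(range(len(remaining)), key=lambda i: overlap_count(i, remaining))
        sidelined.append(remaining.pop(worst))
    return remaining, sidelined


# Made-up group: one wide region lingering over eleven small ones.
group = [(0, 12)] + [(i, i + 2) for i in range(11)]
to_merge, to_sideline = plan_group(group)
print(len(to_merge), to_sideline)
```

A group at or below the threshold comes back unchanged with nothing sidelined, so the same call covers both the merge case and the sideline case.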