hbase-issues mailing list archives

From "Jonathan Hsieh (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5719) Enhance hbck to sideline overlapped mega regions
Date Thu, 05 Apr 2012 23:58:24 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5719?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13247865#comment-13247865 ]

Jonathan Hsieh commented on HBASE-5719:
---------------------------------------

More context:

We ran into a corrupted cluster that had encountered HBASE-4238 and had several generations
of "grandparent" regions lingering in HDFS.  If you looked at a region map, we had overlapping
regions that looked like this:

[A-I], [A-E], [E-H], [A-C], [A-B], [B-C] ... 

The HBASE-5128 version of hbck would see that all these regions fit inside of [A-I] and then
attempt to merge them all into one mega region.  This is technically correct but could result
in merging all the regions in an overlap group into one region that was significantly larger
than all the others (worst case, all regions of a table could get combined into one region).  HBASE-5128
includes some safeguards to prevent these "mega merges".  In order to fix these situations,
we sidelined (close, offline, move to a different dir) the largest grandparent regions, those that
overlapped with the most other regions.  This left us with many small groups of overlapping
regions instead of a single large set of overlapping regions.  These smaller groups could
be safely repaired automatically via merges, and any data from the sidelined grandparent regions
could be restored via a bulk load later on.
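
As a rough illustration of "overlapped with the most other regions", here is a minimal Python
sketch that counts overlaps per region and ranks the regions by that count.  It is not hbck
code: the names (overlap_counts, pick_sideline_candidates, max_to_sideline) are hypothetical,
key ranges are assumed to be half-open [start, end), and ranking purely by overlap count is
an assumption that may not match the selection the patch actually makes.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a region is (start_key, end_key), half-open [start, end).
Region = Tuple[str, str]


def overlaps(a: Region, b: Region) -> bool:
    """True if the two half-open key ranges share at least one key."""
    return a[0] < b[1] and b[0] < a[1]


def overlap_counts(regions: List[Region]) -> Dict[Region, int]:
    """For each region, count how many other regions it overlaps."""
    return {
        r: sum(1 for other in regions if other != r and overlaps(r, other))
        for r in regions
    }


def pick_sideline_candidates(regions: List[Region], max_to_sideline: int) -> List[Region]:
    """Return up to max_to_sideline regions, most-overlapping first.

    Ties keep their input order because Python's sort is stable.
    """
    counts = overlap_counts(regions)
    ranked = sorted(regions, key=lambda r: counts[r], reverse=True)
    return ranked[:max_to_sideline]


# The regions listed in the example above (the "..." tail is not modeled).
regions = [("A", "I"), ("A", "E"), ("E", "H"), ("A", "C"), ("A", "B"), ("B", "C")]
counts = overlap_counts(regions)
candidates = pick_sideline_candidates(regions, 2)
```

With only the six regions listed, [A-I] overlaps every other region and ranks first.  The
ranking of the remaining regions depends on regions hidden behind the "..." and on tie-breaking,
so treat the output as illustrative only.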

So in the example above, the [A-I], [A-E], and [E-H] grandparent regions would get sidelined,
leaving us with [A-C], [A-B], and [B-C].  These smaller regions could then be safely merged automatically
into a single [A-C]' region.  We'd then bulk load the [A-I], [A-E], and [E-H] regions back in
afterwards to restore data.
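
The effect of sidelining on the merge can be sketched in a few lines of Python.  This is only
a model of the idea, not hbck's implementation: regions are assumed to be half-open
[start, end) key ranges, and merge_overlap_groups is a hypothetical helper that collapses
each group of overlapping ranges into one covering range.

```python
from typing import List, Tuple

# Hypothetical model: a region is (start_key, end_key), half-open [start, end).
Region = Tuple[str, str]


def merge_overlap_groups(regions: List[Region]) -> List[Region]:
    """Collapse each run of overlapping ranges into one covering range."""
    merged: List[Region] = []
    for start, end in sorted(regions):
        if merged and start < merged[-1][1]:
            # Overlaps the current group: extend its end key if needed.
            prev_start, prev_end = merged[-1]
            merged[-1] = (prev_start, max(prev_end, end))
        else:
            # No overlap with the current group: start a new one.
            merged.append((start, end))
    return merged


all_regions = [("A", "I"), ("A", "E"), ("E", "H"), ("A", "C"), ("A", "B"), ("B", "C")]

# Without sidelining, everything collapses into one mega region.
mega = merge_overlap_groups(all_regions)

# Sideline the grandparent regions named in the example, then merge the rest.
sidelined = [("A", "I"), ("A", "E"), ("E", "H")]
remaining = [r for r in all_regions if r not in sidelined]
merged = merge_overlap_groups(remaining)
```

Here mega is the single range (A, I), while merged is the single range (A, C), which
corresponds to the [A-C]' region described above.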

The goal of this patch is to automatically identify and sideline overlapping grandparent regions.


                
> Enhance hbck to sideline overlapped mega regions
> ------------------------------------------------
>
>                 Key: HBASE-5719
>                 URL: https://issues.apache.org/jira/browse/HBASE-5719
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck
>    Affects Versions: 0.94.0, 0.96.0
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>             Fix For: 0.96.0
>
>         Attachments: hbase-5719.patch
>
>
> If there are too many regions in one overlapped group (by default, more than 10), hbck
> currently doesn't merge them since it takes time.
> In this case, we can sideline some regions in the group and break the overlapping to
> fix the inconsistency.  Later on, sidelined regions can be bulk loaded manually.

