hbase-issues mailing list archives

From "jiraposter@reviews.apache.org (Commented) (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5128) [uber hbck] Online automated repair of table integrity and region consistency problems
Date Sat, 24 Mar 2012 00:07:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237324#comment-13237324 ]

jiraposter@reviews.apache.org commented on HBASE-5128:

bq.  On 2012-03-23 19:53:18, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 191
bq.  > <https://reviews.apache.org/r/3435/diff/6/?file=95002#file95002line191>
bq.  >
bq.  >     Why a TreeMap if these are encoded region names?  These are hashes, so there is no value in sorting.

I think you are right.  The sorting is necessary in the range-managing data structure but
not here.  I'll file a follow-up for this and the following issue.
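
To illustrate the distinction, here is a minimal sketch. It is not the HBaseFsck code: the class and field names are hypothetical, and String keys stand in for the byte[] keys HBase actually uses. A hash-based map is enough when looking up by encoded region name, while a sorted map earns its keep only where key ranges are walked.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration; none of these names come from HBaseFsck.
final class RegionIndexSketch {
    // Keyed by encoded region name. The names are hashes, so their sort order
    // carries no meaning and an unordered HashMap is sufficient.
    final Map<String, String> byEncodedName = new HashMap<>();

    // Keyed by region start key. Order matters here because range lookups
    // need the nearest start key at or below a given row key.
    final TreeMap<String, String> byStartKey = new TreeMap<>();

    void add(String encodedName, String startKey) {
        byEncodedName.put(encodedName, startKey);
        byStartKey.put(startKey, encodedName);
    }

    /** Returns the encoded name of the region whose start key is the greatest one at or below rowKey, or null. */
    String regionFor(String rowKey) {
        Map.Entry<String, String> entry = byStartKey.floorEntry(rowKey);
        return entry == null ? null : entry.getValue();
    }
}
```

Only regionFor depends on ordering; the encoded-name map is used purely for point lookups.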

bq.  On 2012-03-23 19:53:18, Michael Stack wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java, line 344
bq.  > <https://reviews.apache.org/r/3435/diff/6/?file=95002#file95002line344>
bq.  >
bq.  >     This almost recommends that HBaseFsck become a shell that does nothing but
instantiate another class that does the actual fixup.  clearState in that case would throw away
the instantiated 'Fsck' class and create a completely new instance rather than zero out data
members as this does.  For the future.

I'll file a follow-on JIRA for that too.
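
For reference, the suggested shape might look roughly like the sketch below. All names are hypothetical and this is not the current HBaseFsck structure: the outer shell owns a worker object that holds the per-run state, and clearState swaps in a fresh worker instead of zeroing out data members one by one.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the HBaseFsck API.
final class FsckShellSketch {
    /** Holds all per-run state; a new instance starts clean by construction. */
    static final class FsckRun {
        final List<String> errors = new ArrayList<>();

        void check(String region, boolean healthy) {
            if (!healthy) {
                errors.add("inconsistent: " + region);
            }
        }
    }

    private FsckRun run = new FsckRun();

    FsckRun current() {
        return run;
    }

    /** Replaces the worker rather than resetting each field, so no stale member can be overlooked. */
    void clearState() {
        run = new FsckRun();
    }
}
```

The benefit is that adding a new data member to the worker never requires remembering to reset it.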

- jmhsieh

This is an automatically generated e-mail. To reply, visit:

On 2012-03-23 16:13:50, jmhsieh wrote:
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/3435/
bq.  -----------------------------------------------------------
bq.  (Updated 2012-03-23 16:13:50)
bq.  Review request for hbase, Todd Lipcon, Ted Yu, Michael Stack, and Jean-Daniel Cryans.
bq.  Summary
bq.  -------
bq.  This should be nearly ready for integration.  This has the same control flow as the
trunk/0.92/0.94 versions but has a few differences.  
bq.  - It needs to track HTableDescriptors instead of reading them from the file system.
bq.  - It uses a different HBaseFsckRepair.forceOfflineInZK method -- which for some reason
means we don't need HBASE-5563.
bq.  - Uses HServerAddress instead of ServerName
bq.  This version is close to what we've used on production clusters.
bq.  This addresses bug HBASE-5128.
bq.      https://issues.apache.org/jira/browse/HBASE-5128
bq.  Diffs
bq.  -----
bq.    src/main/java/org/apache/hadoop/hbase/io/hfile/HFile.java 1a4f7f1 
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 3c635d4 
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java d47ef10 
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java cd1755f 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java c0aaf65 
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 5916d9c 
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java d57bb6b 
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java PRE-CREATION
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandlerImpl.java
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java d9a2a02 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 937781d 
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java 0599da1 
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java dbb97f8 
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 2b4cac8
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java ebbeead
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java
bq.  Diff: https://reviews.apache.org/r/3435/diff
bq.  Testing
bq.  -------
bq.  All TestHBaseFsck unit tests pass.  Currently running full suite.
bq.  Thanks,
bq.  jmhsieh

> [uber hbck] Online automated repair of table integrity and region consistency problems
> --------------------------------------------------------------------------------------
>                 Key: HBASE-5128
>                 URL: https://issues.apache.org/jira/browse/HBASE-5128
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck
>    Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>             Fix For: 0.90.7, 0.92.2, 0.94.0, 0.96.0
>         Attachments: hbase-5128-0.90-v2.patch, hbase-5128-0.90-v2b.patch, hbase-5128-0.90-v4.patch,
hbase-5128-0.92-v2.patch, hbase-5128-0.92-v4.patch, hbase-5128-0.94-v2.patch, hbase-5128-0.94-v4.patch,
hbase-5128-trunk-v2.patch, hbase-5128-trunk.patch, hbase-5128-v3.patch, hbase-5128-v4.patch
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region consistency and
table integrity invariant violations.  However, with '-fix' it can only automatically repair
region consistency cases having to do with deployment problems.  This updated version should
be able to handle all cases (including a new orphan regiondir case).  When complete, it will likely
deprecate the OfflineMetaRepair tool and subsume several open META-hole related issues.
> Here's the approach (from the comment at the top of the new version of the file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
>  * table integrity.  
>  * 
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance. 
>  * 
>  * Table integrity checks verify that all possible row keys can resolve to
>  * exactly one region of a table.  This means there are no individual degenerate
>  * or backwards regions; no holes between regions; and that there are no overlapping
>  * regions. 
>  * 
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  * 
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files.  Each table's integrity is then verified.  If there 
>  * are any orphan regions (regions with no .regioninfo files), or holes, new 
>  * regions are fabricated.  Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions.  If there are any overlapping regions,
>  * a new region is created and all data is merged into the new region.  
>  * 
>  * Table integrity repairs deal solely with HDFS and can be done offline -- the
>  * hbase region servers or master do not need to be running.  This phase can be
>  * used to completely reconstruct the META table in an offline fashion. 
>  * 
>  * Region consistency requires three conditions -- 1) valid .regioninfo file 
>  * present in an hdfs region dir,  2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  * 
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully.  Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
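
The table integrity invariant described in the comment above (every row key resolves to exactly one region) can be sketched as a single pass over regions sorted by start key. This is only an illustration under stated assumptions, not the logic in HBaseFsck: the class, method, and message names are hypothetical, String keys stand in for byte[] keys, and an empty end key is taken to mean "to the end of the table".

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names and message formats are illustrative only.
final class TableIntegritySketch {
    /** A region covering [start, end); an empty end key means "to the end of the table". */
    static final class Region {
        final String start;
        final String end;

        Region(String start, String end) {
            this.start = start;
            this.end = end;
        }
    }

    static List<String> check(List<Region> regions) {
        List<String> problems = new ArrayList<>();
        List<Region> valid = new ArrayList<>();
        // First pass: set aside degenerate and backwards regions.
        for (Region r : regions) {
            if (r.end.isEmpty()) {
                valid.add(r); // open-ended last region
            } else if (r.start.equals(r.end)) {
                problems.add("degenerate:" + r.start);
            } else if (r.start.compareTo(r.end) > 0) {
                problems.add("backwards:" + r.start);
            } else {
                valid.add(r);
            }
        }
        // Second pass: walk the remaining regions in start-key order.
        valid.sort(Comparator.comparing((Region r) -> r.start));
        String covered = "";        // all keys below this are covered so far
        boolean reachedEnd = false; // true once a region runs to the end of the table
        for (Region r : valid) {
            if (reachedEnd || r.start.compareTo(covered) < 0) {
                problems.add("overlap:" + r.start);
            } else if (r.start.compareTo(covered) > 0) {
                problems.add("hole:" + covered + ".." + r.start);
            }
            if (r.end.isEmpty()) {
                reachedEnd = true;
            } else if (r.end.compareTo(covered) > 0) {
                covered = r.end;
            }
        }
        if (!reachedEnd) {
            problems.add("hole:" + covered + "..");
        }
        return problems;
    }
}
```

In terms of the repair strategy described in the comment, each reported hole would correspond to a fabricated region, each overlap to a merge, and each degenerate or backwards region to a sideline candidate.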


