Date: Tue, 20 Mar 2012 02:37:41 +0000 (UTC)
From: "jiraposter@reviews.apache.org (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <662195894.34607.1332211061100.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <1546527238.7893.1325731119241.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-5128) [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.

    [ https://issues.apache.org/jira/browse/HBASE-5128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13233141#comment-13233141 ]

jiraposter@reviews.apache.org commented on HBASE-5128:
------------------------------------------------------

bq.  On 2012-03-11 01:25:43, Ted Yu wrote:
bq.  > src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java, line 2652
bq.  >
bq.  > Can we deprecate this method in 0.94 and remove it in 0.96?

Completed in HBASE-5588.

- jmhsieh

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4280/#review5823
-----------------------------------------------------------

On 2012-03-10 01:04:58, jmhsieh wrote:
bq.
bq.  -----------------------------------------------------------
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/4280/
bq.  -----------------------------------------------------------
bq.
bq.  (Updated 2012-03-10 01:04:58)
bq.
bq.  Review request for hbase, Todd Lipcon, Ted Yu, and Lars Hofhansl.
bq.
bq.  Summary
bq.  -------
bq.
bq.  This version is similar to the 0.90.x version posted a few months back, but has a few new features and some minor differences.
bq.
bq.  1) No trackHTD method is needed, since we can read from the file system.
bq.  2) Added safeguards to prevent mega merges, and to isolate repairs to particular tables.
bq.  3) Fixed the comparator in HRegionInfo.
bq.  4) Fixed TestRegionObserverInterface so that it doesn't rely on a bug in the HRegionInfo comparator.
bq.
bq.  I'll backport to 0.94/0.92 (which should be very similar) and update the 0.90 versions after this patch has mostly cleared.
bq.
bq.  This version is not perfect (there are definitely cases not covered), but I think it is worth trying to get this in so that future reviews are more manageable.
bq.
bq.  This addresses bug HBASE-5128.
bq.      https://issues.apache.org/jira/browse/HBASE-5128
bq.
bq.  Diffs
bq.  -----
bq.
bq.    src/main/java/org/apache/hadoop/hbase/HRegionInfo.java 98f79fc
bq.    src/main/java/org/apache/hadoop/hbase/ipc/HMasterInterface.java 3bcf899
bq.    src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java ae468ca
bq.    src/main/java/org/apache/hadoop/hbase/master/HMaster.java e2bbbd0
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java 720841c
bq.    src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java 5916d9c
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/OfflineMetaRepair.java d57bb6b
bq.    src/main/java/org/apache/hadoop/hbase/util/hbck/TableIntegrityErrorHandler.java PRE-CREATION
bq.    src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java 38eb6a8
bq.    src/test/java/org/apache/hadoop/hbase/coprocessor/TestRegionObserverInterface.java 1b3b6df
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsck.java 937781d
bq.    src/test/java/org/apache/hadoop/hbase/util/TestHBaseFsckComparator.java 0599da1
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/HbckTestingUtil.java dbb97f8
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildBase.java 2b4cac8
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildHole.java ebbeead
bq.    src/test/java/org/apache/hadoop/hbase/util/hbck/TestOfflineMetaRebuildOverlap.java b175548
bq.
bq.  Diff: https://reviews.apache.org/r/4280/diff
bq.
bq.  Testing
bq.  -------
bq.
bq.  Unit tests cover many situations and pass. Most "live" testing has been done on 0.90.x versions. Many improvements and features were added from experience. Not much live testing has been done on the trunk versions.
bq.
bq.  Thanks,
bq.
bq.  jmhsieh
bq.

> [uber hbck] Enable hbck to automatically repair table integrity problems as well as region consistency problems while online.
> -----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5128
>                 URL: https://issues.apache.org/jira/browse/HBASE-5128
>             Project: HBase
>          Issue Type: New Feature
>          Components: hbck
>    Affects Versions: 0.90.5, 0.92.0, 0.94.0, 0.96.0
>            Reporter: Jonathan Hsieh
>            Assignee: Jonathan Hsieh
>         Attachments: hbase-5128-trunk.patch
>
>
> The current (0.90.5, 0.92.0rc2) versions of hbck detect most region consistency and table integrity invariant violations. However, with '-fix' it can only automatically repair region consistency cases having to do with deployment problems. This updated version should be able to handle all cases (including a new orphan regiondir case). When complete, it will likely deprecate the OfflineMetaRepair tool and subsume several open META-hole related issues.
> Here's the approach (from the comment at the top of the new version of the file).
> {code}
> /**
>  * HBaseFsck (hbck) is a tool for checking and repairing region consistency and
>  * table integrity.
>  *
>  * Region consistency checks verify that META, region deployment on
>  * region servers and the state of data in HDFS (.regioninfo files) all are in
>  * accordance.
>  *
>  * Table integrity checks verify that all possible row keys can resolve to
>  * exactly one region of a table. This means there are no individual degenerate
>  * or backwards regions; no holes between regions; and that there are no
>  * overlapping regions.
>  *
>  * The general repair strategy works in these steps.
>  * 1) Repair Table Integrity on HDFS. (merge or fabricate regions)
>  * 2) Repair Region Consistency with META and assignments
>  *
>  * For table integrity repairs, the tables' region directories are scanned
>  * for .regioninfo files. Each table's integrity is then verified. If there
>  * are any orphan regions (regions with no .regioninfo files), or holes, new
>  * regions are fabricated. Backwards regions are sidelined as well as empty
>  * degenerate (endkey==startkey) regions. If there are any overlapping regions,
>  * a new region is created and all data is merged into the new region.
>  *
>  * Table integrity repairs deal solely with HDFS and can be done offline -- the
>  * hbase region servers or master do not need to be running. This phase can be
>  * used to completely reconstruct the META table in an offline fashion.
>  *
>  * Region consistency requires three conditions -- 1) valid .regioninfo file
>  * present in an hdfs region dir, 2) valid row with .regioninfo data in META,
>  * and 3) a region is deployed only at the regionserver that it was assigned to.
>  *
>  * Region consistency requires hbck to contact the HBase master and region
>  * servers, so the connect() must first be called successfully. Much of the
>  * region consistency information is transient and less risky to repair.
>  */
> {code}
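
The table integrity invariants named in the HBaseFsck comment above (no degenerate or backwards regions, no holes, no overlaps) can be illustrated with a small standalone sketch. This is not code from the patch: the class `TableIntegritySketch`, its nested `Region` type, and the `check` method are hypothetical names chosen for illustration. It also assumes String row keys compared lexicographically, with an empty string standing for the unbounded first start key or last end key, whereas HBase itself works on byte arrays. It only detects problems; it does not attempt the merge, fabricate, or sideline repairs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the table integrity invariants; not HBaseFsck code. */
class TableIntegritySketch {

  /** A region covering the half-open row key range [startKey, endKey). */
  static final class Region {
    final String startKey; // "" means the start of the table
    final String endKey;   // "" means the end of the table

    Region(String startKey, String endKey) {
      this.startKey = startKey;
      this.endKey = endKey;
    }

    /** endkey == startkey, except for a region that spans the whole table. */
    boolean isDegenerate() {
      return !endKey.isEmpty() && startKey.equals(endKey);
    }

    /** The start key sorts after a bounded end key. */
    boolean isBackwards() {
      return !endKey.isEmpty() && startKey.compareTo(endKey) > 0;
    }

    @Override
    public String toString() {
      return "[" + startKey + ", " + endKey + ")";
    }
  }

  /** Returns one description per violation; an empty list means every key maps to one region. */
  static List<String> check(List<Region> regions) {
    List<String> problems = new ArrayList<String>();
    List<Region> valid = new ArrayList<Region>();

    // Pass 1: set aside regions that are individually malformed.
    for (Region r : regions) {
      if (r.isDegenerate()) {
        problems.add("DEGENERATE " + r);
      } else if (r.isBackwards()) {
        problems.add("BACKWARDS " + r);
      } else {
        valid.add(r);
      }
    }

    // Pass 2: sweep the remaining regions in start-key order.
    valid.sort((a, b) -> a.startKey.compareTo(b.startKey));
    String covered = "";         // keys below this point are already covered
    boolean coveredToEnd = false; // true once a region reaches the end of the table
    for (Region r : valid) {
      if (coveredToEnd || r.startKey.compareTo(covered) < 0) {
        problems.add("OVERLAP " + r);
      } else if (r.startKey.compareTo(covered) > 0) {
        problems.add("HOLE [" + covered + ", " + r.startKey + ")");
      }
      if (r.endKey.isEmpty()) {
        coveredToEnd = true;
      } else if (r.endKey.compareTo(covered) > 0) {
        covered = r.endKey;
      }
    }
    if (!coveredToEnd) {
      problems.add("HOLE [" + covered + ", )");
    }
    return problems;
  }

  public static void main(String[] args) {
    List<Region> table = Arrays.asList(new Region("", "b"), new Region("c", ""));
    for (String problem : check(table)) {
      System.out.println(problem);
    }
  }
}
```

Separating the per-region checks from the sweep mirrors the order in the comment: malformed regions are set aside first, and only the remaining regions are examined for holes and overlaps.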