X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id DE55617543
    for ; Thu, 20 Nov 2014 22:20:34 +0000 (UTC)
Received: (qmail 42543 invoked by uid 500); 20 Nov 2014 22:20:34 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 42484 invoked by uid 500); 20 Nov 2014 22:20:34 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 42472 invoked by uid 99); 20 Nov 2014 22:20:34 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Thu, 20 Nov 2014 22:20:34 +0000
Date: Thu, 20 Nov 2014 22:20:34 +0000 (UTC)
From: "Andrew Purtell (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-10955) HBCK leaves the region in masters in-memory RegionStates if region hdfs dir is lost
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/HBASE-10955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14220121#comment-14220121 ]

Andrew Purtell commented on HBASE-10955:
----------------------------------------

Mostly whitespace changes and a new test with these key changes:

First:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsck.java
@@ -1862,7 +1863,7 @@ public class HBaseFsck extends Configured {
       // these problems from META.
       if (shouldFixAssignments()) {
         errors.print("Trying to fix unassigned region...");
-        closeRegion(hbi);// Close region will cause RS to abort.
+        undeployRegions(hbi);
       }
       if (shouldFixMeta()) {
         // wait for it to complete
{code}

undeployRegions(hbi) instead of closeRegion(hbi). I don't know AM subtleties enough to say for sure. Maybe [~jxiang] or [~virag] could say?

Second:
{code}
--- hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
+++ hbase-server/src/main/java/org/apache/hadoop/hbase/util/HBaseFsckRepair.java
@@ -40,7 +40,6 @@ import org.apache.hadoop.hbase.master.RegionState;
 import org.apache.hadoop.hbase.protobuf.ProtobufUtil;
 import org.apache.hadoop.hbase.protobuf.generated.AdminProtos.AdminService;
 import org.apache.hadoop.hbase.regionserver.HRegion;
-import org.apache.hadoop.hbase.regionserver.wal.HLog;
 import org.apache.zookeeper.KeeperException;
 
 /**
@@ -187,12 +186,10 @@ public class HBaseFsckRepair {
       HRegionInfo hri, HTableDescriptor htd) throws IOException {
     // Create HRegion
     Path root = FSUtils.getRootDir(conf);
-    HRegion region = HRegion.createHRegion(hri, root, conf, htd);
-    HLog hlog = region.getLog();
+    HRegion region = HRegion.createHRegion(hri, root, conf, htd, null);
 
     // Close the new region to flush to disk. Close log file too.
     region.close();
-    hlog.closeAndDelete();
     return region;
   }
 }
{code}

This second hunk seems fine but I'd say no without a better understanding of the first change.

> HBCK leaves the region in masters in-memory RegionStates if region hdfs dir is lost
> -----------------------------------------------------------------------------------
>
>                 Key: HBASE-10955
>                 URL: https://issues.apache.org/jira/browse/HBASE-10955
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 0.99.0
>
>         Attachments: hbase-10955.v1.patch
>
>
> One of our tests removes the hdfs directory for the region, and invokes HBCK to fix the issue. This test fails flakily because the region is removed from meta and unassigned, but the region is not offlined from the masters in-memory. This affects further LB runs and disable table, etc.
> In case of {{inMeta && !inHdfs && isDeployed}}, we should not just close the region from RS, but call master.unassign().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
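
For readers unfamiliar with the issue, here is a toy model of the inconsistency the description reports. It is only a sketch: ToyMaster, ToyRegionServer, and every method on them are hypothetical names made up for illustration, not HBase classes or APIs. It assumes only what the description says, namely that closing a region directly on the region server leaves the master's in-memory view stale, while going through the master updates both sides.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class RegionStateSketch {

    /** Hypothetical region server: tracks only the regions it currently serves. */
    static class ToyRegionServer {
        final Set<String> online = new HashSet<>();

        void open(String region) { online.add(region); }

        void close(String region) { online.remove(region); }
    }

    /** Hypothetical master: keeps its own in-memory map of region to server. */
    static class ToyMaster {
        final Map<String, ToyRegionServer> regionStates = new HashMap<>();

        void assign(String region, ToyRegionServer rs) {
            rs.open(region);
            regionStates.put(region, rs);
        }

        /** Unassign through the master: update the in-memory state and close on the server. */
        void unassign(String region) {
            ToyRegionServer rs = regionStates.remove(region);
            if (rs != null) {
                rs.close(region);
            }
        }

        boolean thinksOnline(String region) {
            return regionStates.containsKey(region);
        }
    }

    /** Closing directly on the server: the master is never told, so its view goes stale. */
    static boolean staleAfterDirectClose() {
        ToyMaster master = new ToyMaster();
        ToyRegionServer rs = new ToyRegionServer();
        master.assign("r1", rs);
        rs.close("r1");
        return master.thinksOnline("r1") && !rs.online.contains("r1");
    }

    /** Unassigning via the master: both views agree the region is offline. */
    static boolean staleAfterMasterUnassign() {
        ToyMaster master = new ToyMaster();
        ToyRegionServer rs = new ToyRegionServer();
        master.assign("r1", rs);
        master.unassign("r1");
        return master.thinksOnline("r1") || rs.online.contains("r1");
    }

    public static void main(String[] args) {
        System.out.println("stale after direct close: " + staleAfterDirectClose());
        System.out.println("stale after master unassign: " + staleAfterMasterUnassign());
    }
}
```

In this toy model a direct close leaves the master believing the region is still online, which is the kind of stale entry the description says affects later LB runs and disable table. Whether undeployRegions(hbi) in the patch achieves the equivalent of the master-side path is exactly the open question raised in the comment above.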