Date: Tue, 26 Feb 2013 06:40:14 +0000 (UTC)
From: "rajeshbabu (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] [Commented] (HBASE-6469) Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restarted

    [ https://issues.apache.org/jira/browse/HBASE-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13586855#comment-13586855 ]

rajeshbabu commented on HBASE-6469:
-----------------------------------

@Lars,
bq. If a background can continue the transaction, why can't another (user initiated) attempt to either enable or disable the table?
Multiple calls of enable/disable may cause inconsistencies. Here is the comment from the EnableTableHandler/DisableTableHandler constructor.
{code}
// There could be multiple client requests trying to disable or enable
// the table at the same time. Ensure only the first request is honored
// After that, no other requests can be accepted until the table reaches
// DISABLED or ENABLED.
{code}
bq. The expectation here is that a user won't care what state the table is currently in. If a disable is triggered the result should be a disabled table (and similar for an enable).
I agree, but the user needs to know the current state of assignments/unassignments (as of now) before calling force enable/disable.

bq. Can we just make so that we allow transitioning from ENABLING to both ENABLED or DISABLED and from DISABLING to both ENABLED or DISABLED?
If we transition from ENABLING to DISABLED, then the regions which are in transition or yet to start assignment won't be assigned. Sometimes even assigned regions will be unassigned.
Below are some code snippets which are used to handle inconsistencies with the disable and balance race.
{code}
private void assign(RegionState state, final boolean setOfflineInZK,
    final boolean forceNewPlan) {
  ....
  if (isDisabledorDisablingRegionInRIT(region)) {
    return;
  }
  ...
}
{code}
After completion of the RIT transition we check once more whether the table is disabling or disabled.
{code}
boolean disabled = getZKTable().isDisablingOrDisabledTable(
    regionInfo.getTableNameAsString());
if (!serverManager.isServerOnline(serverName) && !disabled) {
  LOG.info("Opened region " + regionNameStr
      + "but the region server is offline, reassign the region");
  assign(regionInfo, true);
} else if (disabled) {
  // if server is offline, no hurt to unassign again
  LOG.info("Opened region " + regionNameStr
      + "but this table is disabled, triggering close of region");
  unassign(regionInfo);
}
{code}
During master restart we skip region assignment for disabling/disabled tables (this is the case where the master was restarted after the transition).
{code}
Set disabledOrDisablingOrEnabling = ZKTable.getDisabledOrDisablingTables(watcher);
disabledOrDisablingOrEnabling.addAll(ZKTable.getEnablingTables(watcher));
// Scan META for all user regions, skipping any disabled tables
Map allRegions = MetaReader.fullScan(
    catalogTracker, disabledOrDisablingOrEnabling, true);
{code}
Exactly reverse (assign instead of unassign) will happen if we transition from DISABLING to ENABLED.
Code snippet from ClosedRegionHandler.process():
{code}
if (this.assignmentManager.getZKTable().
    isDisablingOrDisabledTable(this.regionInfo.getTableNameAsString())) {
  assignmentManager.offlineDisabledRegion(regionInfo);
  return;
}
// ZK Node is in CLOSED state, assign it.
assignmentManager.getRegionStates().updateRegionState(
    regionInfo, RegionState.State.CLOSED, null);
// This below has to do w/ online enable/disable of a table
assignmentManager.removeClosedRegion(regionInfo);
assignmentManager.assign(regionInfo, true);
{code}

Please correct me if I am wrong.
Thanks Lars.

> Failure on enable/disable table will cause table state in zk to be left as enabling/disabling until master is restarted
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-6469
>                 URL: https://issues.apache.org/jira/browse/HBASE-6469
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.2, 0.96.0
>            Reporter: Enis Soztutar
>            Assignee: rajeshbabu
>             Fix For: 0.96.0, 0.94.7
>
>         Attachments: 6469-expose-force-r3.patch, HBASE-6469_2.patch, HBASE-6469_3.patch, HBASE-6469.patch
>
>
> In Enable/DisableTableHandler code, if something goes wrong in handling, the table state in zk is left as ENABLING / DISABLING. After that we cannot force any more action from the API or CLI, and the only recovery path is restarting the master.
> {code}
>     if (done) {
>       // Flip the table to enabled.
>       this.assignmentManager.getZKTable().setEnabledTable(
>         this.tableNameStr);
>       LOG.info("Table '" + this.tableNameStr
>       + "' was successfully enabled. Status: done=" + done);
>     } else {
>       LOG.warn("Table '" + this.tableNameStr
>       + "' wasn't successfully enabled. Status: done=" + done);
>     }
> {code}
> Here, if done is false, the table state is not changed. There is also no way to set skipTableStateCheck from cli / api.
> We have run into this issue a couple of times before.
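
For illustration only: the relaxed transition rule discussed in the comment (ENABLING or DISABLING may move to either ENABLED or DISABLED) could be modelled roughly as below. This is a minimal sketch, not HBase code. The names TableStateSketch, State, ALLOWED and canTransition are hypothetical, and the sketch deliberately ignores the region assignment/unassignment races described in the comment, which are the reason such a rule may be unsafe on its own.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the four table states named in this thread.
// None of these names exist in HBase; they are assumptions for illustration.
final class TableStateSketch {
    enum State { ENABLED, ENABLING, DISABLED, DISABLING }

    // Allowed transitions under the relaxed rule: a table left in a
    // transitional state may be moved to either stable state.
    private static final Map<State, Set<State>> ALLOWED = new EnumMap<>(State.class);
    static {
        ALLOWED.put(State.ENABLED, EnumSet.of(State.DISABLING));
        ALLOWED.put(State.DISABLED, EnumSet.of(State.ENABLING));
        ALLOWED.put(State.ENABLING, EnumSet.of(State.ENABLED, State.DISABLED));
        ALLOWED.put(State.DISABLING, EnumSet.of(State.DISABLED, State.ENABLED));
    }

    // Returns true if the sketch permits moving a table from one state to another.
    static boolean canTransition(State from, State to) {
        return ALLOWED.get(from).contains(to);
    }

    public static void main(String[] args) {
        // A table stuck in ENABLING after a failed handler run.
        System.out.println(canTransition(State.ENABLING, State.DISABLED));
        // Stable states still have to pass through a transitional state.
        System.out.println(canTransition(State.ENABLED, State.DISABLED));
    }
}
```

A guard like this only answers whether a state change is permitted. It says nothing about regions already in transition, so any real change would still need the checks shown in the assign() and ClosedRegionHandler snippets.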