Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Message-ID: <28426215.49961289019802473.JavaMail.jira@thor>
Date: Sat, 6 Nov 2010 01:03:22 -0400 (EDT)
From: "stack (JIRA)"
To: issues@hbase.apache.org
Subject: [jira] Commented: (HBASE-3112) Enable and disable of table needs a bit of loving in new master
In-Reply-To: <22358713.145081287076713120.JavaMail.jira@thor>

    [ https://issues.apache.org/jira/browse/HBASE-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928932#action_12928932 ]

stack commented on HBASE-3112:
------------------------------

Currently enable/disable runs synchronously and serially in that the
enable/disable handler unassigns each of the table's regions one after the other. At a minimum, this is slower than it has to be.

What if we changed it so it was async? Here is how it might work.

* Client invokes disable table on master
** Disable invocation returns immediately
** We add a new is_disabled/is_enabled call to the client so the client can check on the state of the enable/disable.
* Master queues a DisableTableHandler (DTH).
** DTH checks if the table is already disabled; if so, it returns
** Otherwise, it sets the disabling flag up in zk, then loops on
*** Getting all regions in the table from meta
*** Per region, checking whether it is in RIT or already offline, and if neither, queuing a close region executor
*** Waiting around till all table regions clear RIT
** On exit from the loop, it sets up in zk that the table is disabled.

Same for ETH.

If the master dies midway, the new master will start up a DTH or ETH for each table that has disabling or enabling set in zk.

DTH/ETH must be made idempotent because I think there are still situations in which they might fail. If they fail, the user might reschedule the disable/enable, which would start up a new DTH (if a DTH is already queued and running, they won't clash, since there is only one thread for this executor and the second will just early-out: when it checks the flag in zk, the table will already be disabled, so it'll have no work to do).

> Enable and disable of table needs a bit of loving in new master
> ---------------------------------------------------------------
>
>                 Key: HBASE-3112
>                 URL: https://issues.apache.org/jira/browse/HBASE-3112
>             Project: HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>            Priority: Critical
>             Fix For: 0.90.0
>
>
> The tools are in place to do a more reliable enable/disable of tables. Some work has been done to hack in a basic enable/disable but it's not enough -- see the avro/thrift tests, where a disable/enable/disable switchback can confuse the table state (and which have been disabled until this issue is addressed).
> This issue is about finishing off enable/disable in the new master. I think we need to add to the table znode an enabling/disabling state rather than have them binary, with a watcher that will stop an enable (or disable) starting until the previous completes. (Currently we atomically switch the state though the region close/open lags -- some work in enable/disable handlers helps in that they won't complete till all regions have transitioned, but it's not enough.)
> Need to add tests too.
> Marking issue critical bug because loads of the questions we get on lists are about enable/disable probs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
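
For illustration only, the async disable flow proposed in the comment above might be sketched roughly as below. This is an in-memory toy, not HBase code: the class AsyncDisableSketch, its maps standing in for zk and meta, the closers pool, and the awaitDisabled helper are all hypothetical names made up for this sketch. It only tries to show the three properties the comment asks for: the disable call returns immediately, the client polls an is_disabled style check, and the handler is idempotent so a rerun (after a failure or a master restart) only does the remaining work.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// In-memory sketch of the async disable flow. None of this is HBase API.
class AsyncDisableSketch {
    enum TableState { ENABLED, DISABLING, DISABLED }
    enum RegionState { ONLINE, IN_TRANSITION, OFFLINE }

    // Daemon threads so the sketch never keeps the JVM alive.
    static final ThreadFactory DAEMON = r -> {
        Thread t = new Thread(r);
        t.setDaemon(true);
        return t;
    };

    // Hypothetical stand-in for the per-table state kept in zk.
    final Map<String, TableState> tableStates = new ConcurrentHashMap<>();
    // Hypothetical stand-in for each table's regions as listed in meta.
    final Map<String, Map<String, RegionState>> meta = new ConcurrentHashMap<>();
    // One thread only, so two handlers never run at the same time.
    final ExecutorService handlers = Executors.newSingleThreadExecutor(DAEMON);
    // Pool that plays the part of the close region executors.
    final ExecutorService closers = Executors.newFixedThreadPool(4, DAEMON);
    // Counts how many region closes were queued, for inspection.
    final AtomicInteger closesQueued = new AtomicInteger();

    void addTable(String table, String... regions) {
        Map<String, RegionState> states = new ConcurrentHashMap<>();
        for (String region : regions) {
            states.put(region, RegionState.ONLINE);
        }
        meta.put(table, states);
        tableStates.put(table, TableState.ENABLED);
    }

    // Client-facing call: queues the handler and returns immediately.
    void disableTable(String table) {
        handlers.execute(() -> runDisableHandler(table));
    }

    // Client-facing poll, in the spirit of the proposed is_disabled call.
    boolean isDisabled(String table) {
        return tableStates.get(table) == TableState.DISABLED;
    }

    // Idempotent handler body: a rerun only closes regions still online.
    void runDisableHandler(String table) {
        if (tableStates.get(table) == TableState.DISABLED) {
            return; // early out, nothing to do
        }
        tableStates.put(table, TableState.DISABLING);
        Map<String, RegionState> regions = meta.get(table);
        while (true) {
            boolean pending = false;
            for (String region : regions.keySet()) {
                RegionState state = regions.get(region);
                if (state == RegionState.ONLINE) {
                    // Not in transition and not offline: queue a close.
                    regions.put(region, RegionState.IN_TRANSITION);
                    closesQueued.incrementAndGet();
                    closers.execute(() -> regions.put(region, RegionState.OFFLINE));
                }
                if (state != RegionState.OFFLINE) {
                    pending = true;
                }
            }
            if (!pending) {
                break; // every region is offline
            }
            try {
                Thread.sleep(10); // wait for regions to clear transition
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return; // stay DISABLING so a rerun can finish the job
            }
        }
        tableStates.put(table, TableState.DISABLED);
    }

    // Client-side polling loop with a timeout in milliseconds.
    boolean awaitDisabled(String table, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!isDisabled(table)) {
            if (System.currentTimeMillis() > deadline) {
                return false;
            }
            try {
                Thread.sleep(5);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

In this sketch a client would call disableTable("t1") and then poll with awaitDisabled("t1", 5000). Calling disableTable twice is harmless because the single handler thread runs the second handler only after the first has finished, at which point it sees the disabled state and returns. A table left in the disabling state (as after a master failover) is finished off by simply running the handler again.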