Date: Fri, 2 Sep 2011 22:38:10 +0000 (UTC)
From: "Ted Yu (JIRA)"
To: issues@hbase.apache.org
Message-ID: <664988962.13095.1315003090076.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <2083326844.1318.1314554080054.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4265) zookeeper.KeeperException$NodeExistsException if HMaster restarts while table is being disabled

[ 
https://issues.apache.org/jira/browse/HBASE-4265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13096441#comment-13096441 ]

Ted Yu commented on HBASE-4265:
-------------------------------

+1 on patch.

> zookeeper.KeeperException$NodeExistsException if HMaster restarts while table is being disabled
> -----------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4265
>                 URL: https://issues.apache.org/jira/browse/HBASE-4265
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>             Fix For: 0.92.0
>
>
> There seems to be more than one issue in the following scenario. I will provide a fix later just for this exception.
> 1. A table is being disabled.
> 2. HMaster restarted.
> 3. At HMaster startup, it tries to transition the table from the disabling state to the disabled state. It gets the following exception.
> org.apache.zookeeper.KeeperException$NodeExistsException: KeeperErrorCode = NodeExists for /hbase/unassigned/419b902243c836c285108ba555b712fa
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:110)
>     at org.apache.zookeeper.KeeperException.create(KeeperException.java:42)
>     at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:637)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.createNonSequential(RecoverableZooKeeper.java:475)
>     at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.create(RecoverableZooKeeper.java:457)
>     at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndWatch(ZKUtil.java:742)
>     at org.apache.hadoop.hbase.zookeeper.ZKAssign.createNodeClosing(ZKAssign.java:461)
>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1440)
>     at org.apache.hadoop.hbase.master.AssignmentManager.unassign(AssignmentManager.java:1406)
>     at org.apache.hadoop.hbase.master.handler.DisableTableHandler$BulkDisabler$1.run(DisableTableHandler.java:141)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> The issue is that this specific region is in a special state before HMaster restarts: it has been closed properly by the RS, so the zk state is RS_ZK_REGION_CLOSED. However, HMaster hasn't had a chance to process ClosedRegionHandler yet, so the node remains in zk. After RS restarts, this region is first added to the online region list in AssignmentManager.rebuildUserRegions, and the master tries to unassign it later.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
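
For readers of the archive, the failure described in the quoted issue can be reproduced in miniature. The sketch below is not the HBASE-4265 patch and does not use the real ZooKeeper or HBase classes. FakeZk, NodeState, naiveUnassign and tolerantUnassign are hypothetical names made up for illustration, and the in-memory map only stands in for the /hbase/unassigned znodes. It shows why an unconditional create of the closing node fails when a leftover RS_ZK_REGION_CLOSED node is still present, and one possible way a caller could tolerate that state.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class DisableRecoverySketch {
    /** Local stand-ins for the region transition states; not the HBase enum. */
    enum NodeState { M_ZK_REGION_CLOSING, RS_ZK_REGION_CLOSED }

    /**
     * Hypothetical exception mirroring KeeperException.NodeExistsException.
     * Unchecked here for brevity; the real KeeperException is a checked exception.
     */
    static class NodeExistsException extends RuntimeException {
        NodeExistsException(String path) { super("NodeExists for " + path); }
    }

    /** In-memory stand-in for the unassigned znodes; not the ZooKeeper client. */
    static class FakeZk {
        private final Map<String, NodeState> nodes = new ConcurrentHashMap<>();

        /** Creates the node, failing if one already exists at that path. */
        void create(String path, NodeState state) {
            if (nodes.putIfAbsent(path, state) != null) {
                throw new NodeExistsException(path);
            }
        }

        NodeState get(String path) { return nodes.get(path); }

        void delete(String path) { nodes.remove(path); }
    }

    static String pathFor(String region) { return "/hbase/unassigned/" + region; }

    /** Always tries to create the CLOSING node, like the call path in the stack trace. */
    static void naiveUnassign(FakeZk zk, String region) {
        zk.create(pathFor(region), NodeState.M_ZK_REGION_CLOSING);
    }

    /**
     * Assumed recovery policy (an illustration only): if the node already exists
     * in the CLOSED state, treat the region as closed and clean the node up.
     */
    static String tolerantUnassign(FakeZk zk, String region) {
        String path = pathFor(region);
        try {
            zk.create(path, NodeState.M_ZK_REGION_CLOSING);
            return "CLOSING_CREATED";
        } catch (NodeExistsException e) {
            if (zk.get(path) == NodeState.RS_ZK_REGION_CLOSED) {
                zk.delete(path); // leftover from before the master restart
                return "ALREADY_CLOSED";
            }
            return "IN_TRANSITION";
        }
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        String region = "419b902243c836c285108ba555b712fa";
        // The RS closed the region, but the master restarted before handling it.
        zk.create(pathFor(region), NodeState.RS_ZK_REGION_CLOSED);
        try {
            naiveUnassign(zk, region);
        } catch (NodeExistsException e) {
            System.out.println("naive unassign failed: " + e.getMessage());
        }
        System.out.println("tolerant unassign: " + tolerantUnassign(zk, region));
    }
}
```

Whether the leftover node should be deleted, handed to the closed-region handler, or handled some other way is a design decision for the actual patch; the sketch only picks the simplest option to make the state conflict visible.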