hbase-issues mailing list archives

From "Stephen Yuan Jiang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-13935) Orphaned namespace table ZK node should not prevent master to start
Date Thu, 18 Jun 2015 18:53:01 GMT

    [ https://issues.apache.org/jira/browse/HBASE-13935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14592308#comment-14592308 ]

Stephen Yuan Jiang commented on HBASE-13935:

[~mbertozzi], The failed server was gone.  Before the patch, it would fail if the table is in either the ENABLING or the ENABLED state:
if (!assignmentManager.getTableStateManager().setTableStateIfNotInStates(tableName,
        ZooKeeperProtos.Table.State.ENABLED)) {
    throw new TableExistsException(tableName);
}

If there is an orphaned ENABLING znode, "this.assignmentManager.joinCluster();" runs before
HMaster#initNamespace() is called, and it calls "AssignmentManager#recoverTableInEnablingState()"
to remove the ENABLING znode.  That is why my unit test only sets the state to ENABLED; my guess
is that the orphaned znode seen in testing was probably in the ENABLED state.
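
To make the ordering argument concrete, here is a minimal sketch in plain Java. It is a toy in-memory model, not the HBase code: the class StartupOrderSketch, its map of "znodes", and the methods plantOrphan, prepareCreate and masterStartsWithOrphan are all hypothetical names made up for illustration. It only assumes what is described above: recovery during joinCluster() removes an ENABLING znode, and the pre-patch check rejects a table that is already ENABLING or ENABLED.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the startup sequence; none of this is the real HBase API.
class StartupOrderSketch {
    enum State { ENABLING, ENABLED }

    // Stand-in for the per-table state znodes kept in ZooKeeper.
    private final Map<String, State> znodes = new HashMap<>();

    // Simulates a znode left behind by an earlier, failed run.
    void plantOrphan(String table, State state) {
        znodes.put(table, state);
    }

    // Models the effect attributed to joinCluster() above:
    // any znode still in ENABLING is cleaned up during recovery.
    void joinCluster() {
        znodes.values().removeIf(s -> s == State.ENABLING);
    }

    // Models the pre-patch check: creation is refused when a znode
    // for the table already exists in ENABLING or ENABLED.
    boolean prepareCreate(String table) {
        if (znodes.containsKey(table)) {
            return false; // the real handler throws TableExistsException here
        }
        znodes.put(table, State.ENABLING);
        return true;
    }

    // Runs joinCluster() first, then the namespace table creation check.
    static boolean masterStartsWithOrphan(State orphan) {
        StartupOrderSketch zk = new StartupOrderSketch();
        zk.plantOrphan("hbase:namespace", orphan);
        zk.joinCluster();
        return zk.prepareCreate("hbase:namespace");
    }

    public static void main(String[] args) {
        System.out.println("orphaned ENABLING: " + masterStartsWithOrphan(State.ENABLING));
        System.out.println("orphaned ENABLED:  " + masterStartsWithOrphan(State.ENABLED));
    }
}
```

In this model only the ENABLED orphan survives until the creation check, which matches why the unit test plants an ENABLED znode.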

[~mbertozzi] I thought this would not be a problem with PV2; however, we hit this twice with
PV2 enabled in branch-1.1 testing a couple of weeks ago (HBASE-13815 - originally I thought
the rollback had some flaw, but after carefully examining the code I think the rollback is correct).
I applied the same skip logic locally and we have not seen this problem again in branch-1.1 testing.

> Orphaned namespace table ZK node should not prevent master to start
> -------------------------------------------------------------------
>                 Key: HBASE-13935
>                 URL: https://issues.apache.org/jira/browse/HBASE-13935
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 1.0.0, 0.98.13
>            Reporter: Stephen Yuan Jiang
>            Assignee: Stephen Yuan Jiang
>             Fix For: 0.98.14, 1.0.2
>         Attachments: HBASE-13935.v1-0.98.patch, HBASE-13935.v1-branch-1.0.patch
> Before the state-of-the-art Procedure V2 feature was available (HBase 1.0 release or older), we frequently saw the following issue (an orphaned ZK node) that prevents the master from starting (at least in testing):
> {noformat}
> 2015-06-16 17:54:36,472 FATAL [master:] master.HMaster: Unhandled exception. Starting shutdown.
> org.apache.hadoop.hbase.TableExistsException: hbase:namespace
> 	at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:137)
> 	at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:232)
> 	at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
> 	at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1123)
> 	at org.apache.hadoop.hbase.master.HMaster.finishInitialization(HMaster.java:947)
> 	at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:618)
> 	at java.lang.Thread.run(Thread.java:745)
> 2015-06-16 17:54:36,472 INFO  [master:] master.HMaster: Aborting
> {noformat}
> The above call trace is from a 0.98.x test run.  We saw a similar issue in a 1.0.x run, too.
> The proposed fix is to ignore the ZK node and force namespace table creation to complete so that the master can start successfully.
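
The proposed fix can be pictured with a small sketch in plain Java. This is not the attached patch: SkipOrphanCheckSketch, the skipOrphanCheck flag, and the use of IllegalStateException in place of TableExistsException are assumptions made for illustration, as is limiting the skip to the namespace table.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of "ignore the ZK node"; names and flag are hypothetical.
class SkipOrphanCheckSketch {
    enum State { ENABLING, ENABLED }

    static final String NAMESPACE_TABLE = "hbase:namespace";

    // Stand-in for the per-table state znodes kept in ZooKeeper.
    private final Map<String, State> znodes = new HashMap<>();

    // Simulates a znode left behind by an earlier, failed run.
    void plantOrphan(String table, State state) {
        znodes.put(table, state);
    }

    // With skipOrphanCheck set, a leftover znode for the namespace table
    // is ignored and overwritten; other tables are still checked.
    void prepareCreate(String table, boolean skipOrphanCheck) {
        boolean ignoreZnode = skipOrphanCheck && NAMESPACE_TABLE.equals(table);
        if (!ignoreZnode && znodes.containsKey(table)) {
            // Stand-in for TableExistsException in the stack trace above.
            throw new IllegalStateException("table exists: " + table);
        }
        znodes.put(table, State.ENABLING);
    }

    // Returns true when namespace table creation gets past the check
    // despite an orphaned ENABLED znode.
    static boolean masterStarts(boolean skipOrphanCheck) {
        SkipOrphanCheckSketch zk = new SkipOrphanCheckSketch();
        zk.plantOrphan(NAMESPACE_TABLE, State.ENABLED);
        try {
            zk.prepareCreate(NAMESPACE_TABLE, skipOrphanCheck);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("without skip: " + masterStarts(false));
        System.out.println("with skip:    " + masterStarts(true));
    }
}
```

Keeping the check in place for every other table preserves the protection against concurrent create requests, while the one table the master itself creates at startup is no longer blocked by stale state.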

This message was sent by Atlassian JIRA
