hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8519) Backup master will never come up if primary master dies during initialization
Date Wed, 15 May 2013 22:49:19 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658952#comment-13658952
] 

Ted Yu commented on HBASE-8519:
-------------------------------

I wonder if the following test failure in EC2 HBase-TRUNK #286 was related to this patch:

Tests in error:
  testMasterShutdownBeforeStartingAnyRegionServer(org.apache.hadoop.hbase.master.TestMasterShutdown):
test timed out after 180000 milliseconds

                
> Backup master will never come up if primary master dies during initialization
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-8519
>                 URL: https://issues.apache.org/jira/browse/HBASE-8519
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.7, 0.95.0
>            Reporter: Jerry He
>            Assignee: Jerry He
>            Priority: Minor
>             Fix For: 0.98.0, 0.95.1
>
>         Attachments: HBASE-8519-trunk.patch, HBASE-8519-trunk-v2.patch
>
>
> The problem happens if primary master dies after becoming master but before it completes
initialization and calls clusterStatusTracker.setClusterUp(),
> The backup master will try to become the master, but will shutdown itself promptly because
it sees 'the cluster is not up'.
> This is the backup master log:
> 2013-05-09 15:08:05,568 INFO org.apache.hadoop.hbase.master.metrics.MasterMetrics: Initialized
> 2013-05-09 15:08:05,573 DEBUG org.apache.hadoop.hbase.master.HMaster: HMaster started
in backup mode.  Stalling until master znode is written.
> 2013-05-09 15:08:05,589 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper:
Node /hbase/master already exists and this is not a retry
> 2013-05-09 15:08:05,590 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Adding
ZNode for /hbase/backup-masters/xxx.com,60000,1368137285373 in backup master directory
> 2013-05-09 15:08:05,595 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Another
master is the active master, xxx.com,60000,1368137283107; waiting to become the next active
master
> 2013-05-09 15:09:45,006 DEBUG org.apache.hadoop.hbase.master.ActiveMasterManager: No
master available. Notifying waiting threads
> 2013-05-09 15:09:45,006 INFO org.apache.hadoop.hbase.master.HMaster: Cluster went down
before this master became active
> 2013-05-09 15:09:45,006 DEBUG org.apache.hadoop.hbase.master.HMaster: Stopping service
threads
> 2013-05-09 15:09:45,006 INFO org.apache.hadoop.ipc.HBaseServer: Stopping server on 60000
>  
> In ActiveMasterManager::blockUntilBecomingActiveMaster()
> {code}
>   ..
>   if (!clusterStatusTracker.isClusterUp()) {
>           this.master.stop(
>             "Cluster went down before this master became active");
>         }
>   ..
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message