hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN
Date Wed, 30 Jan 2013 15:09:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566527#comment-13566527 ]

Ted Yu commented on HBASE-7715:

"pool-1-thread-1" prio=10 tid=0x09cd1400 nid=0x2ea7 in Object.wait() [0x772dc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0xd3914960> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at java.lang.Thread.join(Thread.java:1194)
	- locked <0xd3914960> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:249)
	at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:430)
	at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:501)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:893)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:863)
	at org.apache.hadoop.hbase.TestRegionRebalancing.after(TestRegionRebalancing.java:77)

I ran TestRegionRebalancing and it passed locally.

> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> ---------------------------------------------------------
>                 Key: HBASE-7715
>                 URL: https://issues.apache.org/jira/browse/HBASE-7715
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.4
>            Reporter: Andrew Wang
>            Assignee: Ted Yu
>             Fix For: 0.96.0
>         Attachments: 7715-trunk-v2.txt, 7715-trunk-v3.txt
> We encountered an issue where HMaster failed to start with an active NN not in safe mode
> and a standby NN in safe mode. The relevant lines in {{FSUtils.java}} show the issue:
> {noformat}
>     while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET))
> {noformat}
> This call skips the normal client failover from the standby to the active NN, so it will
> keep polling the standby NN indefinitely if the client happens to contact the standby first.
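
To make the failure mode in the description concrete, here is a minimal, self-contained sketch. It is a simulation only: the NameNode class, the safeModeUnchecked and safeModeChecked queries, and the pollsUntilClear helper are hypothetical names invented for illustration and are not Hadoop or HBase APIs. It assumes an HA pair where the standby is contacted first and is in safe mode while the active is not.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SafeModePollSketch {
    /** Hypothetical stand-in for one NameNode in an HA pair. */
    static final class NameNode {
        final boolean active;
        final boolean inSafeMode;

        NameNode(boolean active, boolean inSafeMode) {
            this.active = active;
            this.inSafeMode = inSafeMode;
        }
    }

    /** Unchecked query: reports the flag of whichever NN is contacted first. */
    static boolean safeModeUnchecked(List<NameNode> nns) {
        return nns.get(0).inSafeMode;
    }

    /** Checked query: skips standbys and reports the active NN's flag. */
    static boolean safeModeChecked(List<NameNode> nns) {
        for (NameNode nn : nns) {
            if (nn.active) {
                return nn.inSafeMode;
            }
        }
        throw new IllegalStateException("no active NameNode");
    }

    /**
     * Polls until the query reports "not in safe mode".
     * Returns the number of polls used, or -1 if maxPolls is exhausted.
     */
    static int pollsUntilClear(Predicate<List<NameNode>> query,
                               List<NameNode> nns, int maxPolls) {
        for (int i = 1; i <= maxPolls; i++) {
            if (!query.test(nns)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Standby (in safe mode) listed first, active (not in safe mode) second.
        List<NameNode> nns = Arrays.asList(
            new NameNode(false, true),
            new NameNode(true, false));

        // The unchecked loop never sees the active NN, so it exhausts its budget.
        System.out.println(pollsUntilClear(SafeModePollSketch::safeModeUnchecked, nns, 5));
        // The checked loop consults the active NN and exits on the first poll.
        System.out.println(pollsUntilClear(SafeModePollSketch::safeModeChecked, nns, 5));
    }
}
```

The poll budget (maxPolls) exists only so the simulation terminates; the loop quoted above has no such bound, which is why HMaster never finished starting.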

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
