hbase-issues mailing list archives

From "Ted Yu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN
Date Wed, 30 Jan 2013 15:09:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566527#comment-13566527 ]

Ted Yu commented on HBASE-7715:
-------------------------------

{code}
"pool-1-thread-1" prio=10 tid=0x09cd1400 nid=0x2ea7 in Object.wait() [0x772dc000]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
	at java.lang.Object.wait(Native Method)
	- waiting on <0xd3914960> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at java.lang.Thread.join(Thread.java:1194)
	- locked <0xd3914960> (a org.apache.hadoop.hbase.util.JVMClusterUtil$RegionServerThread)
	at org.apache.hadoop.hbase.util.JVMClusterUtil.shutdown(JVMClusterUtil.java:249)
	at org.apache.hadoop.hbase.LocalHBaseCluster.shutdown(LocalHBaseCluster.java:430)
	at org.apache.hadoop.hbase.MiniHBaseCluster.shutdown(MiniHBaseCluster.java:501)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniHBaseCluster(HBaseTestingUtility.java:893)
	at org.apache.hadoop.hbase.HBaseTestingUtility.shutdownMiniCluster(HBaseTestingUtility.java:863)
	at org.apache.hadoop.hbase.TestRegionRebalancing.after(TestRegionRebalancing.java:77)
{code}
I ran TestRegionRebalancing and it passed locally.
                
> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> ---------------------------------------------------------
>
>                 Key: HBASE-7715
>                 URL: https://issues.apache.org/jira/browse/HBASE-7715
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.4
>            Reporter: Andrew Wang
>            Assignee: Ted Yu
>             Fix For: 0.96.0
>
>         Attachments: 7715-trunk-v2.txt, 7715-trunk-v3.txt
>
>
> We encountered an issue where HMaster failed to start when the active NN was not in safe mode but a standby NN was in safe mode. The relevant line in {{FSUtils.java}} shows the issue:
> {noformat}
>     while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
> {noformat}
> This call skips the normal client failover from the standby to the active NN, so if the client happens to contact the standby NN first, it loops polling the standby instead of the active NN.
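
The looping behaviour described above can be illustrated with a toy model. This is only a sketch under stated assumptions: it does not call any Hadoop or HBase API, and ToyNameNode, safeModeUnchecked, safeModeChecked and pollsUntilLeft are hypothetical names invented for the illustration. The "unchecked" query answers from whichever NN is contacted first, while the "checked" query models failover by consulting only the active NN.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class SafeModePollSketch {

    /** Hypothetical stand-in for a NameNode; not a Hadoop type. */
    static final class ToyNameNode {
        final boolean active;
        final boolean inSafeMode;

        ToyNameNode(boolean active, boolean inSafeMode) {
            this.active = active;
            this.inSafeMode = inSafeMode;
        }
    }

    /** No failover: reports the safe-mode state of the first NN in the list. */
    static boolean safeModeUnchecked(List<ToyNameNode> nns) {
        return nns.get(0).inSafeMode;
    }

    /** Models failover: skips standbys and reports the active NN's state. */
    static boolean safeModeChecked(List<ToyNameNode> nns) {
        for (ToyNameNode nn : nns) {
            if (nn.active) {
                return nn.inSafeMode;
            }
        }
        throw new IllegalStateException("no active NameNode");
    }

    /**
     * Polls the query up to maxPolls times. Returns the number of polls
     * needed to observe "not in safe mode", or -1 if every poll still
     * reported safe mode (the bounded analogue of the endless loop).
     */
    static int pollsUntilLeft(Predicate<List<ToyNameNode>> query,
                              List<ToyNameNode> nns, int maxPolls) {
        for (int i = 1; i <= maxPolls; i++) {
            if (!query.test(nns)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Standby (in safe mode) is listed first; active NN is not in safe mode.
        List<ToyNameNode> nns = Arrays.asList(
                new ToyNameNode(false, true),
                new ToyNameNode(true, false));

        System.out.println(pollsUntilLeft(SafeModePollSketch::safeModeUnchecked, nns, 5));
        System.out.println(pollsUntilLeft(SafeModePollSketch::safeModeChecked, nns, 5));
    }
}
```

In this model the unchecked query never sees the active NN's state when the standby is contacted first, which mirrors the scenario in the report; how the real client should be made to fail over is left to the attached patches.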

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
