hbase-issues mailing list archives

From "Andrew Wang (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-7715) FSUtils#waitOnSafeMode can incorrectly loop on standby NN
Date Wed, 30 Jan 2013 05:07:13 GMT

    [ https://issues.apache.org/jira/browse/HBASE-7715?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13566167#comment-13566167 ]

Andrew Wang commented on HBASE-7715:
------------------------------------

Cool patch, Ted!

There's also one more usage of setSafeMode in FSUtils#checkDfsSafeMode that should probably
be fixed. Uma's comment about using #isInSafeMode() is also probably a bit cleaner.

I don't think that the StandbyException is actually propagated up this high; it gets caught
and used in the failover logic in DFSClient. You can check me on that one though.

I'd also prefer to see an explicit catch of the {{NoSuchMethodException}} (and a comment
explaining why we're doing all this business), rather than a generic {{Exception}} catch.
Then you can avoid rethrowing in the catch.
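
To illustrate the explicit-catch idea, here is a minimal, self-contained sketch. It does not use the real Hadoop classes: {{OldFs}}, {{NewFs}}, {{legacySafeModeGet()}} and {{SafeModeProbe}} are hypothetical stand-ins for an older and a newer client, and the real patch may look quite different. The point is only that the method lookup gets its own narrow catch with a comment, so nothing unexpected is swallowed or has to be rethrown from a generic handler.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;

public class SafeModeProbe {
    /** Hypothetical stand-in for an older client that lacks isInSafeMode(). */
    public static class OldFs {
        public boolean legacySafeModeGet() { return true; }
    }

    /** Hypothetical stand-in for a newer client that offers isInSafeMode(). */
    public static class NewFs extends OldFs {
        public boolean isInSafeMode() { return false; }
    }

    /** Prefers isInSafeMode() when the client has it, else uses the legacy call. */
    public static boolean inSafeMode(OldFs fs) {
        Method probe;
        try {
            probe = fs.getClass().getMethod("isInSafeMode");
        } catch (NoSuchMethodException e) {
            // Expected on older clients: the newer method is simply absent,
            // so fall back to the legacy call instead of failing.
            return fs.legacySafeModeGet();
        }
        try {
            return (Boolean) probe.invoke(fs);
        } catch (IllegalAccessException | InvocationTargetException e) {
            // The method exists but the call itself failed; surface that as-is.
            throw new IllegalStateException("isInSafeMode() call failed", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(inSafeMode(new OldFs())); // legacy path
        System.out.println(inSafeMode(new NewFs())); // reflective path
    }
}
```

Running {{main}} prints {{true}} for the stand-in old client (legacy path) and {{false}} for the stand-in new client (reflective path).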
                
> FSUtils#waitOnSafeMode can incorrectly loop on standby NN
> ---------------------------------------------------------
>
>                 Key: HBASE-7715
>                 URL: https://issues.apache.org/jira/browse/HBASE-7715
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.94.4
>            Reporter: Andrew Wang
>            Assignee: Ted Yu
>             Fix For: 0.96.0
>
>         Attachments: 7715-trunk-v2.txt
>
>
> We encountered an issue where HMaster failed to start with an active NN not in safe mode
> and a standby NN in safemode. The relevant lines in {{FSUtils.java}} show the issue:
> {noformat}
>     while (dfs.setSafeMode(org.apache.hadoop.hdfs.protocol.FSConstants.SafeModeAction.SAFEMODE_GET)) {
> {noformat}
> This call skips the normal client failover from the standby to active NN, so it will
> loop polling the standby NN if it unfortunately talks to the standby first.

