hbase-issues mailing list archives

From "Todd Lipcon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-4288) "Server not running" exception during meta verification causes RS abort
Date Tue, 30 Aug 2011 00:52:37 GMT

    [ https://issues.apache.org/jira/browse/HBASE-4288?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13093343#comment-13093343 ]

Todd Lipcon commented on HBASE-4288:
------------------------------------

This is a little tricky. I don't think we can just catch this and return false, since there's
no verification that the server is dead _yet_, just that it's shutting down. If we were to
return false, then the users of this code would delete the root location from ZK and start
re-assigning even though the old server may have unflushed edits, etc.

Though, this makes me think: why is it _ever_ safe to delete the root location and reassign
it before the old location's logs have been split?
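
To make the concern in the first paragraph concrete, here is a minimal Java sketch of a
verification step that reports three outcomes instead of a boolean, so that "shutting down"
is never read as "dead". Every name in it (MetaVerifySketch, ServerStoppingException, Verdict,
Probe, verify, mayReassign) is hypothetical and chosen for illustration; none of it is HBase's
actual code or the fix that was committed. Treating a refused connection as "gone" is also an
assumption of the sketch, and the question above about unsplit logs applies to that branch too.

```java
import java.io.IOException;
import java.net.ConnectException;

public class MetaVerifySketch {
    /** Hypothetical stand-in for the "Server not running" condition. */
    static class ServerStoppingException extends IOException {
        ServerStoppingException(String msg) { super(msg); }
    }

    /** Three outcomes, so a stopping server is not mistaken for a dead one. */
    enum Verdict { SERVING, NOT_SERVING, UNKNOWN }

    /** Hypothetical probe of the server believed to host the catalog region. */
    interface Probe {
        void check() throws IOException;
    }

    static Verdict verify(Probe probe) {
        try {
            probe.check();
            return Verdict.SERVING;
        } catch (ServerStoppingException e) {
            // Stopping, but not confirmed dead: it may still hold unflushed edits.
            return Verdict.UNKNOWN;
        } catch (ConnectException e) {
            // Assumption of this sketch only: a refused connection counts as gone.
            return Verdict.NOT_SERVING;
        } catch (IOException e) {
            // Any other I/O failure proves nothing about the server's state.
            return Verdict.UNKNOWN;
        }
    }

    /** Only a definite NOT_SERVING verdict permits deleting the location and reassigning. */
    static boolean mayReassign(Verdict v) {
        return v == Verdict.NOT_SERVING;
    }

    public static void main(String[] args) {
        Verdict v = verify(() -> {
            throw new ServerStoppingException("Server not running");
        });
        System.out.println(v + " -> reassign? " + mayReassign(v));
    }
}
```

With a boolean result the caller cannot tell UNKNOWN from NOT_SERVING, which is the reason
catching the exception and returning false looks unsafe here. The sketch does not model the
log-splitting question at all.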

> "Server not running" exception during meta verification causes RS abort
> -----------------------------------------------------------------------
>
>                 Key: HBASE-4288
>                 URL: https://issues.apache.org/jira/browse/HBASE-4288
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.90.4
>            Reporter: Todd Lipcon
>            Priority: Critical
>             Fix For: 0.92.0, 0.90.5
>
>
> The master tried to verify the META location just as that server was shutting down due
> to an abort. This caused the "Server not running" exception to get thrown, which wasn't handled
> properly in the master, causing it to abort.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
