hbase-issues mailing list archives

From "Pankaj Kumar (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop
Date Wed, 08 Jul 2015 15:59:06 GMT

    [ https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14618818#comment-14618818 ]

Pankaj Kumar commented on HBASE-14000:

Thanks [~jerryhe] for looking into this. ServerNotRunningYetException was reported while master (HM1)
was initializing, but by the time HM1 finished initialization, another master (HM2)
had become active.
Since rssStub still refers to HM1, which is now in standby mode, the region server
is stuck in the retry loop and keeps trying to connect to the standby master (HM1).
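
To make the failure mode concrete, here is a minimal, self-contained sketch of the two retry behaviours. It is not the attached patch and not HBase code: the `Master` interface, the `master` and `lookupSequence` helpers, both `report...` methods, and the local `ServerNotRunningYetException` class are hypothetical stand-ins that only simulate an HM1 that never accepts the report and an HM2 that does.

```java
import java.util.function.Supplier;

final class ReportForDutySketch {

    /** Local stand-in for HBase's ServerNotRunningYetException (assumption, not the real class). */
    static class ServerNotRunningYetException extends Exception {
        ServerNotRunningYetException(String message) { super(message); }
    }

    /** Hypothetical view of a master as seen by the region server. */
    interface Master {
        String name();
        void reportForDuty() throws ServerNotRunningYetException;
    }

    /** Builds a fake master that accepts reports only when it is active. */
    static Master master(String name, boolean active) {
        return new Master() {
            @Override public String name() { return name; }
            @Override public void reportForDuty() throws ServerNotRunningYetException {
                if (!active) {
                    throw new ServerNotRunningYetException(name + " is not running yet");
                }
            }
        };
    }

    /** Fake active-master lookup: returns the given masters in order, then repeats the last one. */
    static Supplier<Master> lookupSequence(Master... masters) {
        int[] next = {0};
        return () -> masters[Math.min(next[0]++, masters.length - 1)];
    }

    /** Behaviour described in the issue: the cached stub is kept after the exception. */
    static String reportKeepingStub(Supplier<Master> lookup, int maxAttempts) {
        Master stub = lookup.get();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                stub.reportForDuty();
                return stub.name();
            } catch (ServerNotRunningYetException e) {
                // Only "logs"; the stub still points at the same master.
            }
        }
        return null; // never reported successfully
    }

    /** Alternative: drop the cached stub and look up the active master again. */
    static String reportResettingStub(Supplier<Master> lookup, int maxAttempts) {
        Master stub = lookup.get();
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            try {
                stub.reportForDuty();
                return stub.name();
            } catch (ServerNotRunningYetException e) {
                stub = lookup.get(); // re-resolve, so a master switch is noticed
            }
        }
        return null; // never reported successfully
    }

    public static void main(String[] args) {
        Master hm1 = master("HM1", false); // was initializing, now standby
        Master hm2 = master("HM2", true);  // became active in the meantime
        System.out.println(reportKeepingStub(lookupSequence(hm1, hm2), 5));
        System.out.println(reportResettingStub(lookupSequence(hm1, hm2), 5));
    }
}
```

In this simulation the first call exhausts its attempts against HM1, while the second reaches HM2 on the retry after the first exception. A real region server would also need backoff and ZooKeeper-based master lookup, which the sketch leaves out.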

> Region server failed to report Master and stuck in reportForDuty retry loop
> ---------------------------------------------------------------------------
>                 Key: HBASE-14000
>                 URL: https://issues.apache.org/jira/browse/HBASE-14000
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>         Attachments: HBASE-14000.patch
> In an HA cluster, the region server gets stuck in the reportForDuty retry loop if the active master
> is restarting and a master switch happens before the region server reports successfully.
> The root cause is the same as HBASE-13317, but the region server tried to connect to the master while
> the master was still starting, so the rssStub reset didn't happen, as this branch only logs:
> {code}
>   if (ioe instanceof ServerNotRunningYetException) {
>     LOG.debug("Master is not running yet");
>   }
> {code}
> A master switch happened while the master was starting, so the RS always tried to connect to standby
