hbase-issues mailing list archives

From "Pankaj Kumar (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-14000) Region server failed to report Master and stuck in reportForDuty retry loop
Date Tue, 30 Jun 2015 12:54:04 GMT

     [ https://issues.apache.org/jira/browse/HBASE-14000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pankaj Kumar updated HBASE-14000:
---------------------------------
    Description: 
In an HA cluster, a region server gets stuck in the reportForDuty retry loop if the active
master is restarting and a master switch happens before the region server reports successfully.

The root cause is the same as HBASE-13317, but here the region server tried to connect to the
master while the master was still starting, so the rssStub reset didn't happen in this branch:
{code}
  if (ioe instanceof ServerNotRunningYetException) {
    LOG.debug("Master is not running yet");
  }
{code}
While the master was starting, a master switch happened, so the RS kept trying to connect to
the standby master.
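
The failure mode described above can be illustrated with a small, self-contained Java sketch.
It is not HBase code: the class ReportForDutySketch, the NotRunningYet exception, the master
names and the resetStub flag are all hypothetical stand-ins, and the "master switch" is
simulated by changing a local variable after the first attempt. It only shows why a cached
stub that is never cleared keeps pointing at the old master.

```java
/** Toy model of a region server's report-for-duty retry loop (not HBase code). */
class ReportForDutySketch {

    /** Hypothetical stand-in for the "master is not serving yet" failure. */
    static class NotRunningYet extends Exception {}

    /** Succeeds only if the target is the active master and that master is ready. */
    static void report(String target, String activeMaster, boolean activeReady)
            throws NotRunningYet {
        if (!target.equals(activeMaster) || !activeReady) {
            throw new NotRunningYet();
        }
    }

    /**
     * Returns the attempt number on which the report succeeds, or -1 if it
     * never succeeds within maxAttempts.
     *
     * @param resetStub whether the cached stub is cleared after a NotRunningYet failure
     */
    static int attemptsUntilReported(boolean resetStub, int maxAttempts) {
        String activeMaster = "master-a"; // assumed: master-a is active but restarting
        boolean activeReady = false;
        String cachedStub = null;         // plays the role of the cached master stub

        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (cachedStub == null) {
                cachedStub = activeMaster; // re-resolve the current active master
            }
            try {
                report(cachedStub, activeMaster, activeReady);
                return attempt;
            } catch (NotRunningYet e) {
                if (resetStub) {
                    cachedStub = null;     // force a fresh lookup on the next attempt
                }
            }
            if (attempt == 1) {
                // Simulated master switch: master-b takes over and is ready.
                activeMaster = "master-b";
                activeReady = true;
            }
        }
        return -1;
    }
}
```

In this model, attemptsUntilReported(false, 10) never succeeds because the loop keeps calling
master-a, while attemptsUntilReported(true, 10) succeeds on the second attempt because the
stub is looked up again after the failure.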

  was:
In an HA cluster, a region server gets stuck in the reportForDuty retry loop if the active
master is restarting and a master switch happens before the region server reports successfully.

The root cause is the same as HBASE-13317, but here the region server tried to connect to the
master while the master was still starting, so the rssStub reset didn't happen in this branch:
{code}
  if (ioe instanceof ServerNotRunningYetException) {
    LOG.debug("Master is not running yet");
  }
{code}



> Region server failed to report Master and stuck in reportForDuty retry loop
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-14000
>                 URL: https://issues.apache.org/jira/browse/HBASE-14000
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Pankaj Kumar
>            Assignee: Pankaj Kumar
>         Attachments: HBASE-14000.patch
>
>
> In an HA cluster, a region server gets stuck in the reportForDuty retry loop if the active
> master is restarting and a master switch happens before the region server reports successfully.
> The root cause is the same as HBASE-13317, but here the region server tried to connect to the
> master while the master was still starting, so the rssStub reset didn't happen in this branch:
> {code}
>   if (ioe instanceof ServerNotRunningYetException) {
>     LOG.debug("Master is not running yet");
>   }
> {code}
> While the master was starting, a master switch happened, so the RS kept trying to connect to
> the standby master.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
