hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-5875) Process RIT and Master restart may remove an online server considering it as a dead server
Date Wed, 02 May 2012 08:58:50 GMT

    [ https://issues.apache.org/jira/browse/HBASE-5875?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266446#comment-13266446 ]

chunhui shen commented on HBASE-5875:
-------------------------------------

@ram
I'm clear now about the time gap.

What about doing the following check?
{code}
if (assignmentManager.getRegionServerOfRegion(HRegionInfo.ROOT_REGIONINFO) == null) {
  ServerName currentRootServer = null;
  if (!catalogTracker.verifyRootRegionLocation(timeout)) {
    currentRootServer = this.catalogTracker.getRootLocation();
    splitLogAndExpireIfOnline(currentRootServer);
    this.assignmentManager.assignRoot();
    // Make sure a -ROOT- location is set.
    if (!isRootLocation())
      return false;
    // This guarantees that the transition assigning -ROOT- has completed
    this.assignmentManager.waitForAssignment(HRegionInfo.ROOT_REGIONINFO);
    assigned++;
  } else {
    // Region already assigned. We didn't assign it. Add to in-memory state.
    this.assignmentManager.regionOnline(HRegionInfo.ROOT_REGIONINFO,
        this.catalogTracker.getRootLocation());
  }
} else {
  // Root region has been assigned through processRegionInTransition
}
{code}
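
To make the intent of the check easier to follow, here is a minimal, self-contained sketch of its branch structure. It is a toy model, not HBase code: the class RootAssignSketch, the Action enum, and the methods decide and onlineAfter are hypothetical names made up for illustration, and server names are plain strings. It also assumes that the in-memory assignment state already names a server for -ROOT- once processRegionInTransition has handled it; the sketch does not establish whether that holds during the time gap.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of the startup decision for -ROOT-; hypothetical names, not HBase code. */
class RootAssignSketch {

    /** What the master does with -ROOT- during startup. */
    public enum Action { LEAVE_ALONE, RECORD_ONLINE, EXPIRE_AND_REASSIGN }

    /**
     * Mirrors the branch structure of the proposed check.
     *
     * @param serverInMemory   server the in-memory assignment state holds for -ROOT-, or null
     * @param locationVerified whether the region server confirmed that -ROOT- is online on it
     */
    public static Action decide(String serverInMemory, boolean locationVerified) {
        if (serverInMemory != null) {
            // Assumed to be set while processing regions in transition: nothing to do.
            return Action.LEAVE_ALONE;
        }
        // Verified: only record it in memory. Not verified: split logs, expire, reassign.
        return locationVerified ? Action.RECORD_ONLINE : Action.EXPIRE_AND_REASSIGN;
    }

    /** Returns the online-server set after the decision; expiring drops the root server. */
    public static Set<String> onlineAfter(Set<String> online, String rootServer, Action action) {
        Set<String> result = new LinkedHashSet<>(online);
        if (action == Action.EXPIRE_AND_REASSIGN) {
            result.remove(rootServer);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<String> online = new LinkedHashSet<>();
        online.add("rs1"); // the single-RS cluster from the issue description

        // Without the outer check: nothing in memory, verification fails mid-open.
        Action unguarded = decide(null, false);
        System.out.println(unguarded + " -> " + onlineAfter(online, "rs1", unguarded));

        // With the outer check: in-memory state names rs1, verification is never consulted.
        Action guarded = decide("rs1", false);
        System.out.println(guarded + " -> " + onlineAfter(online, "rs1", guarded));
    }
}
```

In this model the unguarded path empties the online-server set of a single-RS cluster, while the guarded path leaves rs1 in place, which is the behaviour the outer null check is meant to give.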
                
> Process RIT and Master restart may remove an online server considering it as a dead server
> ------------------------------------------------------------------------------------------
>
>                 Key: HBASE-5875
>                 URL: https://issues.apache.org/jira/browse/HBASE-5875
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.94.1
>
>         Attachments: HBASE-5875.patch
>
>
> If, on master restart, the master finds ROOT/META in RIT state, it tries to assign the ROOT region through ProcessRIT.
> The master triggers the assignment and then tries to verify the root region location.
> Root region location verification checks whether the RS has the region in its online list.
> If the assignment triggered by the master has not yet completed on the RS, the root region location verification fails.
> Because it failed, 
> {code}
> splitLogAndExpireIfOnline(currentRootServer);
> {code}
> is called: we split the log and also remove the server from the online server list. Ideally there is nothing to split here, as no region server was restarted.
> So, although the server is online, the master invalidates the region server.
> In a special case, if I have only one RS, my cluster becomes non-operative.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
