hbase-issues mailing list archives

From "rajeshbabu (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-6381) AssignmentManager should use the same logic for clean startup and failover
Date Thu, 20 Sep 2012 08:17:08 GMT

    [ https://issues.apache.org/jira/browse/HBASE-6381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459440#comment-13459440 ]

rajeshbabu commented on HBASE-6381:
-----------------------------------

@Jimmy,
In the following scenario, region assignment may not happen with the latest patch:

1) Suppose a region R1 is moving from RS1 to RS2.
2) The master and RS1 both restart before the server info for R1 is updated to RS2 in META
(the update happens during region open on the RS).
3) In rebuildUserRegions, we select R1 as a dead region on the dead server RS1.
4) The server info in META is now updated to RS2.
5) In processDeadServersAndRecoverLostRegions, we expire the server and delete the znode of
the region:
{code}
        if (!serverManager.isServerDead(serverName)) {
          serverManager.expireServer(serverName); // Let SSH do region re-assign
        }
        if (!nodes.isEmpty()) {
          for (HRegionInfo deadRegion : server.getValue()) {
            String encodedName = deadRegion.getEncodedName();
            if (nodes.remove(encodedName)) {
              ZKAssign.deleteNodeFailSilent(watcher, deadRegion);
            }
          }
        }
{code}
6) If the znode is deleted before the region transitions to OPENED, the region won't come online:
{code}
      if (!transitionToOpened(region)) {
        // If we fail to transition to opened, it's because of one of two cases:
        //    (a) we lost our ZK lease
        // OR (b) someone else opened the region before us
        // In either case, we don't need to transition to FAILED_OPEN state.
        // In case (a), the Master will process us as a dead server. In case
        // (b) the region is already being handled elsewhere anyway.
        cleanupFailedOpen(region);
        return;
      }
{code}
Even when SSH (ServerShutdownHandler) processes RS1, we won't assign the region, because the
server info in META has already changed to RS2.
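
To make the ordering easier to follow, here is a minimal toy model of the scenario above. It is only a sketch and not HBase code: the class name, the method names, and the plain maps and sets standing in for META and ZooKeeper are all hypothetical, and real assignment involves many more states.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Toy model only. META is a plain map and the unassigned znodes are a plain
// set; every name here is hypothetical and none of it is HBase API.
class AssignmentRaceSketch {

    // Stand-in for the region server's open handler: the transition to
    // OPENED succeeds only if the region's znode still exists.
    static void tryTransitionToOpened(String region, Set<String> znodes, Set<String> online) {
        if (znodes.contains(region)) {
            online.add(region);
        }
        // Otherwise the open is cleaned up and the region stays offline.
    }

    // Returns true if R1 ends up online or would be picked up by the
    // shutdown handling of RS1.
    static boolean regionRecovered(boolean znodeDeletedBeforeOpened) {
        Map<String, String> meta = new HashMap<>();
        Set<String> znodes = new HashSet<>();
        Set<String> online = new HashSet<>();

        // Steps 1-2: R1 is moving from RS1 to RS2, META still points at RS1,
        // and a znode exists for the in-flight open.
        meta.put("R1", "RS1");
        znodes.add("R1");

        // Step 3: the restarted master snapshots the regions of dead server RS1.
        Set<String> deadRegionsOnRs1 = new HashSet<>();
        for (Map.Entry<String, String> e : meta.entrySet()) {
            if (e.getValue().equals("RS1")) {
                deadRegionsOnRs1.add(e.getKey());
            }
        }

        // Step 4: RS2 updates META while opening the region.
        meta.put("R1", "RS2");

        // Steps 5-6: the master deletes the znodes of the snapshotted regions,
        // racing with RS2's transition to OPENED.
        if (znodeDeletedBeforeOpened) {
            znodes.removeAll(deadRegionsOnRs1);
            tryTransitionToOpened("R1", znodes, online);
        } else {
            tryTransitionToOpened("R1", znodes, online);
            znodes.removeAll(deadRegionsOnRs1);
        }

        // Shutdown handling of RS1 only reassigns regions META still lists on RS1.
        boolean reassignedForRs1 = "RS1".equals(meta.get("R1"));

        return online.contains("R1") || reassignedForRs1;
    }

    public static void main(String[] args) {
        System.out.println("znode deleted first: recovered = " + regionRecovered(true));
        System.out.println("opened first:        recovered = " + regionRecovered(false));
    }
}
```

In this model the region is lost only when the znode deletion wins the race, which is the case described in steps 5 and 6.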

Please correct me if I am wrong.
                
> AssignmentManager should use the same logic for clean startup and failover
> --------------------------------------------------------------------------
>
>                 Key: HBASE-6381
>                 URL: https://issues.apache.org/jira/browse/HBASE-6381
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: Jimmy Xiang
>            Assignee: Jimmy Xiang
>         Attachments: hbase-6381-notes.pdf, hbase-6381.pdf, trunk-6381_v5.patch, trunk-6381_v7.patch, trunk-6381_v8.patch
>
>
> Currently AssignmentManager handles clean startup and failover very differently.
> Different logic is mingled together so it is hard to find out which is for which.
> We should clean it up and share the same logic so that AssignmentManager handles
> both cases the same way.  This way, the code will be much easier to understand and
> maintain.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
