hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Jieshan Bean (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-3946) The splitted region can be online again while the standby hmaster becomes the active one
Date Thu, 02 Jun 2011 00:56:47 GMT
The splitted region can be online again while the standby hmaster becomes the active one
----------------------------------------------------------------------------------------

                 Key: HBASE-3946
                 URL: https://issues.apache.org/jira/browse/HBASE-3946
             Project: HBase
          Issue Type: Bug
    Affects Versions: 0.90.3
            Reporter: Jieshan Bean
            Assignee: Jieshan Bean
             Fix For: 0.90.4


(The cluster has two HMatser, one active and one standby)

1.While the active HMaster shutdown, the standby one would become the active one, and went
into the processFailover() method:
    if (regionCount == 0) {
      LOG.info("Master startup proceeding: cluster startup");
      this.assignmentManager.cleanoutUnassigned();
      this.assignmentManager.assignAllUserRegions();
    } else {
      
      LOG.info("Master startup proceeding: master failover");
      this.assignmentManager.processFailover();
    }
2.After that, the user regions would be rebuild.
  Map<HServerInfo,List<Pair<HRegionInfo,Result>>> deadServers = rebuildUserRegions();


3.Here's how the rebuildUserRegions worked. All the regions(contain the splitted regions)
would be added to the offlineRegions of offlineServers.

   for (Result result : results) {
      Pair<HRegionInfo,HServerInfo> region =
        MetaReader.metaRowToRegionPairWithInfo(result);
      if (region == null) continue;
      HServerInfo regionLocation = region.getSecond();
      HRegionInfo regionInfo = region.getFirst();
      if (regionLocation == null) {
        // Region not being served, add to region map with no assignment
        // If this needs to be assigned out, it will also be in ZK as RIT
        this.regions.put(regionInfo, null);
      } else if (!serverManager.isServerOnline(
          regionLocation.getServerName())) {
        // Region is located on a server that isn't online
        List<Pair<HRegionInfo,Result>> offlineRegions =
          offlineServers.get(regionLocation);
        if (offlineRegions == null) {
          offlineRegions = new ArrayList<Pair<HRegionInfo,Result>>(1);
          offlineServers.put(regionLocation, offlineRegions);
        }
        offlineRegions.add(new Pair<HRegionInfo,Result>(regionInfo, result));
      } else {
        // Region is being served and on an active server
        regions.put(regionInfo, regionLocation);
        addToServers(regionLocation, regionInfo);
      }
    }

4.It seems that all the offline regions will be added to RIT and online again:
ZKAssign will creat node for each offline never consider the splitted ones. 

AssignmentManager# processDeadServers
  private void processDeadServers(
      Map<HServerInfo, List<Pair<HRegionInfo, Result>>> deadServers)
  throws IOException, KeeperException {
    for (Map.Entry<HServerInfo, List<Pair<HRegionInfo,Result>>> deadServer
:
      deadServers.entrySet()) {
      List<Pair<HRegionInfo,Result>> regions = deadServer.getValue();
      for (Pair<HRegionInfo,Result> region : regions) {
        HRegionInfo regionInfo = region.getFirst();
        Result result = region.getSecond();
        // If region was in transition (was in zk) force it offline for reassign
        try {
          ZKAssign.createOrForceNodeOffline(watcher, regionInfo,
              master.getServerName());
        } catch (KeeperException.NoNodeException nne) {
          // This is fine
        }
        // Process with existing RS shutdown code
        ServerShutdownHandler.processDeadRegion(regionInfo, result, this,
            this.catalogTracker);
      }
    }
  }

AssignmentManager# processFailover
    // Process list of dead servers
    processDeadServers(deadServers);
    // Check existing regions in transition
    List<String> nodes = ZKUtil.listChildrenAndWatchForNewChildren(watcher,
        watcher.assignmentZNode);
    if (nodes.isEmpty()) {
      LOG.info("No regions in transition in ZK to process on failover");
      return;
    }
    LOG.info("Failed-over master needs to process " + nodes.size() +
        " regions in transition");
    for (String encodedRegionName: nodes) {
      processRegionInTransition(encodedRegionName, null);
    }

So I think before add the region into RIT, check it at first.


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message