hbase-issues mailing list archives

From "Hudson (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-3946) The splitted region can be online again while the standby hmaster becomes the active one
Date Tue, 28 Jun 2011 22:43:32 GMT

    [ https://issues.apache.org/jira/browse/HBASE-3946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056860#comment-13056860 ]

Hudson commented on HBASE-3946:
-------------------------------

Integrated in HBase-TRUNK #1995 (See [https://builds.apache.org/job/HBase-TRUNK/1995/])
    

> The splitted region can be online again while the standby hmaster becomes the active one
> ----------------------------------------------------------------------------------------
>
>                 Key: HBASE-3946
>                 URL: https://issues.apache.org/jira/browse/HBASE-3946
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.3
>            Reporter: Jieshan Bean
>            Assignee: Jieshan Bean
>             Fix For: 0.90.4
>
>         Attachments: HBASE-3946-V2.patch, HBASE-3946.patch
>
>
> (The cluster has two HMasters, one active and one standby.)
> 1. When the active HMaster shuts down, the standby becomes the active one and goes into the processFailover() method:
>     if (regionCount == 0) {
>       LOG.info("Master startup proceeding: cluster startup");
>       this.assignmentManager.cleanoutUnassigned();
>       this.assignmentManager.assignAllUserRegions();
>     } else {
>       
>       LOG.info("Master startup proceeding: master failover");
>       this.assignmentManager.processFailover();
>     }
> 2. After that, the user regions are rebuilt:
>   Map<HServerInfo,List<Pair<HRegionInfo,Result>>> deadServers = rebuildUserRegions();

> 3. Here is how rebuildUserRegions() works. All regions whose server is not online (including the regions that have already been split) are added to the offlineRegions list in offlineServers:
>    for (Result result : results) {
>       Pair<HRegionInfo,HServerInfo> region =
>         MetaReader.metaRowToRegionPairWithInfo(result);
>       if (region == null) continue;
>       HServerInfo regionLocation = region.getSecond();
>       HRegionInfo regionInfo = region.getFirst();
>       if (regionLocation == null) {
>         // Region not being served, add to region map with no assignment
>         // If this needs to be assigned out, it will also be in ZK as RIT
>         this.regions.put(regionInfo, null);
>       } else if (!serverManager.isServerOnline(
>           regionLocation.getServerName())) {
>         // Region is located on a server that isn't online
>         List<Pair<HRegionInfo,Result>> offlineRegions =
>           offlineServers.get(regionLocation);
>         if (offlineRegions == null) {
>           offlineRegions = new ArrayList<Pair<HRegionInfo,Result>>(1);
>           offlineServers.put(regionLocation, offlineRegions);
>         }
>         offlineRegions.add(new Pair<HRegionInfo,Result>(regionInfo, result));
>       } else {
>         // Region is being served and on an active server
>         regions.put(regionInfo, regionLocation);
>         addToServers(regionLocation, regionInfo);
>       }
>     }
> 4. It seems that all the offline regions are then added to RIT (regions in transition) and brought online again.
> ZKAssign creates a node for each offline region and never considers whether the region has already been split.
> AssignmentManager#processDeadServers
>   private void processDeadServers(
>       Map<HServerInfo, List<Pair<HRegionInfo, Result>>> deadServers)
>   throws IOException, KeeperException {
>     for (Map.Entry<HServerInfo, List<Pair<HRegionInfo,Result>>> deadServer :
>       deadServers.entrySet()) {
>       List<Pair<HRegionInfo,Result>> regions = deadServer.getValue();
>       for (Pair<HRegionInfo,Result> region : regions) {
>         HRegionInfo regionInfo = region.getFirst();
>         Result result = region.getSecond();
>         // If region was in transition (was in zk) force it offline for reassign
>         try {
>           ZKAssign.createOrForceNodeOffline(watcher, regionInfo,
>               master.getServerName());
>         } catch (KeeperException.NoNodeException nne) {
>           // This is fine
>         }
>         // Process with existing RS shutdown code
>         ServerShutdownHandler.processDeadRegion(regionInfo, result, this,
>             this.catalogTracker);
>       }
>     }
>   }
> AssignmentManager#processFailover
>     // Process list of dead servers
>     processDeadServers(deadServers);
>     // Check existing regions in transition
>     List<String> nodes = ZKUtil.listChildrenAndWatchForNewChildren(watcher,
>         watcher.assignmentZNode);
>     if (nodes.isEmpty()) {
>       LOG.info("No regions in transition in ZK to process on failover");
>       return;
>     }
>     LOG.info("Failed-over master needs to process " + nodes.size() +
>         " regions in transition");
>     for (String encodedRegionName: nodes) {
>       processRegionInTransition(encodedRegionName, null);
>     }
> So I think we should check whether a region has already been split before adding it to RIT.
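
The check proposed in the description could look roughly like the sketch below. It is only an illustration under stated assumptions, not the content of the attached patches: RegionStub is a hypothetical stand-in for HRegionInfo that carries only an offline flag and a split flag, and regionsToReassign is a hypothetical helper. The idea is that a region marked both offline and split is a split parent and should be skipped rather than forced offline in ZK and reassigned.

```java
import java.util.ArrayList;
import java.util.List;

class SplitParentFilter {

    /** Hypothetical stand-in for HRegionInfo; only the flags needed here. */
    static final class RegionStub {
        final String name;
        final boolean offline;
        final boolean split;

        RegionStub(String name, boolean offline, boolean split) {
            this.name = name;
            this.offline = offline;
            this.split = split;
        }
    }

    /**
     * Returns the regions of a dead server that should still be reassigned.
     * Split parents (offline and split) are dropped so that they are never
     * put into RIT and brought online again.
     */
    static List<RegionStub> regionsToReassign(List<RegionStub> onDeadServer) {
        List<RegionStub> result = new ArrayList<RegionStub>();
        for (RegionStub region : onDeadServer) {
            if (region.offline && region.split) {
                continue; // split parent: its daughters serve the data now
            }
            result.add(region);
        }
        return result;
    }

    public static void main(String[] args) {
        List<RegionStub> regions = new ArrayList<RegionStub>();
        regions.add(new RegionStub("parent", true, true));
        regions.add(new RegionStub("daughterA", false, false));
        regions.add(new RegionStub("daughterB", false, false));
        for (RegionStub region : regionsToReassign(regions)) {
            System.out.println("would reassign: " + region.name);
        }
    }
}
```

In the real code the equivalent test would sit in front of the ZKAssign.createOrForceNodeOffline call in processDeadServers, or earlier, while rebuildUserRegions builds the offlineServers map; which of the two places is more appropriate is left open here.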

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message