hbase-issues mailing list archives

From "Maryann Xue (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-5829) Inconsistency between the "regions" map and the "servers" map in AssignmentManager
Date Thu, 19 Apr 2012 08:35:44 GMT
Inconsistency between the "regions" map and the "servers" map in AssignmentManager
----------------------------------------------------------------------------------

                 Key: HBASE-5829
                 URL: https://issues.apache.org/jira/browse/HBASE-5829
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.92.1, 0.90.6
            Reporter: Maryann Xue


There are places in AssignmentManager (AM) where this.servers is not kept consistent with
this.regions. As a result, the balancer might try to offline a region from a region server (RS)
that already returned NotServingRegionException on a previous offline attempt.

In AssignmentManager.unassign(HRegionInfo, boolean):
    try {
      // TODO: We should consider making this look more like it does for the
      // region open where we catch all throwables and never abort
      if (serverManager.sendRegionClose(server, state.getRegion(),
        versionOfClosingNode)) {
        LOG.debug("Sent CLOSE to " + server + " for region " +
          region.getRegionNameAsString());
        return;
      }
      // This never happens. Currently regionserver close always return true.
      LOG.warn("Server " + server + " region CLOSE RPC returned false for " +
        region.getRegionNameAsString());
    } catch (NotServingRegionException nsre) {
      LOG.info("Server " + server + " returned " + nsre + " for " +
        region.getRegionNameAsString());
      // Presume that master has stale data.  Presume remote side just split.
      // Presume that the split message when it comes in will fix up the master's
      // in memory cluster state.
    } catch (Throwable t) {
      if (t instanceof RemoteException) {
        t = ((RemoteException)t).unwrapRemoteException();
        if (t instanceof NotServingRegionException) {
          if (checkIfRegionBelongsToDisabling(region)) {
            // Remove from the regionsinTransition map
            LOG.info("While trying to recover the table "
                + region.getTableNameAsString()
                + " to DISABLED state the region " + region
                + " was offlined but the table was in DISABLING state");
            synchronized (this.regionsInTransition) {
              this.regionsInTransition.remove(region.getEncodedName());
            }
            // Remove from the regionsMap
            synchronized (this.regions) {
              this.regions.remove(region);
            }
            deleteClosingOrClosedNode(region);
          }
        }
        // RS is already processing this region, only need to update the timestamp
        if (t instanceof RegionAlreadyInTransitionException) {
          LOG.debug("update " + state + " the timestamp.");
          state.update(state.getState());
        }
      }

In AssignmentManager.assign(HRegionInfo, RegionState, boolean, boolean, boolean):
          synchronized (this.regions) {
            this.regions.put(plan.getRegionInfo(), plan.getDestination());
          }
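
To illustrate the kind of invariant at stake, here is a minimal sketch, not the actual AssignmentManager code and not a proposed patch. It assumes, as the field names suggest, that "regions" maps a region to its hosting server and "servers" maps a server to the regions it hosts. The class RegionServerMaps, its methods, and the use of plain strings in place of HRegionInfo and server identifiers are all hypothetical. The idea is that every update goes through one helper that changes both maps under the same lock, so neither map can be modified alone.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical stand-in for the two maps; names and types are assumptions. */
class RegionServerMaps {
  // region name -> hosting server (plays the role of this.regions)
  private final Map<String, String> regions = new HashMap<>();
  // server name -> regions it hosts (plays the role of this.servers)
  private final Map<String, Set<String>> servers = new HashMap<>();

  /** Record an assignment in both maps under a single lock. */
  public synchronized void assign(String region, String server) {
    String previous = regions.put(region, server);
    if (previous != null && !previous.equals(server)) {
      // The region moved: drop it from the old server's set as well.
      removeFromServer(previous, region);
    }
    servers.computeIfAbsent(server, s -> new TreeSet<>()).add(region);
  }

  /** Forget a region in both maps, e.g. after a NotServingRegionException. */
  public synchronized void offline(String region) {
    String server = regions.remove(region);
    if (server != null) {
      removeFromServer(server, region);
    }
  }

  private void removeFromServer(String server, String region) {
    Set<String> hosted = servers.get(server);
    if (hosted != null) {
      hosted.remove(region);
    }
  }

  /** Regions currently recorded for a server (empty if none). */
  public synchronized Set<String> regionsOn(String server) {
    Set<String> hosted = servers.get(server);
    return hosted == null ? Collections.emptySet() : new TreeSet<>(hosted);
  }

  /** True if both maps describe exactly the same set of assignments. */
  public synchronized boolean isConsistent() {
    int count = 0;
    for (Map.Entry<String, Set<String>> e : servers.entrySet()) {
      for (String region : e.getValue()) {
        if (!e.getKey().equals(regions.get(region))) {
          return false;
        }
        count++;
      }
    }
    return count == regions.size();
  }

  public static void main(String[] args) {
    RegionServerMaps maps = new RegionServerMaps();
    maps.assign("region-1", "rs-1");
    maps.offline("region-1");
    System.out.println(maps.regionsOn("rs-1") + " consistent=" + maps.isConsistent());
  }
}
```

With this shape, a consumer such as a balancer that reads regionsOn(server) would no longer see a region that offline() has already removed. Whether a single lock is acceptable for the real master is a separate design question.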

