hbase-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Shuaifeng Zhou (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-14931) Active master switches may cause region close forever
Date Sat, 05 Dec 2015 06:46:10 GMT
Shuaifeng Zhou created HBASE-14931:
--------------------------------------

             Summary: Active master switches may cause region close forever
                 Key: HBASE-14931
                 URL: https://issues.apache.org/jira/browse/HBASE-14931
             Project: HBase
          Issue Type: Bug
          Components: master
    Affects Versions: 0.98.10
            Reporter: Shuaifeng Zhou
            Priority: Critical
             Fix For: 0.98.17


60010 webpage shows that a region is online on one RS, but when access data in the region
throw notServingRegion. After lookup the source code and logs, found that it's because active
master switches during the region openning:
1, master1 open region 'region1', sent open region request to rs and create node in zk
2, master1 stoped
3, master2 became active master
4, master2 obtain all region status,  'region1' status is offline
5, rs opened 'region1' node changed to opened in zk, and sent message to master2
6, master2 received RS_ZK_REGION_OPENED, but the status is not pending open or openning, sent
unassign to rs, 'region1' closed
{code:title=AssignmentManager.java|borderStyle=solid}
        case RS_ZK_REGION_OPENED:
          // Should see OPENED after OPENING but possible after PENDING_OPEN.
          if (regionState == null
              || !regionState.isPendingOpenOrOpeningOnServer(sn)) {
            LOG.warn("Received OPENED for " + prettyPrintedRegionName
              + " from " + sn + " but the region isn't PENDING_OPEN/OPENING here: "
              + regionStates.getRegionState(encodedName));

            if (regionState != null) {
              // Close it without updating the internal region states,
              // so as not to create double assignments in unlucky scenarios
              // mentioned in OpenRegionHandler#process
              unassign(regionState.getRegion(), null, -1, null, false, sn);
            }
            return;
          }
{code}
7, master2 continue handle regioninfo when master1 stoped, found that 'region1' status in
zk is opened, update status in memory to opened.
8, up to now, 'region1' status is opened on webpage of master status, but not opened on any
regionserver.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Mime
View raw message