hbase-issues mailing list archives

From "chunhui shen (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (HBASE-8287) TestRegionMergeTransactionOnCluster failed in trunk build #4010
Date Sun, 07 Apr 2013 03:45:29 GMT

     [ https://issues.apache.org/jira/browse/HBASE-8287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

chunhui shen updated HBASE-8287:
--------------------------------

    Attachment: hbase-trunk-8287.patch

Root cause of the bug:
In DispatchMergingRegionHandler#process:
{code}
region_b_location = masterServices.getAssignmentManager()
    .getRegionStates().getRegionServerOfRegion(region_b);
onSameRS = region_a_location.equals(region_b_location);
if (onSameRS || !regionStates.isRegionInTransition(region_b)) {
  // Regions are on the same RS, or region_b is not in
  // RegionInTransition any more
  break;
}
{code}

onSameRS can be false even when the regions are already on the same regionserver, in the
following case:

1. getRegionServerOfRegion(region_b)
2. region_b comes online
3. isRegionInTransition(region_b)

Each of the steps above is synchronized by RegionStates, but the sequence as a whole is not atomic.
Step 1 returns the old server of region_b, so onSameRS is false.
Step 3 returns false, since region_b came online in step 2.
As a result, the while loop breaks with onSameRS still false, although the merging regions are
in fact on the same regionserver.
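
The interleaving above can be replayed deterministically with a small sketch. This is only an illustration under stated assumptions: ToyRegionStates, initialStates and checkOnce are hypothetical names, not HBase classes, and the re-read of region_b's location is one possible way to close the window, not necessarily what the attached patch does.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class MergeRaceSketch {

  // Hypothetical, simplified stand-in for RegionStates (not the real HBase class).
  // Every method is synchronized, but a sequence of calls is not atomic.
  static class ToyRegionStates {
    private final Map<String, String> serverOfRegion = new HashMap<>();
    private final Set<String> inTransition = new HashSet<>();

    synchronized String getRegionServerOfRegion(String region) {
      return serverOfRegion.get(region);
    }

    synchronized boolean isRegionInTransition(String region) {
      return inTransition.contains(region);
    }

    // Region is still recorded on its old server while it is being moved.
    synchronized void startMove(String region, String currentServer) {
      serverOfRegion.put(region, currentServer);
      inTransition.add(region);
    }

    // Region has been opened on the given server and leaves the transition set.
    synchronized void regionOnline(String region, String server) {
      serverOfRegion.put(region, server);
      inTransition.remove(region);
    }
  }

  // region_a is online on rs1; region_b is recorded on rs2 and in transition.
  static ToyRegionStates initialStates() {
    ToyRegionStates states = new ToyRegionStates();
    states.regionOnline("region_a", "rs1");
    states.startMove("region_b", "rs2");
    return states;
  }

  // One pass of the check. betweenCalls stands for whatever another thread
  // does between step 1 and step 3. Returns onSameRS as seen by the caller.
  static boolean checkOnce(ToyRegionStates states, Runnable betweenCalls, boolean recheck) {
    String aLocation = states.getRegionServerOfRegion("region_a");
    String bLocation = states.getRegionServerOfRegion("region_b"); // step 1
    boolean onSameRS = aLocation.equals(bLocation);
    betweenCalls.run();                                             // step 2
    boolean inTransition = states.isRegionInTransition("region_b"); // step 3
    if (recheck && !onSameRS && !inTransition) {
      // Assumed mitigation: the location read in step 1 may be stale, so read it again.
      bLocation = states.getRegionServerOfRegion("region_b");
      onSameRS = aLocation.equals(bLocation);
    }
    return onSameRS;
  }

  public static void main(String[] args) {
    ToyRegionStates s1 = initialStates();
    System.out.println("without re-read: onSameRS="
        + checkOnce(s1, () -> s1.regionOnline("region_b", "rs1"), false));
    ToyRegionStates s2 = initialStates();
    System.out.println("with re-read: onSameRS="
        + checkOnce(s2, () -> s2.regionOnline("region_b", "rs1"), true));
  }
}
```

In this toy model, the pass without the re-read reports onSameRS as false even though both regions end up on rs1, while the pass with the re-read reports true.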
                
> TestRegionMergeTransactionOnCluster failed in trunk build #4010
> ---------------------------------------------------------------
>
>                 Key: HBASE-8287
>                 URL: https://issues.apache.org/jira/browse/HBASE-8287
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: chunhui shen
>            Assignee: chunhui shen
>            Priority: Minor
>             Fix For: 0.95.2, 0.98.0
>
>         Attachments: hbase-trunk-8287.patch
>
>
> From the log of trunk build #4010:
> {code}
> 2013-04-04 05:45:59,396 INFO  [MASTER_TABLE_OPERATIONS-quirinus.apache.org,53514,1365054344859-0] handler.DispatchMergingRegionHandler(157):
> Cancel merging regions testCleanMergeReference,,1365054353296.bf3d60360122d6c83a246f5f96c2cdd1.,
> testCleanMergeReference,testRow0020,1365054353302.72fbc04566e78aa6732531296256a5aa.,
> because can't move them together after 842ms
>
> 2013-04-04 05:45:59,396 INFO  [hbase-am-zkevent-worker-pool-2-thread-1] master.AssignmentManager$4(1164):
> The master has opened the region testCleanMergeReference,testRow0020,1365054353302.72fbc04566e78aa6732531296256a5aa. that was online on quirinus.apache.org,45718,1365054345790
> {code}
> There is a small probability that the master fails to move the merging regions together onto the same regionserver.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira