hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "stack (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (HBASE-8940) TestRegionMergeTransactionOnCluster#testWholesomeMerge may fail due to race in opening region
Date Fri, 12 Jul 2013 22:13:49 GMT

    [ https://issues.apache.org/jira/browse/HBASE-8940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13707453#comment-13707453
] 

stack commented on HBASE-8940:
------------------------------

I think this a test issue.  What you think [~jxiang]?  We are asking to merge a region that
the regionserver is NotServering.   Merge should fail.

I think this issue will be fixed by the Enis's "HBASE-8934 Fix bunch of flaky tests" -- the
region will be fully opened before the test starts.
                
> TestRegionMergeTransactionOnCluster#testWholesomeMerge may fail due to race in opening
region
> ---------------------------------------------------------------------------------------------
>
>                 Key: HBASE-8940
>                 URL: https://issues.apache.org/jira/browse/HBASE-8940
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Ted Yu
>            Assignee: Ted Yu
>         Attachments: 8940-v1.txt
>
>
> From http://54.241.6.143/job/HBase-TRUNK-Hadoop-2/org.apache.hbase$hbase-server/395/testReport/org.apache.hadoop.hbase.regionserver/TestRegionMergeTransactionOnCluster/testWholesomeMerge/
:
> {code}
> 013-07-11 09:33:44,154 INFO  [AM.ZK.Worker-pool-2-thread-2] master.RegionStates(309):
Offlined 3ffefd878a234031675de6b2c70b2ead from ip-10-174-118-204.us-west-1.compute.internal,60498,1373535184820
> 2013-07-11 09:33:44,154 INFO  [AM.ZK.Worker-pool-2-thread-2] master.AssignmentManager$4(1223):
The master has opened testWholesomeMerge,testRow0020,1373535210125.3ffefd878a234031675de6b2c70b2ead.
that was online on ip-10-174-118-204.us-west-1.compute.internal,59210,1373535184884
> 2013-07-11 09:33:44,182 DEBUG [RS_OPEN_REGION-ip-10-174-118-204:59210-1] zookeeper.ZKAssign(862):
regionserver:59210-0x13fcd13a20c0002 Successfully transitioned node 3ffefd878a234031675de6b2c70b2ead
from RS_ZK_REGION_OPENING to RS_ZK_REGION_OPENED
> 2013-07-11 09:33:44,182 INFO  [MASTER_TABLE_OPERATIONS-ip-10-174-118-204:39405-0] handler.DispatchMergingRegionHandler(154):
Failed send MERGE REGIONS RPC to server ip-10-174-118-204.us-west-1.compute.internal,59210,1373535184884
for region testWholesomeMerge,,1373535210124.efcb10dcfa250e31bfd50dc6c7049f32.,testWholesomeMerge,testRow0020,1373535210125.3ffefd878a234031675de6b2c70b2ead.,
focible=false, org.apache.hadoop.hbase.exceptions.RegionOpeningException: Region is being
opened: 3ffefd878a234031675de6b2c70b2ead
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:2566)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3862)
> 	at org.apache.hadoop.hbase.regionserver.HRegionServer.mergeRegions(HRegionServer.java:3649)
> 	at org.apache.hadoop.hbase.protobuf.generated.AdminProtos$AdminService$2.callBlockingMethod(AdminProtos.java:14400)
> 	at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2124)
> 	at org.apache.hadoop.hbase.ipc.RpcServer$Handler.run(RpcServer.java:1831)
> 2013-07-11 09:33:44,182 DEBUG [RS_OPEN_REGION-ip-10-174-118-204:59210-1] handler.OpenRegionHandler(373):
region transitioned to opened in zookeeper: {ENCODED => 3ffefd878a234031675de6b2c70b2ead,
NAME => 'testWholesomeMerge,testRow0020,1373535210125.3ffefd878a234031675de6b2c70b2ead.',
STARTKEY => 'testRow0020', ENDKEY => 'testRow0040'}, server: ip-10-174-118-204.us-west-1.compute.internal,59210,1373535184884
> 2013-07-11 09:33:44,183 DEBUG [RS_OPEN_REGION-ip-10-174-118-204:59210-1] handler.OpenRegionHandler(186):
Opened testWholesomeMerge,testRow0020,1373535210125.3ffefd878a234031675de6b2c70b2ead. on server:ip-10-174-118-204.us-west-1.compute.internal,59210,1373535184884
> {code}
> We can see that MASTER_TABLE_OPERATIONS thread couldn't get region 3ffefd878a234031675de6b2c70b2ead
because RS_OPEN_REGION thread finished region opening 1 millisecond later.
> One solution is to retry operation when receiving RegionOpeningException

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message