hbase-issues mailing list archives

From "Lars Hofhansl (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-11037) Race condition in TestZKBasedOpenCloseRegion
Date Fri, 18 Apr 2014 23:28:17 GMT
Lars Hofhansl created HBASE-11037:
-------------------------------------

             Summary: Race condition in TestZKBasedOpenCloseRegion
                 Key: HBASE-11037
                 URL: https://issues.apache.org/jira/browse/HBASE-11037
             Project: HBase
          Issue Type: Bug
            Reporter: Lars Hofhansl
             Fix For: 0.94.19


testCloseRegion is called before testReOpenRegion.

Here's the sequence of events:
{code}
2014-04-18 20:58:05,645 INFO  [Thread-380] master.TestZKBasedOpenCloseRegion(313): Running testCloseRegion
2014-04-18 20:58:05,645 INFO  [Thread-380] master.TestZKBasedOpenCloseRegion(315): Number of region servers = 2
2014-04-18 20:58:05,645 INFO  [Thread-380] master.TestZKBasedOpenCloseRegion(164): -ROOT-,,0.70236052
2014-04-18 20:58:05,646 DEBUG [Thread-380] master.TestZKBasedOpenCloseRegion(320): Asking RS to close region -ROOT-,,0.70236052
...
2014-04-18 20:58:06,237 INFO  [RS_CLOSE_ROOT-hemera.apache.org,46533,1397854669633-0] regionserver.HRegion(1148): Closed -ROOT-,,0.70236052
...
2014-04-18 20:58:06,404 INFO  [Thread-380] master.TestZKBasedOpenCloseRegion(333): Done with testCloseRegion
{code}
Then testReOpenRegion starts:
{code}
2014-04-18 20:58:06,431 INFO  [pool-1-thread-1] hbase.ResourceChecker(157): before master.TestZKBasedOpenCloseRegion#testReOpenRegion: 234 threads, 388 file descriptors 4 connections,
...
2014-04-18 20:58:06,466 DEBUG [MASTER_OPEN_REGION-hemera.apache.org,52650,1397854669138-3] zookeeper.ZKUtil(1597): master:52650-0x14576a1835d0000 Retrieved 62 byte(s) of data from znode /hbase/unassigned/70236052; data=region=-ROOT-,,0, origin=hemera.apache.org,46533,1397854669633, state=RS_ZK_REGION_OPENED
2014-04-18 20:58:06,473 DEBUG [pool-1-thread-1] client.ClientScanner(191): Finished with scanning at {NAME => '.META.,,1', STARTKEY => '', ENDKEY => '', ENCODED => 1028785192,}
2014-04-18 20:58:06,473 INFO  [Thread-396] master.TestZKBasedOpenCloseRegion(123): Number of region servers = 2
2014-04-18 20:58:06,474 INFO  [Thread-396] master.TestZKBasedOpenCloseRegion(164): -ROOT-,,0.70236052
2014-04-18 20:58:06,474 DEBUG [Thread-396] master.TestZKBasedOpenCloseRegion(130): Asking RS to close region -ROOT-,,0.70236052
2014-04-18 20:58:06,474 INFO  [Thread-396] master.TestZKBasedOpenCloseRegion(147): Unassign -ROOT-,,0.70236052
2014-04-18 20:58:06,474 DEBUG [Thread-396] master.AssignmentManager(2126): Starting unassignment of region -ROOT-,,0.70236052 (offlining)
2014-04-18 20:58:06,475 DEBUG [Thread-396] master.AssignmentManager(2132): Attempted to unassign region -ROOT-,,0.70236052 but it is not currently assigned anywhere
2014-04-18 20:58:06,478 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(294): master:52650-0x14576a1835d0000 Received ZooKeeper Event, type=NodeDeleted, state=SyncConnected, path=/hbase/unassigned/70236052
2014-04-18 20:58:06,478 DEBUG [pool-1-thread-1-EventThread] master.AssignmentManager(1176): The znode of region -ROOT-,,0.70236052 has been deleted.
2014-04-18 20:58:06,478 INFO  [pool-1-thread-1-EventThread] master.AssignmentManager(1188): The master has opened the region -ROOT-,,0.70236052 that was online on hemera.apache.org,46533,1397854669633
2014-04-18 20:58:06,478 DEBUG [pool-1-thread-1-EventThread] zookeeper.ZooKeeperWatcher(294): master:52650-0x14576a1835d0000 Received ZooKeeper Event, type=NodeChildrenChanged, state=SyncConnected, path=/hbase/unassigned
{code}
After that, nothing further happens. testCloseRegion unassigns the ROOT region, and testReOpenRegion
starts before ROOT has been reassigned. testReOpenRegion's unassign request is therefore a no-op,
and the test waits forever for a close event that never arrives.

The key log line is: "master.AssignmentManager(2132): Attempted to unassign region -ROOT-,,0.70236052
but it is not currently assigned anywhere"
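
To make the hang concrete, here is a minimal, self-contained sketch of the pattern. It is a toy model, not HBase code: the class and method names (CloseEventRaceSketch, unassign, closeAndWait) are hypothetical, and a CountDownLatch stands in for the close event the test listens for. The only point is that an unassign request against a region that is not currently assigned is dropped, so a waiter without a timeout would block indefinitely.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Toy model of the hang; every name here is hypothetical, not an HBase API. */
public class CloseEventRaceSketch {

    /** Stand-in for the master: a close event is only produced for an assigned region. */
    public static void unassign(boolean regionAssigned, CountDownLatch closeEvent) {
        if (!regionAssigned) {
            // Models "Attempted to unassign region ... but it is not currently
            // assigned anywhere": the request is dropped and no event follows.
            return;
        }
        // Models the region server closing the region asynchronously.
        new Thread(closeEvent::countDown).start();
    }

    /** Returns true if the close event arrived within timeoutMs, false otherwise. */
    public static boolean closeAndWait(boolean regionAssigned, long timeoutMs) {
        CountDownLatch closeEvent = new CountDownLatch(1);
        unassign(regionAssigned, closeEvent);
        try {
            // The bounded wait is only here so the sketch terminates;
            // an unbounded wait would hang in the unassigned case.
            return closeEvent.await(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("region assigned:     " + closeAndWait(true, 1000));
        System.out.println("region not assigned: " + closeAndWait(false, 200));
    }
}
```

In the second call the latch is never counted down, which corresponds to testReOpenRegion waiting on a close event after its unassign request was ignored.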

The easiest fix is to run testCloseRegion last (as it was before we switched JUnit).
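
As a complement to reordering the tests (and not the fix proposed here), a test could also wait until ROOT is assigned again before asking for it to be closed. The sketch below shows only the generic polling pattern in plain Java. The names WaitForAssignmentSketch and waitUntil are hypothetical, and the AtomicBoolean is a stand-in for whatever check the real test would use to see that ROOT is assigned.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Generic poll-with-timeout helper; hypothetical, not part of HBase. */
public class WaitForAssignmentSketch {

    /** Polls condition every pollMs until it is true or timeoutMs has elapsed. */
    public static boolean waitUntil(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long start = System.nanoTime();
        long timeoutNanos = timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - start >= timeoutNanos) {
                return false; // give up instead of waiting forever
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // Assumption: this flag stands in for "ROOT is assigned somewhere".
        AtomicBoolean rootAssigned = new AtomicBoolean(false);

        // Simulates the master finishing the reassignment a little later.
        new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            rootAssigned.set(true);
        }).start();

        boolean ready = waitUntil(rootAssigned::get, 2000, 10);
        System.out.println("ROOT assigned before close request: " + ready);
    }
}
```

Failing fast on a timeout also turns a silent hang into an explicit test failure, which is easier to diagnose from the logs.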



--
This message was sent by Atlassian JIRA
(v6.2#6252)
