hbase-issues mailing list archives

From "Kannan Muthukkaruppan (JIRA)" <j...@apache.org>
Subject [jira] Created: (HBASE-2881) TestAdmin intermittent failures: Race condition during createTable can result in region double assignment
Date Tue, 27 Jul 2010 03:03:16 GMT
TestAdmin intermittent failures: Race condition during createTable can result in region double
assignment

                 Key: HBASE-2881
                 URL: https://issues.apache.org/jira/browse/HBASE-2881
             Project: HBase
          Issue Type: Bug
            Reporter: Kannan Muthukkaruppan

The TestAdmin test fails intermittently on trunk because it is unable to "enable" a "disabled"
table. However, the root cause seems to be that much earlier, at "createTable" time, the table's
region got assigned to 2 region servers, and this later confuses the "disable"/"enable" code.

createTable goes down to RegionManager.java:createRegion:

public void createRegion(HRegionInfo newRegion, HRegionInterface server,
      byte [] metaRegionName)
  throws IOException {
    // 2. Create the HRegion
    HRegion region = HRegion.createHRegion(newRegion, this.master.getRootDir(),

    // 3. Insert into meta
    HRegionInfo info = region.getRegionInfo();
    byte [] regionName = region.getRegionName();

    Put put = new Put(regionName);
    server.put(metaRegionName, put);

    // 4. Close the new region to flush it to disk.  Close its log file too.

    // 5. Get it assigned to a server
    setUnassigned(info, true);

If the MetaScanner runs after #3 but before #5, it'll find this region in unassigned
state and assign it out as well.

Then #5 comes along and again "force" sets this region to be unassigned, causing it to
get assigned again to a different region server (as part of the RegionManager's job of assigning
out regions waiting to be assigned along with region server heartbeats).
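
To make the interleaving concrete, here is a toy model of the sequence described above. It is
only a sketch under stated assumptions: the class DoubleAssignmentSketch, its fields, and its
method bodies are hypothetical stand-ins and are not the actual RegionManager or MetaScanner
code. It only mirrors the order of events: insert into meta (#3), a meta scan, the forced
setUnassigned (#5), and region server heartbeats that pick up queued regions.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical toy model of the race; none of these names are HBase APIs.
final class DoubleAssignmentSketch {
    // Simulated meta table: region name -> server recorded for it (null = none yet).
    private final Map<String, String> meta = new LinkedHashMap<>();
    // Regions waiting to be handed out on the next region server heartbeat.
    private final Set<String> unassigned = new LinkedHashSet<>();
    // Every server that was ever told to open a given region.
    private final Map<String, List<String>> opened = new LinkedHashMap<>();

    // Step #3: the region becomes visible in meta with no server recorded.
    void insertIntoMeta(String region) {
        meta.put(region, null);
    }

    // Meta scanner pass: queue any region that meta shows without a server.
    void metaScan() {
        for (Map.Entry<String, String> row : meta.entrySet()) {
            if (row.getValue() == null) {
                unassigned.add(row.getKey());
            }
        }
    }

    // Step #5: with force == true the region is queued even if already assigned.
    void setUnassigned(String region, boolean force) {
        if (force || meta.get(region) == null) {
            unassigned.add(region);
        }
    }

    // Heartbeat: hand every queued region to the reporting server.
    void heartbeat(String server) {
        for (String region : unassigned) {
            meta.put(region, server);
            opened.computeIfAbsent(region, r -> new ArrayList<>()).add(server);
        }
        unassigned.clear();
    }

    List<String> serversFor(String region) {
        return opened.getOrDefault(region, new ArrayList<>());
    }

    // The meta scan and a heartbeat slip in between #3 and #5.
    public static List<String> racyInterleaving() {
        DoubleAssignmentSketch m = new DoubleAssignmentSketch();
        m.insertIntoMeta("r1");       // #3
        m.metaScan();                 // scanner sees r1 without a server
        m.heartbeat("rs1");           // r1 handed to rs1
        m.setUnassigned("r1", true);  // #5 forces r1 back into the queue
        m.heartbeat("rs2");           // r1 handed to rs2 as well
        return m.serversFor("r1");
    }

    // #5 runs right after #3; the later meta scan finds r1 already assigned.
    public static List<String> expectedInterleaving() {
        DoubleAssignmentSketch m = new DoubleAssignmentSketch();
        m.insertIntoMeta("r1");       // #3
        m.setUnassigned("r1", true);  // #5
        m.heartbeat("rs1");
        m.metaScan();
        m.heartbeat("rs2");
        return m.serversFor("r1");
    }

    public static void main(String[] args) {
        System.out.println("racy:     " + racyInterleaving());
        System.out.println("expected: " + expectedInterleaving());
    }
}
```

In this model the racy ordering leaves r1 opened on both rs1 and rs2, while the expected ordering
leaves it on rs1 only. The real code paths are more involved, so treat this as an illustration of
the ordering problem rather than a reproduction.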


The failing test in question is TestAdmin:testHundredsOfTable(). I tried repro'ing this
more reliably by modifying the test to have the MetaScanner run more frequently:

  TEST_UTIL.getConfiguration().setInt("hbase.master.meta.thread.rescanfrequency", 1000); // 1 second

(instead of the default 60 seconds), but it didn't improve the reproducibility.


This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
