hbase-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "nkeywal (Created) (JIRA)" <j...@apache.org>
Subject [jira] [Created] (HBASE-4832) TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
Date Sun, 20 Nov 2011 08:30:51 GMT
TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
---------------------------------------------------------------------------------------

                 Key: HBASE-4832
                 URL: https://issues.apache.org/jira/browse/HBASE-4832
             Project: HBase
          Issue Type: Bug
          Components: coprocessors, test
    Affects Versions: 0.94.0
            Reporter: nkeywal
            Priority: Minor


The current implementation of HRegionServer#stop is

{noformat}
  public void stop(final String msg) {
    this.stopped = true;
    LOG.info("STOPPED: " + msg);
    synchronized (this) {
      // Wakes run() if it is sleeping
      notifyAll(); // FindBugs NN_NAKED_NOTIFY
    }
  }
{noformat}

The notification is sent on the wrong object and does nothing. As a consequence, the region
server continues to sleep instead of waking up and stopping immediately. A correct implementation
is:

{noformat}
  public void stop(final String msg) {
    this.stopped = true;
    LOG.info("STOPPED: " + msg);
    // Wakes run() if it is sleeping
    sleeper.skipSleepCycle();
  }
{noformat}

Then the region server stops immediately. This makes the region server stops 0,5s faster on
average, which is quite useful for unit tests.

However, with this fix, TestRegionServerCoprocessorExceptionWithAbort does not work.
It likely because the code does no expect the region server to stop that fast.

The exception is:
{noformat}
testExceptionFromCoprocessorDuringPut(org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort)
 Time elapsed: 30.06 sec  <<< ERROR!
java.lang.Exception: test timed out after 30000 milliseconds
	at java.lang.Throwable.fillInStackTrace(Native Method)
	at java.lang.Throwable.<init>(Throwable.java:196)
	at java.lang.Exception.<init>(Exception.java:41)
	at java.lang.InterruptedException.<init>(InterruptedException.java:48)
	at java.lang.Thread.sleep(Native Method)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1019)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:804)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:778)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:697)
	at org.apache.hadoop.hbase.client.ServerCallable.connect(ServerCallable.java:75)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1280)
	at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:585)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:154)
	at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
	at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:357)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:866)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:920)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:808)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1469)
	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1354)
	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
	at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
	at org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort.testExceptionFromCoprocessorDuringPut(TestRegionServerCoprocessorExceptionWithAbort.java:84)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
{noformat}

We have this exception because we entered a loop of retries.




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Mime
View raw message