X-Original-To: apmail-hbase-issues-archive@www.apache.org
Delivered-To: apmail-hbase-issues-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
	by minotaur.apache.org (Postfix) with SMTP id F3837772C;
	Tue, 22 Nov 2011 00:26:04 +0000 (UTC)
Received: (qmail 89336 invoked by uid 500); 22 Nov 2011 00:26:04 -0000
Delivered-To: apmail-hbase-issues-archive@hbase.apache.org
Received: (qmail 89308 invoked by uid 500); 22 Nov 2011 00:26:04 -0000
Mailing-List: contact issues-help@hbase.apache.org; run by ezmlm
Precedence: bulk
Delivered-To: mailing list issues@hbase.apache.org
Received: (qmail 89298 invoked by uid 99); 22 Nov 2011 00:26:04 -0000
Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Nov 2011 00:26:04 +0000
X-ASF-Spam-Status: No, hits=-2001.2 required=5.0
	tests=ALL_TRUSTED,RP_MATCHES_RCVD
X-Spam-Check-By: apache.org
Received: from [140.211.11.116] (HELO hel.zones.apache.org) (140.211.11.116)
	by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 22 Nov 2011 00:26:00 +0000
Received: from hel.zones.apache.org (hel.zones.apache.org [140.211.11.116])
	by hel.zones.apache.org (Postfix) with ESMTP id 5834C8A238;
	Tue, 22 Nov 2011 00:25:40 +0000 (UTC)
Date: Tue, 22 Nov 2011 00:25:40 +0000 (UTC)
From: "Ted Yu (Commented) (JIRA)"
To: issues@hbase.apache.org
Message-ID: <735392393.425.1321921540362.JavaMail.tomcat@hel.zones.apache.org>
In-Reply-To: <632852680.48696.1321777851686.JavaMail.tomcat@hel.zones.apache.org>
Subject: [jira] [Commented] (HBASE-4832)
	TestRegionServerCoprocessorExceptionWithAbort fails if the region server
	stops too fast
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394

[
https://issues.apache.org/jira/browse/HBASE-4832?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13154766#comment-13154766 ]

Ted Yu commented on HBASE-4832:
-------------------------------

The test failures were due to 'Too many open files'.

> TestRegionServerCoprocessorExceptionWithAbort fails if the region server stops too fast
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-4832
>                 URL: https://issues.apache.org/jira/browse/HBASE-4832
>             Project: HBase
>          Issue Type: Bug
>          Components: coprocessors, test
>    Affects Versions: 0.94.0
>            Reporter: nkeywal
>            Assignee: Eugene Koontz
>            Priority: Minor
>         Attachments: 4832-timeout.txt, 4832_trunk_hregionserver.patch, HBASE-4832.patch, HBASE-4832.patch
>
>
> The current implementation of HRegionServer#stop is
> {noformat}
>   public void stop(final String msg) {
>     this.stopped = true;
>     LOG.info("STOPPED: " + msg);
>     synchronized (this) {
>       // Wakes run() if it is sleeping
>       notifyAll(); // FindBugs NN_NAKED_NOTIFY
>     }
>   }
> {noformat}
> The notification is sent on the wrong object and does nothing. As a consequence, the region server continues to sleep instead of waking up and stopping immediately. A correct implementation is:
> {noformat}
>   public void stop(final String msg) {
>     this.stopped = true;
>     LOG.info("STOPPED: " + msg);
>     // Wakes run() if it is sleeping
>     sleeper.skipSleepCycle();
>   }
> {noformat}
> Then the region server stops immediately. This makes the region server stop 0.5 s faster on average, which is quite useful for unit tests.
> However, with this fix, TestRegionServerCoprocessorExceptionWithAbort does not work.
> This is likely because the code does not expect the region server to stop that fast.
> The exception is:
> {noformat}
> testExceptionFromCoprocessorDuringPut(org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort)  Time elapsed: 30.06 sec  <<< ERROR!
> java.lang.Exception: test timed out after 30000 milliseconds
> 	at java.lang.Throwable.fillInStackTrace(Native Method)
> 	at java.lang.Throwable.<init>(Throwable.java:196)
> 	at java.lang.Exception.<init>(Exception.java:41)
> 	at java.lang.InterruptedException.<init>(InterruptedException.java:48)
> 	at java.lang.Thread.sleep(Native Method)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:1019)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:804)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.relocateRegion(HConnectionManager.java:778)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionLocation(HConnectionManager.java:697)
> 	at org.apache.hadoop.hbase.client.ServerCallable.connect(ServerCallable.java:75)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getRegionServerWithRetries(HConnectionManager.java:1280)
> 	at org.apache.hadoop.hbase.client.HTable.getRowOrBefore(HTable.java:585)
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:154)
> 	at org.apache.hadoop.hbase.client.MetaScanner.access$000(MetaScanner.java:52)
> 	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:130)
> 	at org.apache.hadoop.hbase.client.MetaScanner$1.connect(MetaScanner.java:127)
> 	at org.apache.hadoop.hbase.client.HConnectionManager.execute(HConnectionManager.java:357)
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:127)
> 	at org.apache.hadoop.hbase.client.MetaScanner.metaScan(MetaScanner.java:103)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.prefetchRegionCache(HConnectionManager.java:866)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegionInMeta(HConnectionManager.java:920)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.locateRegion(HConnectionManager.java:808)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatchCallback(HConnectionManager.java:1469)
> 	at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.processBatch(HConnectionManager.java:1354)
> 	at org.apache.hadoop.hbase.client.HTable.flushCommits(HTable.java:892)
> 	at org.apache.hadoop.hbase.client.HTable.doPut(HTable.java:750)
> 	at org.apache.hadoop.hbase.client.HTable.put(HTable.java:725)
> 	at org.apache.hadoop.hbase.coprocessor.TestRegionServerCoprocessorExceptionWithAbort.testExceptionFromCoprocessorDuringPut(TestRegionServerCoprocessorExceptionWithAbort.java:84)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20)
> 	at org.junit.internal.runners.statements.FailOnTimeout$StatementThread.run(FailOnTimeout.java:62)
> {noformat}
> We have this exception because we entered a loop of retries.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
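
The stop() problem described in the quoted issue (a notifyAll() on a monitor that no thread is waiting on) can be illustrated with a small, self-contained Java sketch. This is not HBase code. SimpleSleeper, ToyServer and WrongMonitorDemo are hypothetical names, and SimpleSleeper is only a simplified stand-in for the sleeper mentioned in the issue, assumed here to wait on its own private lock.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical, simplified sleeper: it waits on its own private lock object. */
final class SimpleSleeper {
    private final Object sleepLock = new Object();
    private boolean skip = false;

    /** Sleeps for up to periodMs and returns the time actually slept, in ms. */
    long sleep(long periodMs) throws InterruptedException {
        long start = System.nanoTime();
        long deadline = start + TimeUnit.MILLISECONDS.toNanos(periodMs);
        synchronized (sleepLock) {
            while (!skip) {
                long remainingMs =
                    TimeUnit.NANOSECONDS.toMillis(deadline - System.nanoTime());
                if (remainingMs <= 0) {
                    break; // period is over; also avoids wait(0), which waits forever
                }
                sleepLock.wait(remainingMs);
            }
            skip = false;
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Wakes the sleeping thread by notifying the lock it actually waits on. */
    void skipSleepCycle() {
        synchronized (sleepLock) {
            skip = true;
            sleepLock.notifyAll();
        }
    }
}

/** Hypothetical server with the two stop() variants from the issue text. */
final class ToyServer {
    final SimpleSleeper sleeper = new SimpleSleeper();
    volatile boolean stopped = false;

    /** Same shape as the original stop(): notifies 'this', which nobody waits on. */
    void stopWithNakedNotify() {
        stopped = true;
        synchronized (this) {
            notifyAll();
        }
    }

    /** Same shape as the proposed fix: wakes the sleeper directly. */
    void stopViaSleeper() {
        stopped = true;
        sleeper.skipSleepCycle();
    }
}

final class WrongMonitorDemo {
    /** Returns how long (ms) the run-loop thread slept before it woke up. */
    static long timeStop(boolean useSleeper, long periodMs) throws InterruptedException {
        ToyServer server = new ToyServer();
        long[] slept = new long[1];
        Thread runLoop = new Thread(() -> {
            try {
                slept[0] = server.sleeper.sleep(periodMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        runLoop.start();
        Thread.sleep(100); // give the run loop time to start sleeping
        if (useSleeper) {
            server.stopViaSleeper();
        } else {
            server.stopWithNakedNotify();
        }
        runLoop.join(); // join() makes the write to slept[0] visible here
        return slept[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("notifyAll on this: slept ms = " + timeStop(false, 1000));
        System.out.println("skipSleepCycle:    slept ms = " + timeStop(true, 1000));
    }
}
```

In this sketch the first variant should sleep for roughly the whole period, because the waiting thread holds a different monitor than the one being notified, while the second variant should return shortly after the stop call. The exact timings depend on the machine and the scheduler.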