db-derby-dev mailing list archives

From "Knut Anders Hatlen (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (DERBY-5192) Setting up network server for management tests hangs intermittently
Date Fri, 15 Apr 2011 09:35:05 GMT

    [ https://issues.apache.org/jira/browse/DERBY-5192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020253#comment-13020253 ]

Knut Anders Hatlen commented on DERBY-5192:
-------------------------------------------

When looking at the thread dump for the hung process, I see the
following:

- There is a server thread that's blocked on a call to
  shutdownSync.wait() in NetworkServerControlImpl.blockingStart().
  This means the network server is up and running, waiting for a
  shutdown signal.

- There is no ClientThread, whereas one would expect one if the
  network server was up. This is the thread that accepts incoming
  connections from the clients.

- The main JUnit thread is stuck in a read call on the client socket.
  This is consistent with what one would expect when there is no
  ClientThread. The client is able to connect to the server, but the
  server won't process the connection.

Inspecting a heap dump of the process, I notice that the shutdown flag
in the NetworkServerControlImpl instance is actually true. So it looks
like the server has been shut down, but it's still waiting for the
shutdown signal. The only way I can see this happening is if the
shutdown signal is sent before blockingStart() has reached the
shutdownSync.wait() call. The wait/notify code here isn't quite
idiomatic. The wait() call isn't placed inside a while loop checking
the wait condition, so if the shutdown happens too early, the server
won't ever notice the notifyAll() call that tells it to stop waiting.
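
To illustrate the idiom referred to above, here is a minimal sketch of a
guarded wait. It is not the actual Derby code: the class name
ShutdownWaitSketch and its methods are hypothetical, and only the
shutdownSync monitor and the shutdown flag are modelled on what
NetworkServerControlImpl appears to use. Because the flag is re-checked
in a while loop under the same monitor, a shutdown signalled before the
waiter arrives should still be noticed.

```java
/**
 * Hypothetical sketch of an idiomatic wait/notify shutdown signal.
 * Not taken from Derby; names are assumptions for illustration only.
 */
class ShutdownWaitSketch {
    private final Object shutdownSync = new Object();
    private boolean shutdown = false; // guarded by shutdownSync

    /** Signals shutdown. Safe to call before or after the waiter starts. */
    public void requestShutdown() {
        synchronized (shutdownSync) {
            shutdown = true;
            shutdownSync.notifyAll();
        }
    }

    /** Blocks until shutdown has been requested. */
    public void awaitShutdown() throws InterruptedException {
        synchronized (shutdownSync) {
            // Re-check the condition in a loop: this covers both a signal
            // sent before we got here and spurious wakeups.
            while (!shutdown) {
                shutdownSync.wait();
            }
        }
    }

    /** Returns true if a waiter terminates even though the signal came first. */
    public static boolean earlySignalIsNoticed() {
        final ShutdownWaitSketch server = new ShutdownWaitSketch();
        server.requestShutdown(); // signal arrives before anyone waits

        Thread waiter = new Thread(() -> {
            try {
                server.awaitShutdown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        waiter.setDaemon(true); // don't keep the JVM alive if it hangs
        waiter.start();

        try {
            waiter.join(2000); // generous timeout; returns early on success
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !waiter.isAlive();
    }

    public static void main(String[] args) {
        System.out.println("early signal noticed: " + earlySignalIsNoticed());
    }
}
```

With a bare wait() and no loop, the same sequence (signal first, wait
second) would leave the waiter blocked indefinitely, which matches the
state seen in the thread dump and heap dump.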

We have many existing bug reports about problems when starting the
network server from NetworkServerTestSetup (hangs, exceptions printed
by ping(), and timeouts waiting for the network server to start). Some
of these may have the same underlying cause as this bug. A quick JIRA
search gave these hits: DERBY-4176, DERBY-4201, DERBY-4238,
DERBY-4265, DERBY-4319, DERBY-4739, DERBY-4914.

> Setting up network server for management tests hangs intermittently
> -------------------------------------------------------------------
>
>                 Key: DERBY-5192
>                 URL: https://issues.apache.org/jira/browse/DERBY-5192
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server
>    Affects Versions: 10.8.1.1
>         Environment: Derby 10.8.1.0 - FreeBSD 8.2 (i386) - OpenJDK 6 (b20)
> Derby 10.8.1.0 - Oracle Enterprise Linux 6.0 (x86_64) - OpenJDK 6 (b17)
> Derby 10.8.1.1 - Debian GNU/Linux 6.0.1 (i386) - JDK 7 (build 1.7.0-ea-b135)
>            Reporter: Knut Anders Hatlen
>            Assignee: Knut Anders Hatlen
>
> I've seen on three occasions with the 10.8.1.0 and 10.8.1.1 release candidates that
> suites.All has been stuck when setting up the network server decorator for the management
> test suite. Here's what I see at the end of the console output (running with derby.tests.trace=true):
>     [junit] test_jdbc4_1 used 0 ms 
>     [junit] test_jdbc4_1 used 0 ms 
>     [junit] test_notBooted used 1623 ms java.net.SocketException: Connection reset
>     [junit] 	at java.net.SocketInputStream.read(SocketInputStream.java:189)
>     [junit] 	at java.net.SocketInputStream.read(SocketInputStream.java:121)
>     [junit] 	at java.net.SocketInputStream.read(SocketInputStream.java:107)
>     [junit] 	at org.apache.derby.impl.drda.NetworkServerControlImpl.fillReplyBuffer(Unknown Source)
>     [junit] 	at org.apache.derby.impl.drda.NetworkServerControlImpl.readResult(Unknown Source)
>     [junit] 	at org.apache.derby.impl.drda.NetworkServerControlImpl.pingWithNoOpen(Unknown Source)
>     [junit] 	at org.apache.derby.impl.drda.NetworkServerControlImpl.ping(Unknown Source)
>     [junit] 	at org.apache.derby.drda.NetworkServerControl.ping(Unknown Source)
>     [junit] 	at org.apache.derbyTesting.junit.NetworkServerTestSetup.pingForServerUp(NetworkServerTestSetup.java:567)
>     [junit] 	at org.apache.derbyTesting.junit.NetworkServerTestSetup.pingForServerStart(NetworkServerTestSetup.java:636)
>     [junit] 	at org.apache.derbyTesting.junit.NetworkServerTestSetup.setUp(NetworkServerTestSetup.java:196)
>     [junit] 	at junit.extensions.TestSetup$1.protect(TestSetup.java:20)
>     [junit] 	at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] 	at junit.extensions.TestSetup.run(TestSetup.java:25)
>     [junit] 	at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
>     [junit] 	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
>     [junit] 	at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
>     [junit] 	at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] 	at junit.extensions.TestSetup.run(TestSetup.java:25)
>     [junit] 	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
>     [junit] 	at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
>     [junit] 	at junit.framework.TestResult.runProtected(TestResult.java:124)
>     [junit] 	at junit.extensions.TestSetup.run(TestSetup.java:25)
>     [junit] 	at org.apache.derbyTesting.junit.BaseTestSetup.run(BaseTestSetup.java:57)
>     [junit] 	at junit.framework.TestSuite.runTest(TestSuite.java:230)
>     [junit] 	at junit.framework.TestSuite.run(TestSuite.java:225)
>     [junit] 	at junit.framework.TestSuite.runTest(TestSuite.java:230)
>     [junit] 	at junit.framework.TestSuite.run(TestSuite.java:225)
>     [junit] 	at junit.framework.TestSuite.runTest(TestSuite.java:230)
>     [junit] 	at junit.framework.TestSuite.run(TestSuite.java:225)
>     [junit] 	at junit.framework.TestSuite.runTest(TestSuite.java:230)
>     [junit] 	at junit.framework.TestSuite.run(TestSuite.java:225)
>     [junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.run(JUnitTestRunner.java:518)
>     [junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.launch(JUnitTestRunner.java:1052)
>     [junit] 	at org.apache.tools.ant.taskdefs.optional.junit.JUnitTestRunner.main(JUnitTestRunner.java:906)
> The exception trace is just printed to the console, but it doesn't make any test fail.
> (The exception printed on FreeBSD was different; it said "DRDA_InvalidReplyTooShort.S:Invalid
> reply from network server: Insufficient data." The other two looked like the one above.)
> All the hangs have happened on VirtualBox instances, though with different guest operating
> systems and JVMs. Probably the timing is different from what we have on physical machines.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
