db-derby-dev mailing list archives

From "Mamta A. Satoor (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4053) suites.All hang with message java.net.BindException: Address already in use: NET_Bind in derby.log
Date Tue, 30 Jun 2009 14:00:47 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12725630#action_12725630 ]

Mamta A. Satoor commented on DERBY-4053:
----------------------------------------

Based on Kathey's suggestion, I tried putting a sleep in the Network Server code, right in
the middle of the ping protocol handshake.

The following is what happens for a ping on the server side (in NetworkServerControlImpl):
private void sendOK(DDMWriter writer) throws Exception 
{ 
      writeCommandReplyHeader(writer); 
      writer.writeByte(OK); 
      writer.flush(); 
} 
I copied the sendOK code inline at the point where the ping is handled in
NetworkServerControlImpl.processCommands(). I then changed the copied code so that the server
sleeps after writing the header but before sending the OK to the ping client, as shown below.
      writeCommandReplyHeader(writer);
      writer.flush();
      System.out.println("before going to sleep");
      Thread.sleep(10000);
      System.out.println("after sleep");
      writer.writeByte(OK);
      System.out.println("after sending OK");
      writer.flush();
      System.out.println("after flushing OK");
With the code changes above, I thought I would be able to reproduce the bug by shutting
down the server while it was still sleeping during the ping handshake (i.e. before the ping
protocol handshake had finished). What I found was that the server shut down properly and the
ping client got the expected "Invalid reply from network server: Insufficient data" error. We
thought that if we brought the server back up and tried a ping against the new server session,
the ping would hang because of the earlier insufficient data, but that didn't happen. A hang
here would probably have duplicated the intermittent hang that we see when the nightly tests
are running.

A little more info on the exact steps for the test case above:
Window 1: start the server
	java org.apache.derby.drda.NetworkServerControl -noSecurityManager start -p 1639
Window 2: ping the server (this puts the server into the sleep)
	java org.apache.derby.drda.NetworkServerControl ping -p 1639
Window 3: while the server is sleeping, send a shutdown request
	java org.apache.derby.drda.NetworkServerControl shutdown -p 1639

After spending more time on the experiment above, I found that the ping client was getting
insufficient data because of the "writer.flush()" which I added right before
Thread.sleep(...). This happened with both the Sun and IBM JVMs (1.6 versions). Once I took the
additional writer.flush() out, the ping client ran successfully and there was no insufficient
data error.

The goal here is to get a small, consistently reproducible test case, which will make debugging
the problem easier, but I have not yet been able to cause the ping to run into insufficient
data in a small repro. I will brainstorm more, but in the meantime, if anyone has any ideas
on what may be causing the insufficient data error, I can pursue those.

> suites.All hang with message java.net.BindException: Address already in use: NET_Bind in derby.log
> ---------------------------------------------------------------------------------------------------
>
>                 Key: DERBY-4053
>                 URL: https://issues.apache.org/jira/browse/DERBY-4053
>             Project: Derby
>          Issue Type: Bug
>          Components: Network Server, Test
>    Affects Versions: 10.5.1.1
>            Reporter: Kathey Marsden
>         Attachments: derby-4053_repro_dont_commit_diff.txt, derby.log, javacore-20090420-1735.txt, javacore.20090211.123031.4000.0001.txt, suites.All.out
>
>
> Running suites.All with IBM 1.5 on 10.5.0.0 alpha - (743198) I got a hang in the test
> run. The last test to run successfully was xtestNestedSavepoints, but I am not sure exactly
> what test caused the hang. I took a thread dump which I will attach, which showed network
> server up and running but no ClientThread and a ping attempt blocked.
> This hang is very similar to the hang that was seen after the fix attempts for DERBY-1465
> but that change was backed out so it is not related to that change. It could be that the
> change for DERBY-1465 just made this highly intermittent problem more likely.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

