db-derby-dev mailing list archives

From "Knut Anders Hatlen (Updated) (JIRA)" <j...@apache.org>
Subject [jira] [Updated] (DERBY-5643) Occasional hangs in replication tests on Linux
Date Tue, 13 Mar 2012 15:34:39 GMT

     [ https://issues.apache.org/jira/browse/DERBY-5643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Knut Anders Hatlen updated DERBY-5643:
--------------------------------------

    Attachment: higher-timeout.diff

Attaching a new patch (higher-timeout.diff) that changes the default server startup timeout
in NetworkServerTestSetup to four minutes. If the server startup eventually succeeds, but
takes more than one minute, the setup will call alarm(...) to flag that something isn't quite
right, but the test won't actually fail unless startup has been unsuccessful for four minutes.
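
For illustration, the two-threshold wait described above could be sketched roughly as follows.
This is not the code from higher-timeout.diff: the class StartupWaiter, the method
waitForStartup and its parameters are hypothetical names, and the ping is passed in as a plain
callback rather than a real network server ping.

```java
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

// Hypothetical sketch; not the actual NetworkServerTestSetup implementation.
class StartupWaiter {
    // Thresholds taken from the message: warn after one minute, fail after four.
    static final long WARN_AFTER_MS = 60_000L;
    static final long FAIL_AFTER_MS = 240_000L;

    /**
     * Pings until the server answers. Returns the elapsed time in milliseconds.
     * Reports through the alarm callback if startup succeeded but took longer
     * than warnAfterMs; throws if the server is still down after failAfterMs.
     */
    static long waitForStartup(BooleanSupplier ping, long warnAfterMs,
                               long failAfterMs, long sleepMs,
                               Consumer<String> alarm) {
        long start = System.nanoTime();
        while (true) {
            boolean up = ping.getAsBoolean();
            long elapsedMs = (System.nanoTime() - start) / 1_000_000L;
            if (up) {
                if (elapsedMs > warnAfterMs) {
                    // Slow but successful: report it, do not fail the test.
                    alarm.accept("Very slow server startup: " + elapsedMs + " ms");
                }
                return elapsedMs;
            }
            if (elapsedMs >= failAfterMs) {
                throw new IllegalStateException(
                        "Server did not start within " + failAfterMs + " ms");
            }
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(
                        "Interrupted while waiting for server startup", e);
            }
        }
    }

    public static void main(String[] args) {
        // Fake ping that succeeds on the third attempt; tiny thresholds for a quick demo.
        int[] attempts = {0};
        long elapsed = waitForStartup(() -> ++attempts[0] >= 3, 10, 5_000, 20,
                msg -> System.out.println("ALARM: " + msg));
        System.out.println("Server up after " + elapsed + " ms");
    }
}
```

In a real test setup the call would use the one-minute and four-minute constants, with the
callback replaced by whatever ping and alarm helpers the test framework provides.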

The patch also makes the replication tests and the compatibility tests use NetworkServerTestSetup's
helper methods to ping the server while waiting for it to start. This makes it possible to
remove some duplicate logic from those tests, and makes the tests behave in a more consistent
manner.

I've run derbyall, suites.All and the compatibility tests successfully with the patch. I've
also had the replication tests running in a loop for four hours on a machine where I saw frequent
hangs before. No hangs so far, but I've seen two occurrences of the alarm message, which indicates
that the condition that previously made the tests hang did happen:

ALARM: Very slow server startup: 189735 ms
ALARM: Very slow server startup: 189850 ms
                
> Occasional hangs in replication tests on Linux
> ----------------------------------------------
>
>                 Key: DERBY-5643
>                 URL: https://issues.apache.org/jira/browse/DERBY-5643
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication, Test
>    Affects Versions: 10.9.0.0
>            Reporter: Knut Anders Hatlen
>         Attachments: higher-timeout.diff, thread-dump.txt, waitFor-2.diff, waitFor.diff
>
>
> We occasionally see hangs in the replication tests on Linux. For example here: http://dbtg.foundry.sun.com/derby/test/Daily/jvm1.6/testing/testlog/sles/1298470-suitesAll_diff.txt
> This test run was stuck in tearDown() after ReplicationRun_Local_Derby4910.testSlaveWaitsForMaster(). (Waiting for Thread.join() to return.)


        
