db-derby-dev mailing list archives

From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4186) After master stop, test fails when it succeeds in connecting (rebooting) shut-down ex-slave
Date Mon, 04 May 2009 22:54:30 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12705809#action_12705809 ]

Dag H. Wanvik commented on DERBY-4186:

Repost answer from Jørgen (Thu, 30 Apr 2009 08:30:05 +0200) here:


Thanks for analyzing and fixing this strange issue! Stopping
replication before the startSlave command had completed was never on
my mind :-/

I had a look at your patch though, and I think you can fix this bug
with even less code.

From SlaveDatabase.java:86:
    /** Set by the database boot thread if it fails before slave mode
     * has been started properly (i.e., if inBoot is true). This
     * exception will then be reported to the client connection. */
    private volatile StandardException bootException;

bootException is only set in one place -
SlaveDatabase#handleShutdown. There you'll also see the reason for the
limbo state that made the tests fail: if an exception makes the slave
replication code call handleShutdown while booting is in progress, the
database is supposed to be shut down by the client thread when it
receives an exception from SlaveDatabase.boot().

As you already found out, that didn't happen because the bootException
was set during the 500 ms wait in verifySuccesfulBoot. However,
this should apply to any exception in bootException, not only
DATABASE_SEVERITY ones (although I *think* only DB severity exceptions
will be reported here).

I would go with the same check that is inside the while loop. Thus,
instead of

+        if (bootException != null &&
+            SQLState.SHUTDOWN_DATABASE.startsWith(
+                bootException.getSQLState()) &&
+            bootException.getSeverity() == ExceptionSeverity.DATABASE_SEVERITY) {

I would simply use

+        if (bootException != null)
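
To illustrate why the check needs to be repeated after the wait loop, here is a
minimal, self-contained sketch of the pattern. It is a simplified model and not
the Derby source: the class BootCheckSketch, the methods markStarted and
verifyBoot, and the started flag are hypothetical names, and the loop structure
is an assumption based on the description above.

```java
/** Simplified model of the boot-verification pattern; not Derby source. */
final class BootCheckSketch {
    // Set by the "boot thread" if it fails while boot is still in progress.
    private volatile Exception bootException;
    private volatile boolean inBoot = true;
    private volatile boolean started = false;

    /** Hypothetical stand-in for handleShutdown: record a failure seen during boot. */
    void handleShutdown(Exception cause) {
        if (inBoot) {
            bootException = cause;
        }
    }

    /** Hypothetical hook: the replication side reports that slave mode is up. */
    void markStarted() {
        started = true;
    }

    /** Waits until started, surfacing any recorded boot failure to the caller. */
    void verifyBoot(long pollMillis) throws Exception {
        while (!started) {
            Exception failure = bootException;
            if (failure != null) {
                throw failure;
            }
            Thread.sleep(pollMillis);
        }
        // The failure may have been recorded during the last sleep, after the
        // in-loop check, so check once more and report any exception at all.
        Exception failure = bootException;
        if (failure != null) {
            throw failure;
        }
        inBoot = false;
    }

    public static void main(String[] args) {
        BootCheckSketch db = new BootCheckSketch();
        db.handleShutdown(new IllegalStateException("stopped during boot"));
        db.markStarted();
        try {
            db.verifyBoot(500L);
            System.out.println("boot verified");
        } catch (Exception e) {
            System.out.println("boot failed: " + e.getMessage());
        }
    }
}
```

In this model the loop is skipped entirely because started is already true, so
only the check after the loop reports the failure to the client thread.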

> After master stop, test fails when it succeeds in connecting (rebooting) shut-down ex-slave
> -------------------------------------------------------------------------------------------
>                 Key: DERBY-4186
>                 URL: https://issues.apache.org/jira/browse/DERBY-4186
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication, Test
>    Affects Versions:
>            Reporter: Dag H. Wanvik
>            Assignee: Dag H. Wanvik
>         Attachments: bad-slave.txt, derby-4186-2.diff, derby-4186-2.stat, derby-4186.diff, derby-4186.stat, ok-slave.txt
> Occasionally I see this error in ReplicationRun_Local_3_p3:
> 1) testReplication_Local_3_p3_StateNegativeTests(org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3)junit.framework.AssertionFailedError: Expected SQLState'08004', but got connection!
> 	at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.waitForSQLState(ReplicationRun.java:332)
> 	at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3.testReplication_Local_3_p3_StateNegativeTests(ReplicationRun_Local_3_p3.java:170)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:105)
> 	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
> 	at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
> 	at junit.extensions.TestSetup.run(TestSetup.java:25)
> In the code, after a stopMaster is given to the master (should lead to fail-over),
> the test expects to see CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE (08004.C.7), which will only
> succeed if the test gets to try to connect before the failover has started. This seems
> wrong. If the failover has completed, it should expect a successful connect (which boots
> the database, btw, since it's shut down after a successful failover).
> Quote from code:
> waitForSQLState("08004", 100L, 20, // 08004.C.7 - CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE
>                 slaveDatabasePath + FS + slaveDbSubPath + FS + replicatedDb,
>                 slaveServerHost, slaveServerPort); // _failOver above fails...
> There is a race between the failover on the slave and the test here I think.
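
One way to make a test tolerant of the race described in the issue is to treat
both outcomes as legitimate: either the expected SQLState is seen (failover not
yet done), or the connect succeeds (failover completed). The sketch below only
illustrates that idea. It is not ReplicationRun.waitForSQLState, and
ConnectOutcomeSketch, ConnectAttempt, and waitForStateOrConnection are
hypothetical names introduced for illustration.

```java
import java.sql.SQLException;

/** Illustrative polling helper; not the Derby test harness code. */
final class ConnectOutcomeSketch {
    enum Outcome { SAW_EXPECTED_STATE, CONNECTED, GAVE_UP }

    /** Hypothetical abstraction over one connection attempt. */
    interface ConnectAttempt {
        void connect() throws SQLException;
    }

    /**
     * Tries to connect up to maxTries times, sleeping between attempts.
     * Returns as soon as the attempt succeeds or fails with expectedState.
     */
    static Outcome waitForStateOrConnection(ConnectAttempt attempt,
            String expectedState, long sleepMillis, int maxTries)
            throws InterruptedException {
        for (int i = 0; i < maxTries; i++) {
            try {
                attempt.connect();
                // Failover already completed: the connect booted the database.
                return Outcome.CONNECTED;
            } catch (SQLException e) {
                if (expectedState.equals(e.getSQLState())) {
                    // Database still in slave mode: the originally expected result.
                    return Outcome.SAW_EXPECTED_STATE;
                }
                // Any other SQLState: fall through and retry after a pause.
            }
            Thread.sleep(sleepMillis);
        }
        return Outcome.GAVE_UP;
    }
}
```

A caller could then accept either SAW_EXPECTED_STATE or CONNECTED and fail only
on GAVE_UP, instead of asserting on 08004 alone.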

