db-derby-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "Dag H. Wanvik (JIRA)" <j...@apache.org>
Subject [jira] Commented: (DERBY-4186) After failover, test fails when it succeeds in connecting early to failed over slave
Date Mon, 27 Apr 2009 15:20:30 GMT

    [ https://issues.apache.org/jira/browse/DERBY-4186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12703166#action_12703166
] 

Dag H. Wanvik commented on DERBY-4186:
--------------------------------------

Thanks for clarifying the expected behavior here, Ole. It wasn't clear to me what should happen
on the
slave when the master is stopped. 

I quote from docs: "To stop replication of a database, connect to the master database using
the stopMaster=true connection URL attribute. The master sends the remaining log records that
await shipment, and then sends a stop replication command to the slave. The slave then writes
all logs to disk and shuts down the database."

This seems to conflict with the spec: "without shutting down the database". But the code seems
to try to shut down the db.
So I think docs are correct, the spec is out of date on this point..?

I have investigated a bit and it seems what happens when the connection is denied on the slave,
is this:
- The slave receives the stop message
- It  stops reading log, and breaks of recover,
- The boot halts, but the db doesn't really shut down completely.

I haven't been able to determine why it doesn't shut down fully yet. 



> After failover, test fails when it succeeds in connecting early to failed over slave
> ------------------------------------------------------------------------------------
>
>                 Key: DERBY-4186
>                 URL: https://issues.apache.org/jira/browse/DERBY-4186
>             Project: Derby
>          Issue Type: Bug
>          Components: Replication, Test
>    Affects Versions: 10.6.0.0
>            Reporter: Dag H. Wanvik
>
> Occasionally I see this error in ReplicationRun_Local_3_p3:
> 1) testReplication_Local_3_p3_StateNegativeTests(org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3)junit.framework.AssertionFailedError:
Expected SQLState'08004', but got connection!
> 	at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun.waitForSQLState(ReplicationRun.java:332)
> 	at org.apache.derbyTesting.functionTests.tests.replicationTests.ReplicationRun_Local_3_p3.testReplication_Local_3_p3_StateNegativeTests(ReplicationRun_Local_3_p3.java:170)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at org.apache.derbyTesting.junit.BaseTestCase.runBare(BaseTestCase.java:105)
> 	at junit.extensions.TestDecorator.basicRun(TestDecorator.java:24)
> 	at junit.extensions.TestSetup$1.protect(TestSetup.java:21)
> 	at junit.extensions.TestSetup.run(TestSetup.java:25)
> In the code, after a stopMaster is given to the master (should lead to fail-over),
> the tests expects to see CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE (08004.C.7), which will only
succeed if
> the tests gets to try to connect before the failover has started. This seems wrong. If
the failover has completed, it should expect a successful
> connect (which boots the database, btw, since its shut down after auccessful failover).
> Quote from code:
> waitForSQLState("08004", 100L, 20, // 08004.C.7 - CANNOT_CONNECT_TO_DB_IN_SLAVE_MODE
>                 slaveDatabasePath + FS + slaveDbSubPath + FS + replicatedDb,
>                 slaveServerHost, slaveServerPort); // _failOver above fails...
> There is a race between the failover on the slave and the test here I think.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


Mime
View raw message