activemq-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARTEMIS-1570) SharedNothingBackup does not replicate all journal from live
Date Thu, 18 Jan 2018 18:18:00 GMT

    [ https://issues.apache.org/jira/browse/ARTEMIS-1570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16330889#comment-16330889 ]

ASF subversion and git services commented on ARTEMIS-1570:
----------------------------------------------------------

Commit c4bfb9521fd322c7179d31d5b5f7acf3f25d32dd in activemq-artemis's branch refs/heads/master
from shoukun
[ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=c4bfb95 ]

ARTEMIS-1570 Flush appendExecutor before take journal snapshot

When the live server starts replication, it must make sure there are
no pending writes in the message and bindings journals; otherwise we may
lose journal records during initial replication.

So we need to flush the append executor after acquiring the StorageManager's
write lock and before acquiring the Journal's write lock.
We also set a 10-second timeout on the flush, the same as
Journal::flushExecutor. If the flush does not complete within 10 seconds,
we abort replication, and the backup will try again later.

Use OrderedExecutorFactory::flushExecutor to flush the executor.
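
The flush-then-snapshot idea in this commit message can be sketched with plain java.util.concurrent. This is not the Artemis code: the class FlushBeforeSnapshotSketch, the methods flush and snapshotAfterFlush, and the list standing in for a journal are all hypothetical, and the StorageManager and Journal locks that the real change relies on are left out. The sketch only assumes a single-threaded (ordered) executor, where a marker task runs after everything submitted before it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class FlushBeforeSnapshotSketch {

    /**
     * Waits until every task submitted to a single-threaded executor before
     * this call has run, or until the timeout elapses.
     * Returns true if the flush completed in time.
     */
    static boolean flush(ExecutorService orderedExecutor, long timeout, TimeUnit unit)
            throws InterruptedException {
        CountDownLatch marker = new CountDownLatch(1);
        // On an ordered executor this marker runs only after all earlier tasks.
        orderedExecutor.execute(marker::countDown);
        return marker.await(timeout, unit);
    }

    /** Returns a copy of the journal, or null if the flush timed out (caller aborts). */
    static List<String> snapshotAfterFlush(ExecutorService appendExecutor, List<String> journal)
            throws InterruptedException {
        if (!flush(appendExecutor, 10, TimeUnit.SECONDS)) {
            return null; // hypothetical abort signal; a caller would retry later
        }
        return new ArrayList<>(journal);
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> journal = new CopyOnWriteArrayList<>();
        ExecutorService appendExecutor = Executors.newSingleThreadExecutor();

        // Queue a few slow appends to stand in for pending journal writes.
        for (int i = 0; i < 5; i++) {
            final int id = i;
            appendExecutor.execute(() -> {
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                journal.add("record-" + id);
            });
        }

        List<String> snapshot = snapshotAfterFlush(appendExecutor, journal);
        System.out.println(snapshot == null
                ? "flush timed out, aborting"
                : "snapshot size: " + snapshot.size());
        appendExecutor.shutdown();
    }
}
```

In this sketch nothing stops new appends from arriving between the flush and the copy. The commit message covers that by flushing only after the StorageManager's write lock is held.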


> SharedNothingBackup does not replicate all journal from live
> ------------------------------------------------------------
>
>                 Key: ARTEMIS-1570
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1570
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.4.0
>         Environment: I'm running the unit tests on Windows.
>            Reporter: shoukun huai
>            Priority: Critical
>         Attachments: SharedNothingReplicationTest.java
>
>
> I am trying to test replication while the live server is under heavy IO load.
> Attached is my JUnit test.
> The test uses a slow message persister to simulate a live server that is busy on IO, so
> JournalImpl's `appendExecutor` is busy.
> After starting the live server, the test sends 5 messages, each with a `delay` property of
> 5000 ms, then starts the backup server and waits until it is replicated. It then sends more
> messages without delay.
> It stops the live and backup servers after all messages are sent, then checks the message journal.
> The backup is missing 2 messages/journal entries.
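
The failure mode reported above can be modelled without Artemis at all. The following is a simplified sketch, not the attached SharedNothingReplicationTest.java: the class MissedRecordsSketch, the method missedRecords, and the latch standing in for slow IO are hypothetical. It shows only that a snapshot taken while writes are still queued on a single-threaded append executor does not contain those writes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class MissedRecordsSketch {

    /** Returns how many records a snapshot taken without flushing is missing. */
    static int missedRecords(int pendingWrites) throws InterruptedException {
        List<String> journal = new CopyOnWriteArrayList<>();
        ExecutorService appendExecutor = Executors.newSingleThreadExecutor();
        CountDownLatch slowIo = new CountDownLatch(1); // stands in for slow IO

        for (int i = 0; i < pendingWrites; i++) {
            final int id = i;
            appendExecutor.execute(() -> {
                try {
                    slowIo.await(); // each write blocks until the "IO" completes
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                journal.add("record-" + id);
            });
        }

        // Snapshot taken without flushing: the queued writes are not in it yet.
        List<String> snapshot = new ArrayList<>(journal);

        slowIo.countDown(); // let the pending writes finish
        appendExecutor.shutdown();
        appendExecutor.awaitTermination(10, TimeUnit.SECONDS);

        return journal.size() - snapshot.size();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("records missing from snapshot: " + missedRecords(2));
    }
}
```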



