activemq-dev mailing list archives

From yinghe0101 <>
Subject Re: Slave broker out of sync with master
Date Thu, 17 Jul 2008 13:30:09 GMT

Hi Gary,

I will see what I can do about a test, but my scenario is a little
different. On the slave, I use failover for masterConnectorURI (e.g.
masterConnectorURI="failover://(tcp://master:61616)") because we want to
make sure the slave is attached whenever we start the master and slave.
With plain tcp, there is a chance that the slave starts as a master, because
it starts anyway when it cannot connect to the master. Also,
shutdownOnMasterFailure does not work with tcp (and even if it worked, we
don't want the slave to try only once and stop).

The issue with using failover for masterConnectorURI is that when it
reconnects, it does not send the BrokerInfo, so the master does not know
that it is a slave. My fix for this is in the patch below.
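
To make the reconnect problem concrete, here is a minimal, self-contained
sketch in plain Java. It does not use the ActiveMQ classes: FakeMaster,
FakeFailoverTransport and ResumeListener are hypothetical stand-ins, and the
"BrokerInfo" command is modelled as a plain string. It only illustrates the
idea behind the patch, namely re-announcing the slave from a
transportResumed callback after the failover transport reconnects.

```java
public class ResendOnResumeSketch {

    /** Hypothetical master: remembers whether the peer announced itself as a slave. */
    static class FakeMaster {
        boolean peerKnownAsSlave = false;

        void onCommand(String command) {
            if ("BrokerInfo".equals(command)) {
                peerKnownAsSlave = true;
            }
        }

        /** A restart loses all per-connection state. */
        void restart() {
            peerKnownAsSlave = false;
        }
    }

    /** Hypothetical callback, loosely modelled on a transport listener. */
    interface ResumeListener {
        void transportResumed();
    }

    /** Hypothetical failover transport: reconnects silently, then notifies the listener. */
    static class FakeFailoverTransport {
        private final FakeMaster master;
        private ResumeListener listener;

        FakeFailoverTransport(FakeMaster master) {
            this.master = master;
        }

        void setListener(ResumeListener listener) {
            this.listener = listener;
        }

        void oneway(String command) {
            master.onCommand(command);
        }

        /** Simulates: master killed, master restarted, connection re-established. */
        void simulateMasterRestart() {
            master.restart();
            if (listener != null) {
                listener.transportResumed();
            }
        }
    }

    public static void main(String[] args) {
        FakeMaster master = new FakeMaster();
        FakeFailoverTransport transport = new FakeFailoverTransport(master);

        // Initial start: the slave announces itself once.
        transport.oneway("BrokerInfo");

        // No resume handling: after the reconnect the master has forgotten the slave.
        transport.simulateMasterRestart();
        System.out.println("known as slave without resend: " + master.peerKnownAsSlave);

        // With resume handling: BrokerInfo is sent again after every reconnect.
        transport.setListener(() -> transport.oneway("BrokerInfo"));
        transport.simulateMasterRestart();
        System.out.println("known as slave with resend: " + master.peerKnownAsSlave);
    }
}
```

In the real MasterConnector the resend has to use the same brokerInfo that
was sent at startup, which is why the patch turns it into a field.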

The "Slave broker out of sync" exception only happens when I use failover
for masterConnectorURI, kill the master, restart it, and then, after the
reconnect is established, use a producer to send some messages. At that
time there are other transacted sessions connected to the master that
consume, process, publish and then commit the messages. It is a little
complicated, but our application requires it.
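
For context, the exception comes from an ordering check on the slave, as
described in my earlier message quoted below: a message must be in the
pending list before its MessageDispatchNotification arrives, and the
MessageAck must come after that. The following is a simplified sketch of
that check in plain Java. SlaveSubscription and its methods are hypothetical
names for illustration only, not the real PrefetchSubscription code.

```java
import java.util.LinkedHashSet;
import java.util.Set;

public class SlaveOrderingSketch {

    /** Hypothetical, heavily simplified view of a subscription on the slave. */
    static class SlaveSubscription {
        private final Set<String> pending = new LinkedHashSet<>();
        private final Set<String> dispatched = new LinkedHashSet<>();

        /** Step 1: the replicated message is added to the pending list. */
        void addPending(String messageId) {
            pending.add(messageId);
        }

        /** Step 2: the master reports that it dispatched the message. */
        void onDispatchNotification(String messageId) {
            if (!pending.remove(messageId)) {
                throw new IllegalStateException(
                    "Slave broker out of sync with master: " + messageId
                        + " was not in the pending list");
            }
            dispatched.add(messageId);
        }

        /** Step 3: the ack is only valid for a message in the dispatch list. */
        void onAck(String messageId) {
            if (!dispatched.remove(messageId)) {
                throw new IllegalStateException(messageId + " is not in the dispatch list");
            }
        }
    }

    public static void main(String[] args) {
        // Expected order: pending, then dispatch notification, then ack.
        SlaveSubscription inOrder = new SlaveSubscription();
        inOrder.addPending("msg-1");
        inOrder.onDispatchNotification("msg-1");
        inOrder.onAck("msg-1");
        System.out.println("in-order sequence accepted");

        // Race: the dispatch notification overtakes the add to pending.
        SlaveSubscription raced = new SlaveSubscription();
        try {
            raced.onDispatchNotification("msg-2");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```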

Note that if I do either of the following, the exception does not occur:
1. Delete both the master's and the slave's data dirs (just for testing;
this cannot happen in a production environment).
2. Before starting the master, copy the master's data dir to the slave,
without killing the slave while it keeps trying to reconnect to the master.

Another note: on the client side, only the master's URI is in the failover
list, so the slave is used only for replication.

Since my setup is a little complicated, it might be hard to write a test for
it, but I will see what I can do. I hope this explanation is clear; any
suggestions are appreciated.


Here is the patch:
--- src/main/java/org/apache/activemq/broker/ft/
(revision 672308)
+++ src/main/java/org/apache/activemq/broker/ft/
(working copy)
@@ -70,6 +70,7 @@
     private SessionInfo sessionInfo;
     private ProducerInfo producerInfo;
     private final AtomicBoolean masterActive = new AtomicBoolean();
+    private BrokerInfo brokerInfo;
     public MasterConnector() {
@@ -99,6 +100,7 @@
         if (!started.compareAndSet(false, true)) {
         if (remoteURI == null) {
             throw new IllegalArgumentException("You must specify a
@@ -120,6 +122,7 @@
             public void onCommand(Object o) {
                 Command command = (Command)o;
+                LOG.debug("## remoteBroker command:"+command);
                 if (started.get()) {
@@ -130,7 +133,17 @@
+            public void transportResumed() {
+            	try{
+            		remoteBroker.oneway(brokerInfo);
+            	}catch(IOException e){
+            		LOG.error("MasterConnector failed to send BrokerInfo in transportResumed:", e);
+            	}
+   "MasterConnector sent BrokerInfo when transport
+            }
         try {
@@ -139,7 +152,7 @@
         } catch (Exception e) {
             LOG.error("Failed to start network bridge: " + e, e);
-        }    
+        }  
     protected void startBridge() throws Exception {
@@ -148,10 +161,8 @@
+        connectionInfo.setBrokerMasterConnector(true);
-        ConnectionInfo remoteInfo = new ConnectionInfo();
-        connectionInfo.copy(remoteInfo);
-        remoteInfo.setBrokerMasterConnector(true);
         sessionInfo = new SessionInfo(connectionInfo, 1);
@@ -159,7 +170,6 @@
         producerInfo = new ProducerInfo(sessionInfo, 1);
-        BrokerInfo brokerInfo = null;
         if (connector != null) {
             brokerInfo = connector.getBrokerInfo();
         } else {

Gary Tully wrote:
> ying,
> do you think it would be possible to build a test case that reproduced
> the problem. Possibly based on QueueMasterSlaveTest[1] or based on
> something similar?
> [1]
> 2008/7/16 yinghe0101 <>:
>> hi,
>> With the latest trunk, I still get the following:
>> javax.jms.JMSException: Slave broker out of sync with master: Dispatched
>> message (ID:yhe-3822-1216229856070-0:0:1:1:1) was not in the pending list
>> thus the messageAck will fail because it is not in the dispatch list
>> From some investigation, I found that the MessageDispatchNotification
>> arrives before the message is added to the pending list in
>> PrefetchSubscription.
>> The following order needs to be enforced (slave adds message to
>> pending --> slave gets MessageDispatchNotification --> slave gets
>> MessageAck). Somehow there is a race condition that breaks the sync
>> between the slave and master.
>> I was trying to look into how the pending messages get added on the
>> slave; any explanation or suggestion is appreciated. Thank you.
>> ying