From: "ying (JIRA)"
To: dev@activemq.apache.org
Date: Fri, 13 Mar 2009 06:26:40 -0700 (PDT)
Subject: [jira] Commented: (AMQ-2102) Master/slave out of sync with multiple consumers
Message-ID: <119380939.1236950800946.JavaMail.jira@brutus>
In-Reply-To: <1569221231.1234277580141.JavaMail.jira@brutus>

    [ https://issues.apache.org/activemq/browse/AMQ-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=50515#action_50515 ]

ying commented on AMQ-2102:
---------------------------

Thank you too. I just finished a test run of 2 million messages with this new patch. It works fine, with no mismatch. The consumer stopped consuming because of the systemUsage config. The heap also needs to be bumped up, otherwise the JVM throws java.lang.OutOfMemoryError: GC overhead limit exceeded.

I am currently observing the heap. I have 4 master/slave pairs, and the one pair that the consumer connects to has a higher used heap after 2 million messages, more than 10 MB higher than the other brokers, even after killing all producers and consumers. Is there a possible leak? I will keep watching and run more tests.

> Master/slave out of sync with multiple consumers
> ------------------------------------------------
>
>                 Key: AMQ-2102
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2102
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.2.0
>            Reporter: Dan James
>            Assignee: Gary Tully
>             Fix For: 5.3.0
>
>         Attachments: AMQ-2102-03102009.patch, AMQ2102.12-03.patch, master.xml, MasterSlaveBug.java, MasterSlavePatch.patch, slave.xml, slaveDispatchOnNotification.patch
>
>
> I'm seeing exceptions like this in a simple master/slave setup:
> ERROR Service - Async error occurred: javax.jms.JMSException: Slave broker out of sync with master: Dispatched message (ID:DUL1SJAMES-L2-1231-1233929569359-0:4:1:1:207) was not in the pending list for MasterSlaveBug
> javax.jms.JMSException: Slave broker out of sync with master: Dispatched message (ID:DUL1SJAMES-L2-1231-1233929569359-0:4:1:1:207) was not in the pending list for MasterSlaveBug
> The problem only happens when there are multiple consumers listening to the queue, and it becomes more likely as more consumers listen. I've written a test program that demonstrates the problem.
> I start the master and slave with an empty data directory and let them both start up and settle. Then I start the test program.
> The test program creates a specified number of consumers, and then starts queuing 256 messages. The consumers process each message by sending a reply. The producer counts the replies. Both the consumers and the producer see all the messages, but with multiple consumers it is very likely that the error above will occur and several of the messages will still be queued on the slave.
> While debugging through the ActiveMQ code, I noticed that both the master and the slave dispatch the message to a consumer's pending list independently. In other words, it is possible that the master will add the message to consumer A's pending list and the slave will add the message to consumer B's pending list. Once the message has been processed by consumer A, the master sends a message to the slave which specifies consumer A so that the slave can remove the message. The slave looks on its copy of consumer A's pending list and cannot find the message. As a result, it throws this exception and the message stays stuck on consumer B's pending list on the slave.
> Master and slave configurations along with MasterSlaveBug.java are attached to this issue.
> Start master and slave brokers:
>     activemq xbean:master.xml
>     activemq xbean:slave.xml
> Run with (only one consumer, the bug does not appear):
>     java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 1
> Run with (sixteen consumers, the bug does appear):
>     java -classpath .:activemq-all-5.2.0.jar MasterSlaveBug 16
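
For readers following the thread, the dispatch race described in the issue can be sketched as a small toy model. This is not ActiveMQ code and not the attached MasterSlaveBug.java. The class and method names (DispatchSketch, Broker, replicateAck) are made up for illustration, and the model assumes only what the report states: each broker keeps one pending list per consumer and picks a consumer on its own.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the race described in AMQ-2102. Hypothetical names, not ActiveMQ code. */
class DispatchSketch {

    /** A broker that keeps one pending list per consumer. */
    static class Broker {
        final Map<String, List<String>> pending = new HashMap<>();

        /** Adds the message to the chosen consumer's pending list. */
        void dispatch(String consumer, String messageId) {
            pending.computeIfAbsent(consumer, k -> new ArrayList<>()).add(messageId);
        }

        /** Removes an acknowledged message; false means it was not pending for that consumer. */
        boolean acknowledge(String consumer, String messageId) {
            List<String> list = pending.get(consumer);
            return list != null && list.remove(messageId);
        }
    }

    /**
     * Master and slave each dispatch the same message on their own, then the
     * master's acknowledgement (which names the master's consumer) is applied
     * to the slave. Returns true if the slave found the message.
     */
    static boolean replicateAck(String masterChoice, String slaveChoice) {
        Broker master = new Broker();
        Broker slave = new Broker();
        String id = "msg-1";

        master.dispatch(masterChoice, id);     // master picks a consumer
        slave.dispatch(slaveChoice, id);       // slave picks independently

        master.acknowledge(masterChoice, id);  // consumer processed the message
        return slave.acknowledge(masterChoice, id);
    }

    public static void main(String[] args) {
        System.out.println("same consumer:      in sync = " + replicateAck("A", "A"));
        System.out.println("different consumer: in sync = " + replicateAck("A", "B"));
    }
}
```

When both brokers pick consumer A the acknowledgement applies cleanly. When the slave picked consumer B, the lookup on A's list fails and the message stays on B's list, which corresponds to the "was not in the pending list" error above. With a single consumer both brokers can only pick the same list, which is consistent with the report that the bug does not appear in that case.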