From: Pawel Tucholski
To: activemq-users@geronimo.apache.org
Date: Tue, 22 Aug 2006 13:19:55 -0700 (PDT)
Subject: AMQ3.2.2, network of brokers and multicast discovery - bug
Message-ID: <5932916.post@talk.nabble.com>

Hi,

Does anybody know of a successful production deployment of ActiveMQ 3.2.2 with
network of brokers and multicast discovery?

A few days ago my colleague asked about bug 275. We have now found another, even more serious bug that causes brokers in a cluster to go down in a simple anomaly test. We suspect it is the root cause of bug 275 in our system.

Configuration:
- 2 or more brokers, ActiveMQ 3.2.2
- multicast discovery

The problem occurs in the following scenario:
1. Start broker1.
2. Start broker2.
3. The brokers discover each other; everything works fine.
4. Broker2 is disconnected from the network; it is still alive.
5. About 10 seconds later, broker1 goes down. Below is the stack trace from BrokerContainerImpl.stop on broker1:

java.lang.Exception
	at org.activemq.broker.impl.BrokerContainerImpl.stop(BrokerContainerImpl.java:269)
	at org.activemq.ActiveMQConnectionFactory.stop(ActiveMQConnectionFactory.java:687)
	at org.activemq.ActiveMQConnectionFactory.onConnectionClose(ActiveMQConnectionFactory.java:962)
	at org.activemq.ActiveMQConnection.close(ActiveMQConnection.java:775)
	at org.activemq.transport.NetworkChannel.stop(NetworkChannel.java:218)
	at org.activemq.transport.DiscoveryNetworkConnector.removeChannel(DiscoveryNetworkConnector.java:124)
	at org.activemq.transport.DiscoveryNetworkConnector.removeService(DiscoveryNetworkConnector.java:76)
	at org.activemq.transport.DiscoveryAgentSupport.fireRemoveService(DiscoveryAgentSupport.java:66)
	at org.activemq.transport.multicast.MulticastDiscoveryAgent.fireServiceStopped(MulticastDiscoveryAgent.java:403)
	at org.activemq.transport.multicast.MulticastDiscoveryAgent.removeService(MulticastDiscoveryAgent.java:421)
	at org.activemq.transport.multicast.MulticastDiscoveryAgent.checkNodesAlive(MulticastDiscoveryAgent.java:433)
	at org.activemq.transport.multicast.MulticastDiscoveryAgent.run(MulticastDiscoveryAgent.java:306)
	at java.lang.Thread.run(Thread.java:568)

6. Any attempt to connect to broker1 fails; it is no longer listening on the configured port 61616.

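For anyone trying to reproduce step 6, a small probe along these lines can confirm whether a broker is still accepting TCP connections. This is only a sketch: the class name PortProbe, the default host, and the 2-second timeout are assumptions made for illustration and are not part of our test setup; only port 61616 comes from the configuration described above.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class PortProbe {

    /**
     * Returns true if a TCP connection to host:port can be opened
     * within timeoutMillis, false on refusal, timeout, or any other I/O error.
     */
    public static boolean isListening(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Connection refused, timed out, or host not resolvable.
            return false;
        }
    }

    public static void main(String[] args) {
        // Hypothetical defaults: probe the local machine on the broker port.
        String host = args.length > 0 ? args[0] : "localhost";
        int port = args.length > 1 ? Integer.parseInt(args[1]) : 61616;

        boolean up = isListening(host, port, 2000);
        System.out.println(host + ":" + port + (up ? " is listening" : " is NOT listening"));
    }
}
```

Running it against broker1 before step 4 and again after step 5 should show the port going from listening to not listening if the broker container has really been stopped. It only checks that the TCP port accepts connections, not that the broker is otherwise healthy.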
Tomorrow we are going to test the same scenario with ActiveMQ 4.0.1.

Is this a known feature? I haven't found a bug report for it. Maybe someone knows the fix in the code? I suspect it is not a difficult one: one of the stop/remove methods is too greedy in destroying components.

It looks like running more than one broker is riskier; at least we know that when the last client disconnects from a broker, the broker stays alive :-)

Regards,
Paweł Tucholski

-- 
View this message in context: http://www.nabble.com/AMQ3.2.2%2C-network-of-brokers-and-multicast-discovery---bug-tf2148514.html#a5932916