Message-ID: <4777704.post@talk.nabble.com>
Date: Thu, 8 Jun 2006 11:08:48 -0700 (PDT)
From: red3
To: activemq-users@geronimo.apache.org
Reply-To: activemq-users@geronimo.apache.org
Subject: Re: [Un]reliable:// network of AMQ brokers with Lingo
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
References: <4742177.post@talk.nabble.com> <4754823.post@talk.nabble.com> <4756297.post@talk.nabble.com> <4756929.post@talk.nabble.com>
 <4757892.post@talk.nabble.com> <4759834.post@talk.nabble.com>

James.Strachan wrote:
>
>> > Could you (or someone) please clarify?
>>
>> If I am not mistaken, the failover is not a feature of network brokers and
>> master/slave topologies. Failover is handled by the JMS client or other
>> failover solutions.
>>
>> For the failover to work, the JMS client must be aware of the brokers in the
>> network (failover:(tcp://master1:61616,tcp://master2:61616)). Please refer
>> to http://www.activemq.org/site/configuring-transports.html for configuring
>> the client connection.
>
> Agreed. Its a complex area this isn't it :)
>
> So failover of connections is a client side feature- if the broker
> goes down the client can failover to another broker and resend any
> in-progress messages and acknowledgements.
>
> The problem with networks is that they are simple store/forward by
> nature - a message is owned by one broker or the other. If a broker
> goes down, messages stay on that failed brokers disk until it comes
> back up. For some folks this is fine.
>
> ...
>
> So if you are ever in doubt, just use master/slave :). Networks are
> really only for store/forward only
> --
>
> James
> -------
> http://radio.weblogs.com/0112098/
>

Thanks for your input James. I finally feel like we're getting somewhere.

I think we only need the forwarding feature of the broker networks. To
explain why, I think I need to clarify a couple of things:

- We are not persisting messages in the brokers.
- We are using ActiveMQ as part of the Message Driven POJO strategy with
  Spring and Lingo. [Ref this article by Craig Walls:
  http://www.jroller.com/page/habuma/20050722]
- We have decided that the loss of a message here or there in a situation
  where a broker fails in a catastrophic way is OK. It's rare and is
  usually a sign of more serious issues.
  What we can't live with is a system where, in normal everyday operation,
  message delivery is delayed or times out regularly, or where a response
  to a successful call is not received in a timely fashion (ideally
  sub-second) by the client that made it.

We have two messaging scenarios within our app, which may be causing some
confusion:

1) Publish/subscribe: We have services which publish their result sets to
   other services, which then act on those results to create a new result
   set, which eventually gets published to the client. These messages use
   Topics set up using Lingo. All services and the client are JMS
   listeners. Think of it as a matrix of spreadsheets, which are all
   interconnected, having dependencies on one another. SOA and ESB are
   acronyms which definitely apply here.

2) Transactional messaging: Here we are using Spring/Lingo and ActiveMQ
   together as a remote procedure call transport to replace RMI, if you
   will. We make a transparent call through Lingo and expect a callback
   (or an exception). We are setting these messages up on Queues
   configured in Lingo. This is the scenario in which we are having the
   most problems: it is where we are getting timeouts and other issues.

We need this to all work against multiple brokers in a failover
configuration, because our scenario cannot allow for the case where no
services are available and the system comes to a halt. (We absolutely need
24/7 uptime.) It's OK if a request for an update does not complete due to
a broker failure, because another request will be made a minute or two
later, as long as that second one is serviced by an alternate broker on a
separate physical server.

What we do need is:

a) Transactional processing under Lingo to behave in a similar way to a
   traditional remote procedure call client/server style system.

b) The callback to be guaranteed in normal operation (no "response
   received for unknown request" problems and minimal timeouts. One rare
   timeout when a broker goes down is acceptable.
   However, we are also experiencing lingering problems after recovery of
   a broker service.)

c) When a broker goes down, it can be brought back up easily, preferably
   automatically. (Unless there's a catastrophic failure, like a hardware
   or network failure, of course.)

So here are our current thoughts:

We will evaluate ActiveMQ 4's network of brokers by running our tests
through it to see if the problems we have using ActiveMQ 3's networks are
diminished. (This is one of the questions we were trying to get an answer
to in the original post. The discussion on Master/Slave has kind of
clouded the issue, unfortunately. It looks like a great feature, but I
don't think it's for us.)

If our tests still fail as badly as they do now, I think we will have to
completely reconsider our strategy.

Master/Slave does not help us, because it is too convoluted to bring up
the master once it has failed. (It seems to be designed more as an
insurance policy for catastrophic failure than as a failover/load
balancing solution.) And besides, it's a VERY new feature. At this stage
in our project's lifecycle, frankly, it's a risk for us.

Any advice on what else we should be looking at to solve our issues and
get a reliable SOA in place would be greatly appreciated.

Best regards,
Alan

P.S. I know we're in the thick of it, but I also want to go on record by
saying that ActiveMQ and Lingo have been the core of our architectural
design, and we are delighted by the ability to run our systems at any
level: locally in development on one machine as a simple Java app from the
command line, in a distributed production environment embedded within a
container, or mocked out in our unit tests. Once we get through these
issues we will be very happy to be ActiveMQ advocates.

--
View this message in context: http://www.nabble.com/-Un-reliable%3A---network-of-AMQ-brokers-with-Lingo-t1744760.html#a4777704
Sent from the ActiveMQ - User forum at Nabble.com.
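
For readers following the thread, the "response received for unknown request" symptom in the transactional (request/reply) scenario can be pictured with a small correlation sketch. This is plain Java with hypothetical names (ReplyCorrelator, register, onReply, awaitReply); it is not Lingo's or ActiveMQ's implementation, only an assumption-laden illustration of how a reply that arrives after its request has timed out finds no pending request to match.

```java
import java.util.Map;
import java.util.Optional;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Hypothetical sketch: matches replies to pending requests by correlation ID. */
class ReplyCorrelator {
    // Requests that have been sent and are still waiting for a reply.
    private final Map<String, CompletableFuture<String>> pending = new ConcurrentHashMap<>();

    /** Registers a new pending request and returns its correlation ID. */
    String register() {
        String id = UUID.randomUUID().toString();
        pending.put(id, new CompletableFuture<>());
        return id;
    }

    /**
     * Called when a reply arrives. Returns false when no request is pending
     * under that ID (the "unknown request" case, e.g. a late reply).
     */
    boolean onReply(String correlationId, String body) {
        CompletableFuture<String> future = pending.get(correlationId);
        if (future == null) {
            return false;
        }
        future.complete(body);
        return true;
    }

    /**
     * Waits up to timeoutMillis for the reply. The pending entry is removed
     * whether or not a reply arrived, so later replies become "unknown".
     */
    Optional<String> awaitReply(String correlationId, long timeoutMillis) {
        CompletableFuture<String> future = pending.get(correlationId);
        if (future == null) {
            return Optional.empty();
        }
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException | ExecutionException e) {
            return Optional.empty();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return Optional.empty();
        } finally {
            pending.remove(correlationId);
        }
    }
}
```

In this sketch a caller would register a request, send it with the returned ID, and call awaitReply; a reply delivered after the timeout (for instance, one held on a broker that went down and later recovered) makes onReply return false.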