activemq-users mailing list archives

From red3 <alan.bi...@aeso.ca>
Subject Re: [Un]reliable:// network of AMQ brokers with Lingo
Date Thu, 08 Jun 2006 18:08:48 GMT


James.Strachan wrote:
> 
>> >
>> > Could you (or someone) please clarify?
>> >
>> >
>>
>>
>> If I am not mistaken, the failover is not a feature of network brokers and
>> master/slave topologies.  Failover is handled by the JMS client or other
>> failover solutions.
>>
>> For the failover to work, the JMS client must be aware of the brokers in the
>> network (failover:(tcp://master1:61616,tcp://master2:61616)).  Please refer
>> to http://www.activemq.org/site/configuring-transports.html for configuring
>> the client connection.
> 
> Agreed. It's a complex area, this, isn't it :)
> 
> So failover of connections is a client-side feature: if the broker
> goes down, the client can fail over to another broker and resend any
> in-progress messages and acknowledgements.
> 
> The problem with networks is that they are simple store/forward by
> nature: a message is owned by one broker or the other. If a broker
> goes down, messages stay on that failed broker's disk until it comes
> back up. For some folks this is fine.
> 
> ...
> 
> So if you are ever in doubt, just use master/slave :). Networks are
> really only for store/forward.
> --
> 
> James
> -------
> http://radio.weblogs.com/0112098/
> 
> 

Thanks for your input James. I finally feel like we're getting somewhere.

I think we only need the forwarding feature of the broker networks. To
explain why, I think I need to clarify a couple of things:

- We are not persisting messages in the brokers. 

- We are using ActiveMQ as part of the Message Driven POJO strategy with
Spring and Lingo. 
[Ref this article by Craig Walls:
http://www.jroller.com/page/habuma/20050722]

- We have decided that losing a message here or there when a broker fails
catastrophically is OK. It's rare and is usually a sign of more serious
issues. What we can't live with is a system where, in normal everyday
operation, message delivery is regularly delayed or times out, or where the
response to a successful call does not reach the client that made it in a
timely fashion (ideally sub-second).

We have two messaging scenarios within our app, which may be causing some
confusion:
1) Publish/subscribe: We have services which publish their result sets to
other services, which then act on those results to create a new result set
which eventually gets published to the client. These messages use Topics set
up using Lingo. All services and the client are JMS listeners. Think of it
as a matrix of interconnected spreadsheets with dependencies on one
another. SOA and ESB are acronyms which definitely apply here.
2) Transactional messaging: Here we are using Spring/Lingo and ActiveMQ
together as a remote procedure call transport to replace RMI, if you will.
We make a transparent call through Lingo and expect a callback (or an
exception). We are setting these messages up on Queues configured in Lingo.
In this scenario we are having the most problems. This is where we are
getting timeouts and other issues.
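
To make scenario 2 concrete, here is a minimal sketch of request/reply over a queue using a correlation id and a timeout. It is plain Python with only the standard library, not Lingo's or ActiveMQ's actual implementation: RequestReplyClient, serve, and the in-memory queue are hypothetical stand-ins chosen for illustration. It shows, roughly, how a caller that times out leaves nobody waiting for the reply, so a late reply ends up as a reply to an unknown request.

```python
import queue
import threading
import time
import uuid


class RequestReplyClient:
    """Toy RPC-over-queues client. All names here are hypothetical."""

    def __init__(self, request_queue):
        self.request_queue = request_queue
        self.pending = {}          # correlation id -> slot for the single reply
        self.unknown_replies = []  # replies whose caller is no longer waiting
        self.lock = threading.Lock()

    def call(self, payload, timeout):
        """Send a request and block until its reply arrives or the timeout expires."""
        correlation_id = str(uuid.uuid4())
        reply_slot = queue.Queue(maxsize=1)
        with self.lock:
            self.pending[correlation_id] = reply_slot
        self.request_queue.put((correlation_id, payload))
        try:
            return reply_slot.get(timeout=timeout)
        except queue.Empty:
            raise TimeoutError(f"no reply within {timeout}s") from None
        finally:
            # Whether we got a reply or gave up, stop tracking this request.
            with self.lock:
                self.pending.pop(correlation_id, None)

    def on_reply(self, correlation_id, result):
        """Route a reply to its waiting caller, or record it as unknown."""
        with self.lock:
            reply_slot = self.pending.get(correlation_id)
        if reply_slot is None:
            self.unknown_replies.append((correlation_id, result))
            return
        reply_slot.put(result)


def serve(request_queue, client, delay=0.0):
    """Hypothetical service loop: upper-cases each payload and replies."""
    while True:
        item = request_queue.get()
        if item is None:  # sentinel: stop serving
            break
        correlation_id, payload = item
        time.sleep(delay)
        client.on_reply(correlation_id, payload.upper())


if __name__ == "__main__":
    requests = queue.Queue()
    client = RequestReplyClient(requests)
    threading.Thread(target=serve, args=(requests, client), daemon=True).start()
    print(client.call("ping", timeout=2.0))
    requests.put(None)
```

In this toy model, raising the timeout only hides the problem; the interesting question is why replies are slow or misrouted in normal operation.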

We need all of this to work against multiple brokers in a failover
configuration, because we cannot allow the case where no services are
available and the system comes to a halt. (We absolutely need 24/7 uptime.)
It's OK if a request for an update does not complete because of a broker
failure, since another request will be made a minute or two later, as long
as that second one is serviced by an alternate broker on a separate
physical server.
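
As a rough illustration of that requirement, the sketch below builds a failover URI in the form quoted earlier in this thread and tries each broker in turn until one accepts the request. It is standard-library Python, not ActiveMQ client code: failover_uri, send_with_failover, and the send callable are hypothetical names, and a real client would leave the reconnect logic to the failover transport itself.

```python
def failover_uri(broker_urls):
    """Build a URI of the form failover:(url1,url2,...) from a list of broker URLs."""
    if not broker_urls:
        raise ValueError("at least one broker URL is required")
    return "failover:(" + ",".join(broker_urls) + ")"


def send_with_failover(broker_urls, send, payload):
    """Try each broker in order and return the first successful result.

    `send` is a hypothetical callable(url, payload) that raises
    ConnectionError when its broker is unreachable.
    """
    failed = []
    for url in broker_urls:
        try:
            return send(url, payload)
        except ConnectionError:
            failed.append(url)  # remember the failure and try the next broker
    raise ConnectionError(f"all brokers failed: {failed}")


if __name__ == "__main__":
    brokers = ["tcp://master1:61616", "tcp://master2:61616"]
    print(failover_uri(brokers))
```

The point of the sketch is only that the client needs the full list of brokers up front; a request lost on the first broker is acceptable here as long as a later one can reach the second.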

What we do need is:
a) Transactional processing under Lingo to behave in a similar way to a
traditional client/server remote procedure call system.
b) The callback to be guaranteed in normal operation: no "response
received for unknown request" problems and minimal timeouts. One rare
timeout when a broker goes down is acceptable. However, we are also
experiencing lingering problems after a broker service recovers.
c) A broker that goes down to be easy to bring back up, preferably
automatically (unless there's a catastrophic failure, such as a hardware or
network failure, of course).

So here are our current thoughts:
We will evaluate ActiveMQ 4's network of brokers by running our tests
through it to see if the problems we have using ActiveMQ 3's networks are
diminished. (This is one of the questions we were trying to get an answer to
in the original post. The discussion on Master/Slave has kind of clouded the
issue unfortunately. It looks like a great feature, but I don't think it's
for us.)

If our tests still fail as badly as they are now, I think we will have to
completely reconsider our strategy.

Master/Slave does not help us, because it is too convoluted to bring up the
master once it has failed. (It seems designed more as an insurance policy
for catastrophic failure than as a failover/load-balancing solution.)
And besides, it's a VERY new feature. At this stage in our project's
lifecycle, frankly, it's a risk for us.

Any advice on what else we should be looking at to solve our issues and get
a reliable SOA in place would be greatly appreciated.

Best regards,

Alan

P.S. I know we're in the thick of it, but I also want to go on record as
saying that ActiveMQ and Lingo have been the core of our architectural
design, and we are delighted by the ability to run our systems at any level:
locally in development on one machine as a simple Java app from the command
line, in a distributed production environment embedded within a container,
or mocked out in our unit tests. Once we get through these issues we will
be very happy to be ActiveMQ advocates.

--
View this message in context: http://www.nabble.com/-Un-reliable%3A---network-of-AMQ-brokers-with-Lingo-t1744760.html#a4777704
Sent from the ActiveMQ - User forum at Nabble.com.

