activemq-users mailing list archives

From red3 <alan.bi...@aeso.ca>
Subject [Un]reliable:// network of AMQ brokers with Lingo
Date Tue, 06 Jun 2006 22:20:56 GMT

We are using Lingo 1.1 and ActiveMQ 3.1 to achieve RPC through Spring and JMS
(Message-Driven POJOs).

Things were fine until we set up two ActiveMQ brokers in a
reliable/failover configuration.
We use the reliable: protocol with two fixed port addresses.
E.g. 
    <bean id="jmsBrokerUrl" abstract="true">
        <property name="brokerURL"
                  value="reliable:(tcp://localhost:61616%3FsoTimeout=5000,tcp://localhost:61617%3FsoTimeout=5000)?maximumRetries=0&amp;establishConnectionTimeout=21000&amp;keepAliveTimeout=300000"/>
    </bean>

(in our production environment the brokers are at separate IP
addresses.)
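To make the shape of that URL easier to check, here is a minimal Java sketch that assembles the same string from its parts. The class and method names are ours, for illustration only, and are not part of ActiveMQ or Lingo. The escaping of each broker's '?' as %3F simply mirrors our configuration; our understanding (an assumption) is that it keeps the per-broker options separate from the options of the outer reliable: URI.

```java
import java.util.Arrays;
import java.util.List;

// Illustrative helper only: rebuilds the brokerURL value used in our Spring config.
final class BrokerUrlSketch {

    // One tcp:// entry with its own soTimeout option; the '?' is written as %3F,
    // exactly as in our configuration.
    static String brokerUri(String host, int port, int soTimeoutMillis) {
        return "tcp://" + host + ":" + port + "%3FsoTimeout=" + soTimeoutMillis;
    }

    // Wraps the broker list in reliable:(...) and appends the outer options.
    // In Java the separator is a plain '&'; only the XML file needs &amp;.
    static String reliableUrl(List<String> brokerUris, String outerOptions) {
        return "reliable:(" + String.join(",", brokerUris) + ")?" + outerOptions;
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList(
                brokerUri("localhost", 61616, 5000),
                brokerUri("localhost", 61617, 5000));
        System.out.println(reliableUrl(brokers,
                "maximumRetries=0&establishConnectionTimeout=21000&keepAliveTimeout=300000"));
    }
}
```

Building the string this way makes it harder to introduce a stray line break or space inside the value by accident.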

In this scenario we have experienced random, but frequent, timeouts at the
client. It seems that the outgoing message gets to the server fine
(resulting in a database operation), but the response to the client is
either lost or delayed.

We have also experienced this error at the client: 
	org.logicblaze.lingo.jms.impl.MultiplexingRequestor: Response received for
unknown request
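To show how we think a timeout and this warning could be related, here is a simplified model of a requestor that matches replies to pending requests by correlation ID. It is not Lingo's actual code; the class and method names are hypothetical, and the mechanism itself is our assumption about how such a requestor might behave. If the caller gives up waiting and the pending entry is removed, a reply that arrives afterwards no longer matches anything.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Simplified model only; NOT the real MultiplexingRequestor.
final class CorrelationSketch {

    // Pending requests, keyed by the correlation ID sent with each request.
    private final Map<String, CompletableFuture<Object>> pending = new ConcurrentHashMap<>();

    // Called before sending: remember that we expect a reply for this ID.
    CompletableFuture<Object> register(String correlationId) {
        CompletableFuture<Object> reply = new CompletableFuture<>();
        pending.put(correlationId, reply);
        return reply;
    }

    // Waits for the reply; returns false on timeout. Either way the pending
    // entry is removed, so a later reply for this ID will not be recognised.
    boolean awaitReply(String correlationId, CompletableFuture<Object> reply, long timeoutMillis) {
        try {
            reply.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pending.remove(correlationId);
        }
    }

    // Called when a reply message arrives. Returns false when no request is
    // waiting for this ID, i.e. the "unknown request" case.
    boolean onReply(String correlationId, Object body) {
        CompletableFuture<Object> reply = pending.remove(correlationId);
        if (reply == null) {
            return false;
        }
        reply.complete(body);
        return true;
    }
}
```

In this model, a reply that is merely delayed beyond the client timeout produces both symptoms: first the timeout at the caller, then the unknown-request warning when the reply finally arrives.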

And in some scenarios we have seen the same request being processed more
than once on the server.

Suffice it to say, for now we are running just one broker in production!

For the record, our client is Swing, and we are forced in our organization
to use the Oracle OC4J container to host the JMS in production. However,
thanks to Spring/Lingo we are able to run independent of the container in
development environments and unit tests.

I have written unit tests to simulate sending many client requests on five
different threads. Some requests are on Topics and others on temporary
Queues.
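The tests are roughly of the following shape. This is a reduced sketch rather than our real test code: EchoService is a hypothetical stand-in for one of our Lingo-proxied service interfaces, and the stub in main replaces the remote call so the example runs without a broker.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Reduced sketch of a multi-threaded request test; names are illustrative.
final class ConcurrentRequestSketch {

    // Hypothetical service interface; in the real tests this would be a
    // Spring/Lingo proxy for a remote service.
    interface EchoService {
        String echo(String payload);
    }

    // Sends requestsPerThread requests from each of `threads` threads and
    // returns how many failed (exception thrown or wrong reply).
    static int countFailures(EchoService service, int threads, int requestsPerThread) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger failures = new AtomicInteger();
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            final int threadNo = t;
            tasks.add(() -> {
                for (int i = 0; i < requestsPerThread; i++) {
                    String payload = threadNo + ":" + i;
                    try {
                        if (!payload.equals(service.echo(payload))) {
                            failures.incrementAndGet();
                        }
                    } catch (RuntimeException e) {
                        // Spring's RemoteAccessException is unchecked, so a
                        // timeout in the real tests would be counted here.
                        failures.incrementAndGet();
                    }
                }
                return null;
            });
        }
        try {
            pool.invokeAll(tasks); // blocks until every task has finished
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return failures.get();
    }

    public static void main(String[] args) {
        // Local stub in place of the remote service.
        System.out.println("failures: " + countFailures(payload -> payload, 5, 20));
    }
}
```

Each payload carries the thread and request number, so a reply delivered to the wrong caller is counted as a failure as well as a timeout.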

We have many theories but no conclusions as yet. We see different
problems in different scenarios. We have lost connections and
experienced timeouts when taking down a broker and then starting it back up.
We have experienced different problems when the container has to be
restarted.
We have run our tests successfully several times with just one broker in
place.

We have several theories we are evaluating - your input as to which you feel
is most likely would be much appreciated:
1) That our configuration is somehow flawed.
2) That Lingo has not been thoroughly tested in the scenario described above
and either has bugs to be resolved or is not designed for this scenario.
3) That ActiveMQ is somehow dropping connections, or not correlating requests
correctly between the brokers.
4) That there is a problem with the container intercepting/blocking the
requests. We deem this unlikely, since we are connecting directly to the
JMS brokers through the port addresses defined in Lingo.

So without going into too much detail at this point, can you confirm
that:
1) Lingo and a network of ActiveMQ brokers are a feasible combination for
reliable failover in an enterprise environment.
2) That temporary queues and topics are not lost or dropped in the
dual-broker scenario.
3) That the container is not intercepting or blocking requests in the
dual-broker scenario. (JTA conflicts?)
	(Note that OC4J is not JMS 1.1 compliant, but we are overriding its JMS
with embedded ActiveMQ in production.)

A dump of the exceptions is included below:

Concurrent timeout exception:

Cannot access JMS invoker remote service at [null]; nested exception is
javax.jms.JMSException: EDU.oswego.cs.dl.util.concurrent.TimeoutException
org.springframework.remoting.RemoteAccessException: Cannot access JMS
invoker remote service at [null]; nested exception is
javax.jms.JMSException: EDU.oswego.cs.dl.util.concurrent.TimeoutException
javax.jms.JMSException: EDU.oswego.cs.dl.util.concurrent.TimeoutException
	at
org.logicblaze.lingo.jms.impl.MultiplexingRequestor.createJMSException(MultiplexingRequestor.java:156)
...etc

Response received for unknown request:

 WARN: org.logicblaze.lingo.jms.impl.MultiplexingRequestor: Response
received for unknown request: ACTIVEMQ_OBJECT_MESSAGE: id = 0
ActiveMQMessage{ , jmsMessageID = ID:AD050003-3535-1149547683183-64:35,
bodyAsBytes = org.activemq.io.util.ByteArray@1eef2c, readOnlyMessage = true,
jmsClientID = 'ID:AD050003-1862-1149537087845-6:0' , jmsCorrelationID = '52'
, jmsDestination =
TemporaryQueue-{TD{ID:AD050003-3564-1149547697636-89:0}TD}ID:AD050003-3564-1149547697636-97:0,
jmsReplyTo = null, jmsDeliveryMode = 1,
jmsRedelivered = false, jmsType = 'null' , jmsExpiration = 0, jmsPriority =
4, jmsTimestamp = 1149548125466, properties = null, readOnlyProperties =
true, entryBrokerName = 'broker1' , entryClusterName = 'default' ,
consumerNos = [0], transactionId = 'null'
, xaTransacted = false, consumerIdentifer =
'ID:AD050003-3564-1149547697636-89:0.2.1' , messageConsumed = false,
transientConsumed = false, sequenceNumber = 97, deliveryCount = 1,
dispatchedFromDLQ = false, messageAcknowledge =
org.activemq.ActiveMQSession@1bf7b23, jmsMessageIdentity = null, producerKey
= ID:AD050003-1862-1149537087845-627: } ActiveMQObjectMessage{ object =
org.springframework.remoting.support.RemoteInvocationResult@d99277 }

This one occurs when the request is processed twice, which usually happens if a
broker is taken down and then restarted:

org.springframework.dao.DataIntegrityViolationException: Hibernate
operation: Could not execute JDBC batch update; SQL []; ORA-00001:
unique constraint (ABIGGS.OV_PARID_EFFDT_EXP_OVERRIDE_AK) violated ; nested
exception is java.sql.BatchUpdateException: ORA-00001: unique constraint
(ABIGGS.OV_PARID_EFFDT_EXP_OVERRIDE_AK) violated

--
View this message in context: http://www.nabble.com/-Un-reliable%3A---network-of-AMQ-brokers-with-Lingo-t1744760.html#a4742177
Sent from the ActiveMQ - User forum at Nabble.com.

