activemq-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From mdasari <mdas...@gmail.com>
Subject FailoverTransport stops working after a while
Date Thu, 02 Apr 2009 16:47:39 GMT

Hi,

We are using AMQ 5.1.0 on some of our servers. We noticed that (on few
servers) after a while the AMQ failover transport stops working thus making
messages to be not delivered. (from a producer AMQ server box to a central
consumer AMQ server box through camel)

--------------------------------------------------------------
The following is the data from our log files:
--------------------------------------------------------------
INFO   | jvm 1    | 2009/03/16 21:25:42 | DEBUG FailoverTransport             
- Connection established
INFO   | jvm 1    | 2009/03/16 21:25:42 | INFO  FailoverTransport             
- Successfully connected to tcp://10.87.129.196:61616
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG JmsConfiguration$2            
- Executing callback on JMS Session: ActiveMQSession
{id=ID:LOCALMQ-3675-1236961500048-2:218:1,started=false}
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG JmsProducer                   
- Endpoint[centralMQ:topic:Topic1] sending JMS message: ActiveMQTextMessage
{...}
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG JmsConfiguration$2            
- Sending created message: ActiveMQTextMessage {...}
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG ActiveMQSession               
- ID:LOCALMQ-3675-1236961500048-2:218:1 sending message: ActiveMQTextMessage
{...}
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG FailoverTransport             
- Stopped.
INFO   | jvm 1    | 2009/03/16 21:25:43 | DEBUG TcpTransport                  
- Stopping transport tcp:///10.87.129.196:61616
INFO   | jvm 1    | 2009/03/16 21:26:00 | DEBUG AMQPersistenceAdapter         
- Checkpoint started.
INFO   | jvm 1    | 2009/03/16 21:26:00 | DEBUG AMQPersistenceAdapter         
- Checkpoint done.
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG ActiveMQMessageConsumer       
- ID:LOCALMQ-3675-1236961500048-2:0:1:1 received message: MessageDispatch
{...}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG EndpointMessageListener       
- Endpoint[localMQ:topic:Topic1?clientId=...&subscriptionDurable=true]
receiving JMS message: ActiveMQTextMessage {...}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Waking up reconnect task
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Started.
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Waking up reconnect task
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Attempting connect to: tcp://10.87.129.196:61616
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG WireFormatNegotiator          
- Sending: WireFormatInfo { version=3, properties={CacheSize=1024,
CacheEnabled=true, SizePrefixDisabled=false,
MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true,
MaxInactivityDuration=30000, TightEncodingEnabled=true,
StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG WireFormatNegotiator          
- Received WireFormat: WireFormatInfo { version=3,
properties={CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false,
MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true,
MaxInactivityDuration=30000, TightEncodingEnabled=true,
StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG WireFormatNegotiator          
- tcp:///10.87.129.196:61616 before negotiation: OpenWireFormat{version=3,
cacheEnabled=false, stackTraceEnabled=false, tightEncodingEnabled=false,
sizePrefixDisabled=false}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG WireFormatNegotiator          
- tcp:///10.87.129.196:61616 after negotiation: OpenWireFormat{version=3,
cacheEnabled=true, stackTraceEnabled=true, tightEncodingEnabled=true,
sizePrefixDisabled=false}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Connection established
INFO   | jvm 1    | 2009/03/16 21:26:13 | INFO  FailoverTransport             
- Successfully connected to tcp://10.87.129.196:61616
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG JmsConfiguration$2            
- Executing callback on JMS Session: ActiveMQSession
{id=ID:LOCALMQ-3675-1236961500048-2:219:1,started=false}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG JmsProducer                   
- Endpoint[centralMQ:topic:Topic1] sending JMS message: ActiveMQTextMessage
{...}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG JmsConfiguration$2            
- Sending created message: ActiveMQTextMessage {...}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG ActiveMQSession               
- ID:LOCALMQ-3675-1236961500048-2:219:1 sending message: ActiveMQTextMessage
{...}
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG FailoverTransport             
- Stopped.
INFO   | jvm 1    | 2009/03/16 21:26:13 | DEBUG TcpTransport                  
- Stopping transport tcp:///10.87.129.196:61616
INFO   | jvm 1    | 2009/03/16 21:26:14 | DEBUG ActiveMQMessageConsumer       
- ID:LOCALMQ-3675-1236961500048-2:0:1:1 received message: MessageDispatch
{...}
INFO   | jvm 1    | 2009/03/16 21:26:14 | DEBUG EndpointMessageListener       
- Endpoint[localmq:topic:Topic1?clientId=...&subscriptionDurable=true]
receiving JMS message: ActiveMQTextMessage {...}
INFO   | jvm 1    | 2009/03/16 21:26:15 | DEBUG FailoverTransport             
- Waiting 10 ms before attempting connection. 
INFO   | jvm 1    | 2009/03/16 21:26:15 | Exception in thread "ActiveMQ
Failover Worker: 1889455" java.lang.NullPointerException
INFO   | jvm 1    | 2009/03/16 21:26:15 | 	at
org.apache.activemq.transport.failover.FailoverTransport$2.iterate(FailoverTransport.java:124)
INFO   | jvm 1    | 2009/03/16 21:26:15 | 	at
org.apache.activemq.thread.DedicatedTaskRunner.runTask(DedicatedTaskRunner.java:98)
INFO   | jvm 1    | 2009/03/16 21:26:15 | 	at
org.apache.activemq.thread.DedicatedTaskRunner$1.run(DedicatedTaskRunner.java:36)
INFO   | jvm 1    | 2009/03/16 21:26:15 | DEBUG FailoverTransport             
- Waking up reconnect task
INFO   | jvm 1    | 2009/03/16 21:26:15 | DEBUG FailoverTransport             
- Started.
INFO   | jvm 1    | 2009/03/16 21:26:15 | DEBUG FailoverTransport             
- Waking up reconnect task
INFO   | jvm 1    | 2009/03/16 21:27:00 | DEBUG AMQPersistenceAdapter         
- Checkpoint started.
INFO   | jvm 1    | 2009/03/16 21:27:00 | DEBUG AMQPersistenceAdapter         
- Checkpoint done.
INFO   | jvm 1    | 2009/03/16 21:28:00 | DEBUG AMQPersistenceAdapter         
- Checkpoint started.
---------------------------------------------


Basically, it was able to deliver a message (and few more prior to that time
period), but for another message that is very close (in time) to the
previous message it is running into a NullPointerException, after that it
stops functioning totally.

I took a brief look at the FailoverTransport.java code, I'm not an expert on
the AMQ code, but I suspect that FailoverTransport.java reconnectTask member
variable is attempted to be used by the task-runner thread before it was
completely initialized  (basically race conditions without proper
synchronization)

I can provide more details on our network topology if it is required. I
searched around but didn't find any related issues or bugs. Does anyone know
if this is a known issue, and which version this is going to be addressed.
If not I'll open a JIRA.

Appreciate your help.

cheers
- mdasari


-- 
View this message in context: http://www.nabble.com/FailoverTransport-stops-working-after-a-while-tp22851122p22851122.html
Sent from the ActiveMQ - User mailing list archive at Nabble.com.


Mime
View raw message