activemq-users mailing list archives

From Joe Fernandez <joe.fernan...@ttmsolutions.com>
Subject Re: Failover Question
Date Fri, 28 May 2010 00:46:42 GMT

The InactivityMonitor should have detected the failed connection. What value
did you assign to maxInactivityDuration? By default it is set to 30000ms. 
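
For reference, here is a minimal sketch of how the timeout could be set from the client side. It assumes the option is passed as wireFormat.maxInactivityDuration on each nested tcp:// URI inside the failover URL; the class and method names (FailoverUrl, build) are made up for illustration, and the host names are the ones from the connection URL quoted below.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class FailoverUrl {

    // Hypothetical helper: appends the inactivity timeout to every broker URI
    // and wraps the result in a failover URL with randomize=false.
    static String build(List<String> brokers, long maxInactivityMillis) {
        List<String> parts = new ArrayList<String>();
        for (String broker : brokers) {
            // Assumption: each nested URI carries its own transport options.
            parts.add(broker + "?wireFormat.maxInactivityDuration=" + maxInactivityMillis);
        }
        return "failover://(" + String.join(",", parts) + ")?randomize=false";
    }

    public static void main(String[] args) {
        List<String> brokers = Arrays.asList("tcp://masterhost:61616", "tcp://slavehost:61616");
        // 30000 ms is the default mentioned above; a smaller value should detect a dead peer sooner.
        System.out.println(build(brokers, 30000));
    }
}
```

The resulting string would be handed to the connection factory in place of the current URL. Whether a shorter timeout is appropriate depends on how long the network can legitimately stay silent.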

Joe
ActiveMQ Ref Guide - http://bit.ly/AMQRefGuide
 

daniel.stucky-2 wrote:
> 
> Hi ActiveMQ Team,
> 
>  
> 
> in the eclipse open source project SMILA we use ActiveMQ (version 5.3.2)
> to implement a producer/consumer pattern with JMS. The basic setup is as
> follows:
> 
> - the software runs in a cluster of machines (usually between 4 and 16)
> - we use the Pure Master/Slave configuration for Queue failover
> - a producer creates a large data chunk in a data repository and creates
>   a JMS message containing the Id of the created chunk of data
> - a consumer receives a JMS message and processes the data chunk with the
>   given Id. Some consumers also function as producers, as they create a
>   new data chunk and another JMS message
> - all machines in the cluster work as producers and consumers
> 
> In general this works fine, but we have problems on a machine failure.
> For simplicity assume that one machine (except for the Master or Slave)
> has a hardware failure and crashes. Also assume that this machine was
> currently processing a received JMS message. The Session from which the
> message was received was not committed yet, as the session is only
> committed if the processing of the data was successful. Otherwise it is
> rolled back.
> 
> Now as the machine crashes the session is neither committed nor rolled
> back. How can we assure that any messages that were delivered but not
> committed or rolled back are redelivered or put into the DLQ?
> 
>  
> 
>  
> 
> Our first assumption was that if the connection of a session drops, all
> uncommitted messages of that session are automatically redelivered.
> Unfortunately this was not the case. Does this only work in certain
> scenarios with specific settings?
> 
> The second idea was to set a TTL for each message, so that when the TTL is
> reached the message goes into the DLQ and can be consumed there (e.g. by
> another consumer that creates a copy of the message in the actual
> queue). This would automatically cover the machine crash described
> above, as sending no commit or rollback eventually leads to reaching the
> set TTL of the message. However during tests we had strange behavior for
> messages that were processed by the crashing machine:
> 
> - some messages were handled correctly (they were moved to the DLQ)
> - other messages simply disappeared; in the JMX console these messages
>   were shown as dequeued, which should only be the case if the session
>   was committed. There were no exceptions in the log files.
> 
> Is there anything that has to be addressed, either in the configuration
> or our code for this to work correctly?
> 
>  
> 
> Besides this, TTL has a drawback, as it is set when the message is
> created. The processing of our data takes quite a while and we also have
> to assure the processing in a certain time frame. Producers are
> generally faster than Consumers, so the number of enqueued messages
> increases. So by setting TTL we cannot assure that a message is consumed
> in a certain time frame but only that it is available for the set time.
> Are there any mechanisms that would allow us to set a "processing
> timeout" or "commit timeout" by which a message must be committed or
> else be sent to the DLQ?
> 
> BTW, what about the parameter maxInactivityDuration? Does it have any
> effect on open sessions/transactions? We also set this but it did not
> seem to have any effect.
> 
>  
> 
>  
> 
> Some information on our environment:
> 
> - ActiveMQ 5.3.2
> - JDK 1.6.0_20
> - Equinox OSGi container (eclipse 3.5)
> - Linux openSUSE 11.1
> - Connection-URL:
>   failover://(tcp://masterhost:61616,tcp://slavehost:61616)?randomize=false
> 
> It would be great if you could share your thoughts on this issue.
> 
>  
> 
> Bye,
> 
> Daniel


