activemq-users mailing list archives

From Gary Tully <gary.tu...@gmail.com>
Subject Re: Failover Question
Date Fri, 28 May 2010 08:55:42 GMT
Your first assumption about automatic redelivery is correct: any unacked
message will be redelivered. It may not be redelivered to the same
consumer, though; from the broker's perspective, any consumer will do. The
first precondition, however, is the recognition of the death of the consumer's
connection. If the broker is blocked in a socket write, this may take a little
time to time out.
To this end, Joe's comment about the inactivity timeout is relevant. One way
to check whether this is the problem is to validate the state of the broker's
sockets during your testing: before the client machine dies there will
be a visible connection (netstat -a); after it dies, the connection should
disappear. If it does not, then configuring the timeout will help.
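As a sketch of that check (the addresses and port below are illustrative
sample data, not live output; 61616 is the default OpenWire port, so adjust
the grep pattern to your transportConnector):

```shell
# On the broker host, a live check would be:  netstat -an | grep 61616
# Here we filter a captured sample line instead, to show what a healthy
# client connection looks like; after the client machine dies, the
# ESTABLISHED entry for that client should eventually disappear.
sample='tcp        0      0 10.0.0.1:61616      10.0.0.9:40312      ESTABLISHED'
echo "$sample" | grep -c 'ESTABLISHED'
# prints: 1
```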

This inactivity duration value is negotiated between the client and the broker,
so it needs to be configured on both ends to take effect.
Have a peek at this test to see the config in action:
http://svn.apache.org/viewvc/activemq/trunk/activemq-core/src/test/java/org/apache/activemq/transport/tcp/InactivityMonitorTest.java?view=markup
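For reference, a minimal sketch of where the wireFormat.maxInactivityDuration
option goes on each side (the hostnames, port, and the 30000 ms value are
illustrative; pick a value that suits your network):

```xml
<!-- Broker side (conf/activemq.xml): set the inactivity timeout on the
     OpenWire transport connector. -->
<transportConnector name="openwire"
    uri="tcp://0.0.0.0:61616?wireFormat.maxInactivityDuration=30000"/>

<!-- Client side: the option goes on each URL *inside* failover:(),
     while failover's own options stay after the closing parenthesis:

     failover:(tcp://masterhost:61616?wireFormat.maxInactivityDuration=30000,
               tcp://slavehost:61616?wireFormat.maxInactivityDuration=30000)?randomize=false
-->
```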

On 27 May 2010 17:15, <daniel.stucky@attensity.com> wrote:

> Hi ActiveMQ Team,
>
> In the Eclipse open source project SMILA we use ActiveMQ (version 5.3.2)
> to implement a producer/consumer pattern with JMS. The basic setup is as
> follows:
>
> - the software runs in a cluster of machines (usually between 4 and 16)
>
> - we use the Pure Master/Slave configuration for queue failover
>
> - a producer creates a large data chunk in a data repository and creates
>   a JMS message containing the Id of the created chunk of data
>
> - a consumer receives a JMS message and processes the data chunk with
>   the given Id; some consumers also function as producers, as they
>   create a new data chunk and another JMS message
>
> - all machines in the cluster work as producers and consumers
>
> In general this works fine, but we have problems on a machine failure.
> For simplicity, assume that one machine (other than the Master or Slave)
> has a hardware failure and crashes, and that this machine was
> currently processing a received JMS message. The Session from which the
> message was received had not been committed yet, as the session is only
> committed if the processing of the data was successful; otherwise it is
> rolled back.
>
> Now, as the machine crashes, the session is neither committed nor rolled
> back. How can we ensure that any messages that were delivered but neither
> committed nor rolled back are redelivered or put into the DLQ?
>
> Our first assumption was that if the connection of a session drops, all
> uncommitted messages of that session are automatically redelivered.
> Unfortunately this was not the case. Does this only work in certain
> scenarios with specific settings?
>
> The second idea was to set a TTL for each message, so that when the TTL is
> reached the message goes into the DLQ and can be consumed there (e.g. by
> another consumer that creates a copy of the message in the actual
> queue). This would automatically cover the machine crash described
> above, as sending no commit or rollback eventually leads to reaching the
> message's TTL. However, during tests we observed strange behavior for
> messages that were being processed by the crashing machine:
>
> - some messages were handled correctly (they were moved to the DLQ)
>
> - other messages simply disappeared; in the JMX console these messages
>   were shown as dequeued, which should only be the case if the session
>   was committed. There were no exceptions in the log files.
>
> Is there anything that has to be addressed, either in the configuration
> or in our code, for this to work correctly?
>
> Besides this, TTL has a drawback: it is set when the message is
> created. The processing of our data takes quite a while, and we also have
> to assure processing within a certain time frame. Producers are
> generally faster than consumers, so the number of enqueued messages
> increases. So by setting a TTL we cannot assure that a message is consumed
> within a certain time frame, only that it is available for the set time.
> Are there any mechanisms that would allow us to set a "processing
> timeout" or "commit timeout" by which a message must be committed or else
> be sent to the DLQ?
>
> BTW, what about the parameter maxInactivityDuration? Does it have any
> effect on open sessions/transactions? We also set this, but it did not
> seem to have any effect.
>
> Some information on our environment:
>
> - ActiveMQ 5.3.2
> - JDK 1.6.0_20
> - Equinox OSGi container (Eclipse 3.5)
> - Linux openSUSE 11.1
> - Connection-URL:
>   failover://(tcp://masterhost:61616,tcp://slavehost:61616)?randomize=false
>
> It would be great if you could share your thoughts on this issue.
>
> Bye,
>
> Daniel


-- 
http://blog.garytully.com

Open Source Integration
http://fusesource.com
