qpid-users mailing list archives

From Ted Ross <tr...@redhat.com>
Subject Re: Network Outage Causes Message Loss on Federated Routes
Date Fri, 06 Nov 2009 14:10:21 GMT
On 11/05/2009 11:55 AM, Cullen Davis wrote:
> Our Qpid-based product will be deployed with brokers federated across networks
that are characterized as "disconnected, interrupted, and low-bandwidth".   We are using a
dedicated hardware based network shaper to simulate the network conditions in order to test
our solution.
>
> Qpid is performing very well in the tests involving high bit error rates, packet loss,
and high latencies.  However, our solution is not meeting threshold objectives in tests involving
extended network outages (packet loss = 100%).
>
> Our solution utilizes Qpid 0.5 C++ brokers and clients running on RedHat Enterprise Linux
5.4.  The brokers are utilizing direct exchanges and have been federated as follows:
>    qpid-route  --durable dynamic add brokerB brokerA fed.direct
>
> The qpid-route command created a new queue, named "bridge-queue" at brokerA.  The new
queue had queue properties of durable=False, exclusive=True and autoDelete=True.
>
> Our test begins with 1000 messages being published into broker A at a rate of 1 per second.
 The network connection between broker A and broker B is set to run at 56kbps for 5 minutes
and then degrade to a network outage stage (100% packet loss) for 15 minutes.
>
> The test begins and broker B starts receiving the messages through the federated route
at a frequency of 1 per second.  About seven minutes into the network outage stage, broker
A throws a timeout error:
>
>    Connection timed out: closing
>    DISCONNECTED 150.nnn.nnn.nnn (broker B's ip)
>
> This results in the bridge-queue on broker A being deleted.  When the network connection
is re-established, the bridge-queue is rebuilt, but none of the messages that were published
into Broker A during the network outage were federated to broker B.  Essentially, this means
that broker B never receives more than half of the messages received by broker A.
>
> The current theory is that the federated route is backed by a bridge-queue with an autoDelete
property of true.  When the network outage occurs, the queue is deleted and the message counts
are flushed.  The durable flag on the route causes the bridge-queue to be rebuilt when the
brokers reconnect, but there is no way for the bridge-queue to establish what messages have
not been federated.  Could setting the autoDelete property to False fix the problem?   I am unsure
of how to properly set this property on a "system management" queue.
>
> Any thoughts on how to properly configure a broker link / route that can survive extended
network outages would be greatly appreciated.
>
> Cullen J. Davis
> CommIT Enterprises, Inc.
>
> ---------------------------------------------------------------------
> Apache Qpid - AMQP Messaging Implementation
> Project:      http://qpid.apache.org
> Use/Interact: mailto:users-subscribe@qpid.apache.org
>
>    
Your theory is correct.  An "exchange" route causes a temporary transit
queue to be created to hold messages waiting to be sent from broker to
broker.  Even though the route is durable, meaning it will be
re-established after a broker restart, the temporary queue is not (it is
exclusive/auto-delete), so any messages in the queue when a restart, or
a connection timeout like the one you describe, occurs will be lost.

You can use a "queue" route where, rather than connecting to a remote
exchange, the destination broker subscribes to an existing queue on the
source broker.  This queue can be non-exclusive and durable.  Be sure to
use the --ack N
option in qpid-route where N is a number greater than zero.  This will 
cause the inter-broker route to use message acknowledgement in such a 
way that recovery will be clean (i.e. the source broker will not discard 
messages from the queue until they are acknowledged by the destination 
broker).
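
As a rough sketch of the queue route described above, the commands might
look like the following.  The queue name "fed.transit" and the binding
key "some.key" are made up for illustration, the -a (broker address)
option of qpid-config is assumed to be available in your version, and
--ack 1 is just one possible value of N.  By default the script only
prints the commands; set RUN=1 to actually execute them.

```shell
#!/bin/sh
# Sketch only: names below are placeholders, not values from a real deployment.
SRC=brokerA            # source broker (holds the durable queue)
DST=brokerB            # destination broker (subscribes to the source queue)
EXCHANGE=fed.direct    # exchange used in the original dynamic route
QUEUE=fed.transit      # hypothetical name for the durable transit queue
KEY=some.key           # hypothetical binding key
ACK=1                  # N for --ack; must be greater than zero

# Print each command unless RUN=1 is set, in which case execute it.
run() {
    if [ "${RUN:-0}" = "1" ]; then "$@"; else echo "$@"; fi
}

setup_queue_route() {
    # 1. Create a durable queue on the source broker and bind it to the
    #    exchange so that published messages accumulate there.
    run qpid-config -a "$SRC" add queue "$QUEUE" --durable
    run qpid-config -a "$SRC" bind "$EXCHANGE" "$QUEUE" "$KEY"

    # 2. Create a durable queue route: the destination broker subscribes to
    #    the source queue and publishes into its own exchange, with
    #    acknowledgement enabled via --ack.
    run qpid-route --durable --ack "$ACK" queue add "$DST" "$SRC" "$EXCHANGE" "$QUEUE"
}

setup_queue_route
```

Because the queue is created explicitly rather than by the route, it is
not removed when the inter-broker connection drops, and unacknowledged
messages stay on the source broker until the link comes back.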

The downside of the queue route solution is that you don't get the
dynamic binding behavior.  It would be possible (though it is not
currently implemented) to use durable transit queues when durable routes
are created, so that no messages would be lost in the event of broker
failure.

-Ted



