qpid-users mailing list archives

From:    fadams <fraser.ad...@blueyonder.co.uk>
Subject: Opinions sought on handling some edge cases.
Date:    Fri, 01 Jul 2011 11:22:46 GMT
If one starts to play with qpid in any significant way, a number of edge cases
seem to crop up fairly often. I've got a few scenarios below and it'd be really
interesting to hear opinions on them and how others have solved or worked
around them.

Some of the scenarios are somewhat interrelated, I guess. Sorry, it's ended
up getting quite long...

1. "rogue" (possibly just slow) consumers killing all data flows.
In this scenario we may have a producer publishing to a topic or headers
exchange and a number of consumers. If one of these consumers fails or slows
down and stops consuming, eventually its queue will fill up and an exception
will be thrown. This will either directly hit the producer, or in a
federated environment the federation link will "blow". In either case data
stops flowing to ALL consumers, which is clearly undesirable.

I think that the "standard" solution to this is to use circular/ring queues;
however a) the default policy is reject and b) ring queues need to fit into
memory. I'll cover these below in other scenarios.

Is the current "perceived wisdom" that ring queues are the best/only way to
prevent slow consumers killing data flow for all?

I have been thinking around this. Given that qpid 0.10 supports a
queueThresholdExceeded QMF event, it should be possible to write a QMF client
to unbind queues that are filling up, and potentially do other useful things -
something like the sketch below.
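
To make that concrete, here's a very rough sketch of the kind of "fuse" client
I have in mind, using the Python qmf.console API. I'm writing this from
memory, so the event and argument names (and what you'd actually do when it
triggers) are assumptions to be checked rather than working code:

    # Rough sketch of a QMF "fuse" client: listen for queueThresholdExceeded
    # events and react to the offending queue (class/field names from memory,
    # so treat them as assumptions).
    import time
    from qmf.console import Console, Session

    class ThresholdWatcher(Console):
        def event(self, broker, event):
            # Only interested in the broker's queue-threshold-crossing event.
            if event.classKey.getClassName() == "queueThresholdExceeded":
                qname = event.arguments.get("qName")  # assumed argument name
                print("Queue %s has hit its limit" % qname)
                # This is where one would look up the queue's bindings via QMF
                # and remove them, or take whatever other "fuse" action fits.

    session = Session(ThresholdWatcher(), rcvEvents=True)
    broker = session.addBroker("amqp://localhost:5672")
    try:
        while True:
            time.sleep(1)  # real code would wait on a shutdown condition
    finally:
        session.delBroker(broker)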

2. default policy is reject
Given scenario 1, it's perhaps a pity that the default policy is reject.
Indeed, I don't believe that it's possible to change the default policy on
the broker, which means that in an operational environment one has to rely on
subscribers explicitly setting the policy to ring. This seems risky to me!!!
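
For reference, this is the sort of thing every subscriber currently has to
remember to ask for. A minimal sketch using the Python messaging API follows
(the same address string would apply to a JMS destination); the queue name,
binding key and size limit are all arbitrary examples:

    # What each subscriber has to remember to request today: a ring queue
    # with an explicit size limit (the limit here is an arbitrary example).
    from qpid.messaging import Connection

    connection = Connection("localhost:5672")
    connection.open()
    try:
        session = connection.session()
        receiver = session.receiver("""my-sub-queue; {
            create: receiver,
            node: {
                type: queue,
                x-declare: {arguments: {'qpid.policy_type': 'ring',
                                        'qpid.max_count': 10000}},
                x-bindings: [{exchange: 'amq.topic', key: 'some.subject.#'}]
            }
        }""")
        message = receiver.fetch(timeout=60)
        session.acknowledge()
    finally:
        connection.close()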

I believe that it's possible to enforce this using ACL, but that then
requires authentication to be enabled (in our environment we were hoping to
go with a self-service, trust-and-verify - i.e. audit-based - approach).
Possibly we'll need to rethink that. Has anyone else had experience here?

Again, I guess using queueThresholdExceeded to unbind queues that are filling
up might help - I wouldn't then need to enforce a particular policy; I could
simply implement what amounts to a fuse.

I'm interested to hear debate on what people think is the best strategy.

3. ring queues need to fit into memory.
So the position that I'm taking "architecturally" in our system is that it's
not the role of the data distribution system to buffer in order to protect
against poorly designed end consumers (a bit of elasticity to cope with
burstiness is OK), so I'm expecting consumers to be adequately scaled and to
provide clustering/failover such that they really ought not to become slow
consumers unless things get really bad. So for the most part either ring
queues or the fuse I described above ought to be adequate.

However, I've got a federated topology, and in some cases the WAN link between
the source broker and the destination broker might not be exactly reliable.

Here's where scenario 3 causes pain. If the WAN goes down, the queue on the
source broker starts to fill and eventually old data gets overwritten by
new.

If I use a circular buffer the maximum buffering capacity is very much
dependent on available memory. With a persistent queue things are about as
bad: the maximum size is, I believe, 128GB, but I'll eventually fill it and I
can't make it behave in a ring manner. I guess if I were willing to throw
cash at the problem I could buy a box with lots of memory and hold more than
128GB that way; either way there's a bit of a problem.

I was wondering about using queueThresholdExceeded again. I guess that it
may be possible to detect when the queue hits its limit and automatically
start a client (clearly on the appropriate side of the WAN) to pull off the
messages that are backing up and write them out to disc. If I record the
queue name, I think that once I detect the WAN connection has been
re-established I could have my protection client write back to the queue via
the direct exchange.

Does this sound do-able? Can anyone suggest a better solution to this type
of problem?
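
To check whether the idea hangs together, here's a minimal sketch of the two
halves of such a protection client using the Python messaging API. The queue
name, broker addresses and on-disc format are all placeholders, and the error
handling is hand-waved:

    # Sketch of the two halves of the "protection client": drain a backed-up
    # queue to disc, then replay it via the default/direct exchange once the
    # WAN link is back. Queue name and on-disc format are placeholders.
    import pickle
    from qpid.messaging import Connection, Message

    QUEUE = "federation.backup.queue"  # hypothetical queue that is backing up

    def drain_to_disc(broker_url, path):
        connection = Connection(broker_url)
        connection.open()
        try:
            session = connection.session()
            receiver = session.receiver(QUEUE)
            with open(path, "ab") as out:
                while True:
                    try:
                        message = receiver.fetch(timeout=1)
                    except Exception:
                        break  # fetch timed out, so the queue is drained
                    pickle.dump((message.subject, message.content), out)
                    session.acknowledge()
        finally:
            connection.close()

    def replay_from_disc(broker_url, path):
        connection = Connection(broker_url)
        connection.open()
        try:
            # Sending with the queue name as the address/routing key puts the
            # saved messages straight back onto the queue via the direct route.
            sender = connection.session().sender(QUEUE)
            with open(path, "rb") as infile:
                while True:
                    try:
                        subject, content = pickle.load(infile)
                    except EOFError:
                        break
                    sender.send(Message(content=content, subject=subject))
        finally:
            connection.close()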

It's a bit of a shame to have to write clients to do this sort of thing
though. The qpid persistence mechanism is pretty cool and very efficient,
but it does seem quite limited. Perhaps this isn't as common an edge case as
I imagine?

4. Exceptions with asynchronous producers.
So asynchronous producers are cool, but if an exception is thrown, e.g. on a
resource limit being exceeded, how do I work out exactly what has been sent
to the broker? What I mean is that on the client side I call send() and that
returns when the data reaches the client runtime, NOT when it has
successfully hit the broker.

Now I can make things synchronous, but that hoses performance. I could use
transactions, since if commit returns I know my data has hit the broker; but
if the tx size is small, performance gets hit much as with synchronous sends,
and if the tx size is too large, throughput can get "lumpy".

Is it possible to find out how many messages are pending beyond "send" in
the client runtime, so I know from where I need to resend my messages? I'm
particularly interested in the Java JMS API, as this doesn't have some of the
subtle nuances/control of the C++ messaging API (but I'm interested in how
to do it from C++ too).
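
For what it's worth, here's the kind of bookkeeping I'm after, sketched with
the Python messaging API on the assumption that Sender.unsettled() reports
sends not yet confirmed by the broker (the C++ API has an equivalent
Sender::getUnsettled()); whether JMS exposes anything similar is exactly my
question:

    # Sketch of recovering from a failed asynchronous send by asking the
    # sender how many sends are still unconfirmed. Assumes Sender.unsettled()
    # behaves as I remember; the address and payloads are placeholders.
    from qpid.messaging import Connection, Message

    connection = Connection("localhost:5672")
    connection.open()
    try:
        session = connection.session()
        sender = session.sender("amq.topic/some.subject")
        sent = []
        try:
            for i in range(100000):
                message = Message(content="payload-%d" % i)
                sent.append(message)
                sender.send(message, sync=False)  # asynchronous send
            session.sync()  # block until everything has reached the broker
        except Exception:
            # unsettled() should be the number of sends the broker has not
            # yet confirmed, so the tail of 'sent' is what needs resending.
            pending = sender.unsettled()
            to_resend = sent[len(sent) - pending:]
            print("%d messages need resending" % len(to_resend))
    finally:
        connection.close()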

