Date: Fri, 1 Jul 2011 04:22:46 -0700 (PDT)
From: fadams
To: users@qpid.apache.org
Subject: Opinions sought on handling some edge cases.

If one starts to play with qpid in any significant way there seem to be a
number of edge cases that commonly crop up. I've got a few scenarios and it'd
be really interesting to hear opinions and how others have solved/worked
around them. Some of the scenarios are somewhat interrelated I guess. Sorry,
it's ended up getting quite long...

1. "Rogue" (possibly just slow) consumers killing all data flows.

In this scenario we may have a producer publishing to a topic or headers
exchange and a number of consumers. If one of these consumers fails or slows
down and stops consuming, eventually its queue will fill up and an exception
will be thrown. This will either directly hit the producer, or in a federated
environment the link will "blow". In either case data stops flowing to ALL
consumers, which is clearly undesirable.

I think that the "standard" solution to this is to use circular/ring queues;
however a) the default policy is reject and b) ring queues need to fit into
memory. I'll cover these below in other scenarios. Is the current "perceived
wisdom" that ring queues are the best/only way to prevent slow consumers
killing data flow for all?

I have been thinking around this. Given that qpid 0.10 supports a
queueThresholdExceeded event it should be possible to write a QMF client to
unbind queues that are filling up and potentially do other useful things
(there's a rough sketch of what I mean below).
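To make that concrete, here's the sort of thing I had in mind - a completely
untested sketch using the Python QMF console that ships with 0.10. The event
argument name (qName) is from memory, and the exchange and binding key passed
to qpid-config are just placeholders; a real version would need to look the
actual bindings up.

from qmf.console import Session, Console
import subprocess

class ThresholdWatcher(Console):
    # Console callback invoked for every QMF event raised by the broker.
    def event(self, broker, event):
        if event.classKey.getClassName() != "queueThresholdExceeded":
            return
        queue = event.arguments.get("qName")   # queue that has filled up (name from memory)
        print "queue %s has exceeded its threshold, unbinding it" % queue
        # Detach the queue from its exchange so that data keeps flowing to
        # everyone else. Exchange name and binding key are placeholders here.
        subprocess.call(["qpid-config", "unbind", "amq.topic", queue, "#"])

sess = Session(ThresholdWatcher(), rcvEvents=True)
broker = sess.addBroker("localhost:5672")
raw_input("watching for queueThresholdExceeded events, hit return to quit\n")
sess.delBroker(broker)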
2. Default policy is reject.

Given scenario 1 it's perhaps a pity that the default policy is reject.
Indeed, I don't believe that it's possible to change the default policy on
the broker, which means that in an operational environment one has to rely on
subscribers to explicitly set the policy to ring. This seems risky to me!!!

I believe that it's possible to enforce this using ACL, but that then
requires authentication to be enabled (in our environment we were hoping to
go with a self-service, trust-and-verify - i.e. audit-based - approach).
Possibly we'll need to rethink that. Has anyone else had experience here?
Again, I guess using queueThresholdExceeded to unbind queues that are filling
up might help - I wouldn't then need to enforce a particular policy, I could
simply implement what amounts to a fuse. I'm interested to hear debate on
what people think is the best strategy.

3. Ring queues need to fit into memory.

The position that I'm taking "architecturally" in our system is that it's not
the role of the data distribution system to buffer in order to protect
against poorly designed end consumers (a bit of elastic to cope with
burstiness is OK), so I'm expecting consumers to be adequately scaled and to
provide clustering/failover such that they really ought not to become slow
consumers unless things get really bad. So for the most part either ring
queues or the fuse I described above ought to be adequate.

However, I've got a federated topology and in some cases the WAN link between
the source broker and the destination broker might not be exactly reliable.
Here's where scenario 3 causes pain. If the WAN goes down the queue on the
source broker starts to fill and eventually old data gets overwritten by new.
If I use a circular buffer the maximum buffering capacity is very much
dependent on available memory. With a persistent queue things are about as
bad: the maximum size is, I believe, 128GB, but I'll eventually fill it and I
can't make it behave in a ring manner. I guess if I were willing to throw
cash at the problem I could buy a box with lots of memory and have more than
128GB stored; either way there's a bit of a problem.

I was wondering about using queueThresholdExceeded again. I guess that it may
be possible to detect when the queue hits its limit and automatically start a
client (clearly on the appropriate side of the WAN) to pull off the messages
that are backing up and write them out to disc. If I record the queue name I
think that once I detect that the WAN connection has been re-established I
could have my protection client write back to the queue via the direct
exchange. Does this sound do-able? Can anyone suggest a better solution to
this type of problem? It's a bit of a shame to be having to write clients to
do this sort of thing though. The qpid persistence mechanism is pretty cool
and very efficient, but it does seem quite limited. Perhaps this isn't as
common an edge case as I imagine?

4. Exceptions with asynchronous producers.

Asynchronous producers are cool, but if an exception is thrown, e.g. on a
resource limit being exceeded, how do I work out exactly what has been sent
to the broker? What I mean is that on the client side I call send() and that
returns when the data reaches the client runtime, NOT when it has
successfully hit the broker. Now I could make things synchronous, but that
hoses performance. I could use transactions, as if commit returns I know my
data has hit the broker, but if the tx size is small performance gets hit
much like the synchronous case, and if the tx size is too high throughput can
get "lumpy". Is it possible to find out how many messages are pending beyond
send() in the client runtime, so that I know the point from which I need to
resend my messages? I'm particularly interested in the Java JMS API, as this
doesn't have some of the subtle nuances/control of the C++ messaging API (but
I'm interested in how to do it from C++ too). There's a rough sketch of the
transaction compromise I'm picturing below.
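To illustrate that compromise (Python messaging API here purely because it's
terse - the pattern would be the same with a JMS transacted session): keep
sending asynchronously within a batch, commit every N messages and record the
index of the last committed message, so that after a failure I know
everything up to that point is on the broker and exactly where to resend
from. The batch size, address and error handling are all just placeholders.

from qpid.messaging import Connection, Message, MessagingError

BATCH = 100                # placeholder - tune for throughput vs. "lumpiness"
conn = Connection("localhost:5672")
conn.open()
committed = 0              # messages 1..committed are known to be on the broker
try:
    ssn = conn.session(transactional=True)
    snd = ssn.sender("amq.topic/my.key")       # placeholder address
    messages = [Message(content="message %i" % i) for i in range(1000)]
    for i, msg in enumerate(messages, 1):
        snd.send(msg, sync=False)              # stays asynchronous within the batch
        if i % BATCH == 0:
            ssn.commit()                       # commit returning => batch is on the broker
            committed = i
    ssn.commit()
    committed = len(messages)
except MessagingError:
    # everything after 'committed' is in doubt and needs to be resent
    print "resend from message %i onwards" % (committed + 1)
finally:
    conn.close()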
--
View this message in context: http://apache-qpid-users.2158936.n2.nabble.com/Opinions-sought-on-handling-some-edge-cases-tp6537357p6537357.html
Sent from the Apache Qpid users mailing list archive at Nabble.com.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org