Message-ID: <1874035348.1257606773852.JavaMail.jira@brutus>
Date: Sat, 7 Nov 2009 07:12:53 -0800 (PST)
From: "Dominic Tootell (JIRA)"
To: dev@activemq.apache.org
Subject: [jira] Commented: (AMQ-2475) If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
In-Reply-To: <534568635.1257242632754.JavaMail.jira@brutus>

    [ https://issues.apache.org/activemq/browse/AMQ-2475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=55210#action_55210 ]

Dominic Tootell commented on AMQ-2475:
--------------------------------------

I investigated this and attempted to patch it locally in activemq-core, on a fusesource 5.3.0.4 build (Mac OS X 10.6.1). I've run the following tests on the patch, which I'll attach (the patch diffs, the patched .java and the broker xml I used in testing).

The test cases I've run overnight and this morning/afternoon are:

- Virtual Topic (VirtualTopic.iplayer -> Consumer.A.VirtualTopic.iplayer)
  - 3 x Producer, 4,000,000 messages each onto the Virtual Topic (12 million in total)
  - 1 x Consumer
  - 100 MB tmp_store limit

- Virtual Topic (VirtualTopic.iplayer -> Consumer.A.VirtualTopic.iplayer)
  - 6 x Producer, 2,000,000 messages each onto the Virtual Topic (12 million in total)
  - 1 x Consumer
  - 512 MB tmp_store limit

The tmp_storage limit was definitely being enforced, and neither the broker, the producers nor the consumer blocked. Successive runs of du -sh on the tmp_storage area:

{code}
dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M    journal
 48K    kr-store
  0B    lock
512M    tmp-test-broker
dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M    journal
 48K    kr-store
  0B    lock
483M    tmp-test-broker
dominic-tootells-macbook-pro:data dominict$ du -sh *
 96M    journal
 48K    kr-store
  0B    lock
490M    tmp-test-broker
dominic-tootells-macbook-pro:data dominict$ du -sh *
 64M    journal
 48K    kr-store
  0B    lock
 38M    tmp-test-broker
dominic-tootells-macbook-pro:data dominict$
{code}

I've also run the JUnit test provided by Martin; this ran OK too, with no blockage.

I shall attach the potential patches. I haven't yet run any other tests against the patches to see whether they cause any other unforeseen issues (i.e.
normal persistent queue - I will do this later on).

cheers
/dom

> If tmp message store fills up, broker can deadlock due to while producers wait on disk space and consumers wait on acks
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-2475
>                 URL: https://issues.apache.org/activemq/browse/AMQ-2475
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker, Message Store, Transport
>    Affects Versions: 5.3.0
>         Environment: Tested on Windows XP with JDK 1.6.0_13, but fairly sure it will be an issue on all platforms
>            Reporter: Martin Murphy
>            Assignee: Rob Davies
>         Attachments: hangtest.zip
>
>
> I will attach a simple project that shows this. In the test the tmp space is set to 32 MB and two threads are created. One thread constantly produces 1 KB messages and the other consumes them, but sleeps for 100 ms between messages. Note that producer flow control is turned off as well. The goal here is to ensure that the producers block while the consumers read the rest of the messages from the broker and catch up; this in turn frees up disk space and allows the producer to send more messages. This config means that you can bound the broker based on disk space rather than memory usage.
> Unfortunately, in this test using topics, while the broker is reading in a message from the producer it has to lock the matched list it is adding the message to. The list is an abstraction from the Topic's point of view, so the Topic doesn't realize that the add may block on the file system.
> {code}
>     public void add(MessageReference node) throws Exception { //... snip ...
>             if (maximumPendingMessages != 0) {
>                 synchronized (matchedListMutex) {   // We have this mutex
>                     matched.addMessageLast(node);   // ends up waiting for space
>                     // NOTE - be careful about the slaveBroker!
>                     if (maximumPendingMessages > 0) {
> {code}
> Meanwhile the consumer is sending acknowledgements for the 10 messages it just read in (the configured prefetch) from the same topic. Since the acknowledgements also modify the same list in the topic, this path also waits on the mutex that is held to service the producer:
> {code}
>     private void dispatchMatched() throws IOException {
>         synchronized (matchedListMutex) {  // never gets past here.
>             if (!matched.isEmpty() && !isFull()) {
> {code}
> This is a fairly classic deadlock. The trick now is how to resolve this, given that the topic isn't aware that its list may need to wait for the file system to clean up.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
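
For illustration only, here is a minimal sketch of one general way to avoid the lock ordering problem described in the quoted issue: wait for store space before taking the list mutex, so that the acknowledgement path can always acquire the mutex and free space. This is not the patch discussed above and it is not ActiveMQ code. The class BoundedPendingList, the freeSlots semaphore (standing in for free tmp store space) and runDemo are hypothetical names introduced for the example; only the name matchedListMutex is borrowed from the quoted excerpt.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;

// Hypothetical stand-in for a disk-backed pending list; NOT ActiveMQ code.
class BoundedPendingList {
    private final Object matchedListMutex = new Object();
    private final Deque<String> matched = new ArrayDeque<>();
    private final Semaphore freeSlots; // assumption: models free tmp store space

    BoundedPendingList(int capacity) {
        this.freeSlots = new Semaphore(capacity);
    }

    // Producer path: block for space BEFORE taking the mutex, so a full
    // store never stalls a thread that is holding the lock.
    void add(String message) throws InterruptedException {
        freeSlots.acquire();
        synchronized (matchedListMutex) {
            matched.addLast(message);
        }
    }

    // Consumer/ack path: holds the mutex only briefly, then frees space.
    String poll() {
        String message;
        synchronized (matchedListMutex) {
            message = matched.pollFirst();
        }
        if (message != null) {
            freeSlots.release();
        }
        return message;
    }

    // Runs one producer and one consumer against a small store and returns
    // how many messages the consumer received.
    static int runDemo(int capacity, int total) {
        BoundedPendingList list = new BoundedPendingList(capacity);
        int[] received = {0};
        Thread producer = new Thread(() -> {
            try {
                for (int i = 0; i < total; i++) {
                    list.add("msg-" + i);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        Thread consumer = new Thread(() -> {
            while (received[0] < total) {
                if (list.poll() != null) {
                    received[0]++;
                } else {
                    Thread.yield();
                }
            }
        });
        producer.start();
        consumer.start();
        try {
            producer.join(10_000);
            consumer.join(10_000);
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return received[0];
    }
}
```

Calling BoundedPendingList.runDemo(4, 1000) pushes 1000 messages through a store that holds only 4 at a time. If acquire() were moved inside the synchronized block, the producer could hold the mutex while waiting for space that only poll() can free, which is the shape of the hang reported in the issue.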