Date: Thu, 9 Apr 2015 21:55:12 +0000 (UTC)
From: "Timothy Bish (JIRA)"
To: dev@activemq.apache.org
Subject: [jira] [Commented] (AMQ-5712) Broker can deadlock when using queues while producers wait on disk space

    [ https://issues.apache.org/jira/browse/AMQ-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488355#comment-14488355 ]

Timothy Bish commented on AMQ-5712:
-----------------------------------

I believe I've run into this while doing some testing for AMQP flow control issues. Once I did some debugging I remembered this issue while looking into the stack traces.

> Broker can deadlock when using queues while producers wait on disk space
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5712
>                 URL: https://issues.apache.org/jira/browse/AMQ-5712
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.11.1
>            Reporter: Christopher L. Shannon
>
> I am experiencing a deadlock when using a Queue with non-persistent messages. The queue has a cursor high memory water mark set (right now at 70%). When a producer is producing messages quickly to the queue and that limit gets hit, the broker can deadlock. I have tried setting producerWindowSize and alwaysSyncSend which did not seem to help. When the broker hits that limit, I am unable to do things like purge the queue. Consumers can also deadlock as well.
> Note that this appears to be the same issue as described in this ticket here: AMQ-2475 . The difference is that I am using a Queue and not a Topic and the fix for this appears to only have been for Topics.
> The problem appears to be in the Queue class on line 1852 inside the {{cursorAdd}} method. The method being called is {{return messages.addMessageLast(msg);}} which will block indefinitely if there is no space available, which in turn ties up the {{messagesLock}} from being used by any other threads. We have seen a deadlock where consumers can't consume because they are waiting on this lock. It looks like in AMQ-2475 part of the fix was to replace {{messages.addMessageLast(msg)}} with {{messages.tryAddMessageLast(msg, 10)}}. I also noticed that not all of the message cursors support {{tryAddMessageLast}}, which could be a problem. {{FilePendingMessageCursor}} implements it but the rest of the cursors (notably {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent class. So part of this fix may require implementing {{tryAddMessageLast}} across more cursors.
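
To make the locking concern in the description concrete, here is a minimal sketch of a timed add. It is not the ActiveMQ code: the class `TimedCursorAddSketch`, its methods `tryAddLast` and `poll`, and the use of a `Semaphore` as a stand-in for the usage limit are all assumptions made for illustration. The idea is only that a bounded wait lets the producer give up and release the lock, so a consumer waiting on the same lock can free space.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not the broker's Queue or cursor classes.
final class TimedCursorAddSketch {
    // Stand-in for the usage limit: one permit per message slot.
    private final Semaphore space;
    // Plays the role of the lock shared by producers and consumers.
    private final ReentrantLock messagesLock = new ReentrantLock();
    private final Deque<String> messages = new ArrayDeque<>();

    TimedCursorAddSketch(int capacity) {
        this.space = new Semaphore(capacity);
    }

    // Waits at most maxWaitMillis for space, then returns false instead of
    // blocking forever. The lock is always released on the way out.
    boolean tryAddLast(String msg, long maxWaitMillis) {
        messagesLock.lock();
        try {
            if (!space.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                return false; // no space within the window; caller may retry
            }
            messages.addLast(msg);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return false;
        } finally {
            messagesLock.unlock();
        }
    }

    // Consumer side: needs the same lock, and frees one slot per message taken.
    String poll() {
        messagesLock.lock();
        try {
            String msg = messages.pollFirst();
            if (msg != null) {
                space.release();
            }
            return msg;
        } finally {
            messagesLock.unlock();
        }
    }

    public static void main(String[] args) {
        TimedCursorAddSketch cursor = new TimedCursorAddSketch(1);
        System.out.println(cursor.tryAddLast("m1", 10)); // slot available
        System.out.println(cursor.tryAddLast("m2", 10)); // full, gives up after the wait
        System.out.println(cursor.poll());               // consumer frees the slot
        System.out.println(cursor.tryAddLast("m2", 10)); // retry now has space
    }
}
```

With an unbounded wait in place of `tryAcquire`, the second add would hold the lock indefinitely and `poll` could never run, which is the shape of the deadlock described above. A caller of the timed variant would retry in a loop, dropping the lock between attempts.
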
> Here is part of the thread dump showing the stuck producer:
> {code}
> "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
> 	at sun.misc.Unsafe.park(Native Method)
> 	- parking to wait for  <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
> 	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
> 	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
> 	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
> 	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
> 	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
> 	- locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
> 	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
> 	- locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
> 	at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
> 	- locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
> {code}

--
This message was sent by Atlassian JIRA (v6.3.4#6332)