X-Original-To: apmail-activemq-dev-archive@www.apache.org
Delivered-To: apmail-activemq-dev-archive@www.apache.org
Received: from mail.apache.org (hermes.apache.org [140.211.11.3])
    by minotaur.apache.org (Postfix) with SMTP id 3637E173D4
    for ; Mon, 13 Apr 2015 19:12:14 +0000 (UTC)
Received: (qmail 99433 invoked by uid 500); 13 Apr 2015 19:12:14 -0000
Delivered-To: apmail-activemq-dev-archive@activemq.apache.org
Received: (qmail 99366 invoked by uid 500); 13 Apr 2015 19:12:14 -0000
Mailing-List: contact dev-help@activemq.apache.org; run by ezmlm
Precedence: bulk
Reply-To: dev@activemq.apache.org
Delivered-To: mailing list dev@activemq.apache.org
Received: (qmail 99354 invoked by uid 99); 13 Apr 2015 19:12:13 -0000
Received: from arcas.apache.org (HELO arcas.apache.org) (140.211.11.28)
    by apache.org (qpsmtpd/0.29) with ESMTP; Mon, 13 Apr 2015 19:12:13 +0000
Date: Mon, 13 Apr 2015 19:12:13 +0000 (UTC)
From: "Timothy Bish (JIRA)"
To: dev@activemq.apache.org
Subject: [jira] [Commented] (AMQ-5712) Broker can deadlock when using queues while producers wait on disk space
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 7bit
X-JIRA-FingerPrint: 30527f35849b9dde25b450d4833f0394


    [ https://issues.apache.org/jira/browse/AMQ-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492884#comment-14492884 ]

Timothy Bish commented on AMQ-5712:
-----------------------------------

PFC in this case doesn't matter, as the problem occurs outside the normal checks: the add of the message into the cursor stalls while holding the message lock, preventing any consumer from ever pulling a message off the Queue. The Queue basically becomes unusable until the broker is restarted.

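To illustrate the locking pattern under discussion, here is a minimal sketch in plain Java. It is not ActiveMQ code: the class {{TimedCursorAddSketch}}, the toy {{BoundedCursor}}, the capacity limit standing in for the usage limit, the retry count, and the use of a fair {{ReentrantLock}} are all assumptions made for the example. Only the names {{messagesLock}}, {{cursorAdd}} and {{tryAddMessageLast(msg, 10)}} are borrowed from the issue description. The point it tries to show is that a bounded wait lets the producer drop the lock between attempts, so a consumer needing the same lock is delayed by roughly one wait interval rather than forever.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Illustrative only: a toy queue, not the ActiveMQ Queue class. */
final class TimedCursorAddSketch {

    /** Hypothetical cursor whose capacity stands in for the usage limit. */
    static final class BoundedCursor {
        private final Deque<String> messages = new ArrayDeque<>();
        private final int capacity;
        private final ReentrantLock spaceLock = new ReentrantLock();
        private final Condition spaceFreed = spaceLock.newCondition();

        BoundedCursor(int capacity) {
            this.capacity = capacity;
        }

        /** Waits at most maxWaitMillis for space; returns false if still full. */
        boolean tryAddMessageLast(String msg, long maxWaitMillis) throws InterruptedException {
            long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            spaceLock.lock();
            try {
                while (messages.size() >= capacity) {
                    if (nanos <= 0) {
                        return false; // give up instead of blocking indefinitely
                    }
                    nanos = spaceFreed.awaitNanos(nanos);
                }
                messages.addLast(msg);
                return true;
            } finally {
                spaceLock.unlock();
            }
        }

        /** Removes the oldest message, or returns null if the cursor is empty. */
        String poll() {
            spaceLock.lock();
            try {
                String msg = messages.pollFirst();
                if (msg != null) {
                    spaceFreed.signalAll(); // wake producers waiting for space
                }
                return msg;
            } finally {
                spaceLock.unlock();
            }
        }
    }

    private final BoundedCursor cursor;
    // Fair lock (an assumption of this sketch) so a waiting consumer gets the next turn.
    private final ReentrantLock messagesLock = new ReentrantLock(true);

    TimedCursorAddSketch(int capacity) {
        this.cursor = new BoundedCursor(capacity);
    }

    /** Producer path: holds messagesLock for one bounded wait at a time. */
    boolean cursorAdd(String msg, int maxAttempts) throws InterruptedException {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            messagesLock.lock();
            try {
                if (cursor.tryAddMessageLast(msg, 10)) {
                    return true;
                }
            } finally {
                messagesLock.unlock(); // released between attempts so consumers can run
            }
        }
        return false;
    }

    /** Consumer path: needs the same messagesLock as the producer. */
    String consume() {
        messagesLock.lock();
        try {
            return cursor.poll();
        } finally {
            messagesLock.unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        TimedCursorAddSketch queue = new TimedCursorAddSketch(1);
        System.out.println("m1 added: " + queue.cursorAdd("m1", 1));

        AtomicBoolean added = new AtomicBoolean();
        Thread producer = new Thread(() -> {
            try {
                added.set(queue.cursorAdd("m2", 1000)); // cursor is full, so this retries
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        System.out.println("consumed: " + queue.consume()); // not starved by the producer
        producer.join();
        System.out.println("m2 added: " + added.get());
        System.out.println("consumed: " + queue.consume());
    }
}
```

If the wait inside {{cursorAdd}} were unbounded instead, the producer would keep {{messagesLock}} while waiting for space that only a consumer can free, and the consumer would wait for the lock: the stall described above.
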
> Broker can deadlock when using queues while producers wait on disk space
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5712
>                 URL: https://issues.apache.org/jira/browse/AMQ-5712
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.11.1
>            Reporter: Christopher L. Shannon
>
> I am experiencing a deadlock when using a Queue with non-persistent messages. The queue has a cursor high memory water mark set (right now at 70%). When a producer is producing messages quickly to the queue and that limit gets hit, the broker can deadlock. I have tried setting producerWindowSize and alwaysSyncSend, which did not seem to help. When the broker hits that limit, I am unable to do things like purge the queue. Consumers can also deadlock.
> Note that this appears to be the same issue as described in AMQ-2475. The difference is that I am using a Queue and not a Topic, and the fix for that ticket appears to have been made only for Topics.
> The problem appears to be in the Queue class on line 1852 inside the {{cursorAdd}} method. The method being called is {{return messages.addMessageLast(msg);}} which will block indefinitely if there is no space available, which in turn prevents any other thread from acquiring the {{messagesLock}}. We have seen a deadlock where consumers can't consume because they are waiting on this lock. It looks like in AMQ-2475 part of the fix was to replace {{messages.addMessageLast(msg)}} with {{messages.tryAddMessageLast(msg, 10)}}. I also noticed that not all of the message cursors support {{tryAddMessageLast}}, which could be a problem. {{FilePendingMessageCursor}} implements it but the rest of the cursors (notably {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent class. So part of this fix may require implementing {{tryAddMessageLast}} across more cursors.
> Here is part of the thread dump showing the stuck producer:
> {code}
> "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>     at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>     at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
>     at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
>     at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
>     at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
>     at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
>     - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>     at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
>     - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>     at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
>     - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)