Date: Tue, 7 Apr 2015 18:32:12 +0000 (UTC)
From: "Christopher L. Shannon (JIRA)"
To: dev@activemq.apache.org
Subject: [jira] [Created] (AMQ-5712) Broker can deadlock for queues while producers wait on disk space

Christopher L. Shannon created AMQ-5712:
-------------------------------------------

             Summary: Broker can deadlock for queues while producers wait on disk space
                 Key: AMQ-5712
                 URL: https://issues.apache.org/jira/browse/AMQ-5712
             Project: ActiveMQ
          Issue Type: Bug
          Components: Broker
    Affects Versions: 5.11.1
            Reporter: Christopher L. Shannon

I am experiencing a deadlock when using a Queue with non-persistent messages. The queue has a cursor high memory water mark set (right now at 70%). When a producer is producing messages quickly to the queue and that limit gets hit, the broker can deadlock. I have tried setting producerWindowSize and alwaysSyncSend which did not seem to help.
When the broker hits that limit, I am unable to do things like purge the queue. Consumers can also deadlock.

Note that this appears to be the same issue as described in AMQ-2475. The difference is that I am using a Queue and not a Topic, and the fix for AMQ-2475 appears to have covered only Topics.

The problem appears to be in the Queue class on line 1852, inside the {{cursorAdd}} method. The method being called is {{return messages.addMessageLast(msg);}}, which will block indefinitely if there is no space available. That in turn keeps {{messagesLock}} held, so no other thread can acquire it. We have seen a deadlock where consumers can't consume because they are waiting on this lock.

It looks like part of the fix in AMQ-2475 was to replace {{messages.addMessageLast(msg)}} with {{messages.tryAddMessageLast(msg, 10)}}.

I also noticed that not all of the message cursors support {{tryAddMessageLast}}, which could be a problem. {{FilePendingMessageCursor}} implements it, but the rest of the cursors (notably {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent class. So part of this fix may require implementing {{tryAddMessageLast}} across more cursors.
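To make the locking pattern concrete, here is a minimal, self-contained sketch of a bounded add that releases the lock between attempts. It is not ActiveMQ code: the class {{BoundedAddSketch}}, its methods {{send}}, {{purge}} and {{size}}, and the {{freeSlots}} semaphore standing in for the usage limit are all hypothetical names chosen for illustration. Only the idea (wait a bounded time for space, then let go of {{messagesLock}} so other operations can run) is taken from the description above.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for a queue with a space limit; NOT the ActiveMQ Queue class.
class BoundedAddSketch {
    private final ReentrantLock messagesLock = new ReentrantLock();
    private final Deque<String> messages = new ArrayDeque<>();
    // Assumption: a semaphore models the "space available" check of the usage limit.
    private final Semaphore freeSlots;

    BoundedAddSketch(int capacity) {
        this.freeSlots = new Semaphore(capacity);
    }

    // Waits at most waitMillis per attempt while holding messagesLock,
    // then releases the lock so a purge or a consumer can get in.
    boolean send(String msg, long waitMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            messagesLock.lock();
            try {
                if (freeSlots.tryAcquire(waitMillis, TimeUnit.MILLISECONDS)) {
                    messages.addLast(msg);
                    return true;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
                return false;
            } finally {
                messagesLock.unlock(); // released between attempts
            }
        }
        return false; // still no space after all attempts
    }

    // Needs the same lock as send(); with an unbounded wait inside send()
    // this call could never proceed while the queue is full.
    int purge() {
        messagesLock.lock();
        try {
            int removed = messages.size();
            messages.clear();
            freeSlots.release(removed);
            return removed;
        } finally {
            messagesLock.unlock();
        }
    }

    int size() {
        messagesLock.lock();
        try {
            return messages.size();
        } finally {
            messagesLock.unlock();
        }
    }
}
```

With a capacity of 1, a second {{send}} returns false after its attempts are exhausted instead of blocking forever, and succeeds once {{purge}} has freed the slot. Note that {{ReentrantLock}} is unfair by default, so the retrying producer may reacquire the lock before a waiting thread does; the sketch only guarantees that the lock is released periodically.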
Here is part of the thread dump showing the stuck producer:

{code}
"ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
   java.lang.Thread.State: TIMED_WAITING (parking)
	at sun.misc.Unsafe.park(Native Method)
	- parking to wait for  <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
	at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
	- locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
	at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
	- locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
	at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
	- locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
	at mw.activemq.plugins.broker.adapter.PendingMessageCursorSupport.addMessageLast(PendingMessageCursorSupport.java:66)
	at org.apache.activemq.broker.region.Queue.cursorAdd(Queue.java:1852)
	at org.apache.activemq.broker.region.Queue.orderedCursorAdd(Queue.java:926)
	at org.apache.activemq.broker.region.Queue.doMessageSend(Queue.java:902)
	at org.apache.activemq.broker.region.Queue.send(Queue.java:781)
{code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)