activemq-dev mailing list archives

From "Timothy Bish (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-5712) Broker can deadlock when using queues while producers wait on disk space
Date Mon, 13 Apr 2015 17:51:13 GMT

    [ https://issues.apache.org/jira/browse/AMQ-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492733#comment-14492733 ]

Timothy Bish commented on AMQ-5712:
-----------------------------------

Thanks for the patch.  I had uncovered much the same in my testing and have a smaller but
similar test case using the AMQP test client.  The issue occurs due to a race when the temp
store is initialized by a change in memory usage that trips the limit, causing the in-memory
message to need to be sent to disk.  The problem is that in the Queue send we don't yet see that
temp storage is full, so we try to do the add.  While your fix does work around this, it would
result in the loss of the message(s) that arrive while this is happening, which is not something
we would want in the case of a Queue, which has specific QoS guarantees.

I am taking a look at how things work today and rethinking some of the layering of add vs
tryAdd, as it seems a bit wrong to me and can lead to this sort of error in more than one case,
as you have pointed out.
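
To make the add vs tryAdd distinction concrete, here is a minimal sketch.  It is not the
broker code: BoundedCursor, sendWithRetry and the String message type are hypothetical
stand-ins, and only the method name tryAddMessageLast is borrowed from the cursors.  It
assumes a store with a fixed capacity and shows a timed add that reports failure instead of
blocking forever, plus a caller that retries rather than dropping the message.

```java
import java.util.ArrayDeque;
import java.util.Queue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

public class TryAddSketch {

    /** Hypothetical bounded store standing in for a cursor with limited temp storage. */
    static final class BoundedCursor {
        private final ReentrantLock lock = new ReentrantLock();
        private final Condition spaceFreed = lock.newCondition();
        private final Queue<String> messages = new ArrayDeque<>();
        private final int capacity;

        BoundedCursor(int capacity) {
            this.capacity = capacity;
        }

        /** Waits at most maxWaitMillis for space; returns false rather than blocking forever. */
        boolean tryAddMessageLast(String msg, long maxWaitMillis) throws InterruptedException {
            long nanos = TimeUnit.MILLISECONDS.toNanos(maxWaitMillis);
            lock.lock();
            try {
                while (messages.size() >= capacity) {
                    if (nanos <= 0L) {
                        return false; // still full after the timeout: let the caller decide
                    }
                    // awaitNanos releases this lock while waiting and returns the time left
                    nanos = spaceFreed.awaitNanos(nanos);
                }
                messages.add(msg);
                return true;
            } finally {
                lock.unlock();
            }
        }

        /** Removes the oldest message, if any, and wakes producers waiting for space. */
        String poll() {
            lock.lock();
            try {
                String msg = messages.poll();
                if (msg != null) {
                    spaceFreed.signalAll();
                }
                return msg;
            } finally {
                lock.unlock();
            }
        }

        int size() {
            lock.lock();
            try {
                return messages.size();
            } finally {
                lock.unlock();
            }
        }
    }

    /** Retries the timed add until it succeeds, so the message is never dropped. */
    static int sendWithRetry(BoundedCursor cursor, String msg, long waitMillis)
            throws InterruptedException {
        int attempts = 1;
        while (!cursor.tryAddMessageLast(msg, waitMillis)) {
            attempts++; // between attempts no cursor lock is held by this thread
        }
        return attempts;
    }

    public static void main(String[] args) throws Exception {
        BoundedCursor cursor = new BoundedCursor(1);
        cursor.tryAddMessageLast("first", 10);

        // The store is full, so a timed add gives up instead of hanging.
        System.out.println(cursor.tryAddMessageLast("second", 10)); // prints "false"

        // A consumer frees space a little later; the producer keeps retrying until then.
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            cursor.poll();
        });
        consumer.start();
        int attempts = sendWithRetry(cursor, "second", 10);
        consumer.join();
        System.out.println("accepted after " + attempts + " attempt(s)");
    }
}
```

The point of the sketch is only the shape of the contract: a false return from the timed
add hands control back to the caller, which can release whatever outer lock it holds and
try again, whereas an untimed add gives the caller no such opportunity.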

> Broker can deadlock when using queues while producers wait on disk space
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5712
>                 URL: https://issues.apache.org/jira/browse/AMQ-5712
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.11.1
>            Reporter: Christopher L. Shannon
>
> I am experiencing a deadlock when using a Queue with non-persistent messages.  The queue
> has a cursor high memory water mark set (right now at 70%).  When a producer is producing
> messages quickly to the queue and that limit gets hit, the broker can deadlock.  I have tried
> setting producerWindowSize and alwaysSyncSend, which did not seem to help.  When the broker
> hits that limit, I am unable to do things like purge the queue.  Consumers can deadlock
> as well.
> Note that this appears to be the same issue as described in AMQ-2475.  The difference
> is that I am using a Queue and not a Topic, and the fix for that ticket appears
> to have been only for Topics.
> The problem appears to be in the Queue class on line 1852 inside the {{cursorAdd}} method.
> The method being called is {{return messages.addMessageLast(msg);}}, which will block indefinitely
> if there is no space available, which in turn keeps the {{messagesLock}} from being used
> by any other threads.  We have seen a deadlock where consumers can't consume because they
> are waiting on this lock.  It looks like in AMQ-2475 part of the fix was to replace
> {{messages.addMessageLast(msg)}} with {{messages.tryAddMessageLast(msg, 10)}}.  I also noticed
> that not all of the message cursors support {{tryAddMessageLast}}, which could be a problem.
> {{FilePendingMessageCursor}} implements it, but the rest of the cursors (notably
> {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent class.  So part
> of this fix may require implementing {{tryAddMessageLast}} across more cursors.
> Here is part of the thread dump showing the stuck producer:
> {code}
> "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
>         at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
>         - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>         at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
>         - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>         at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
>         - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
