activemq-dev mailing list archives

From "Christopher L. Shannon (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-5712) Broker can deadlock when using queues while producers wait on disk space
Date Thu, 09 Apr 2015 22:15:12 GMT

    [ https://issues.apache.org/jira/browse/AMQ-5712?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14488392#comment-14488392 ]

Christopher L. Shannon commented on AMQ-5712:
---------------------------------------------

I started working on a unit test today to replicate this issue, but I haven't been able to
reproduce it there yet.  In my simple unit test (just one producer), when the temp storage
fills, the broker detects this at line 821 of {{Queue.java}}, where
{{checkUsage(context, producerExchange, message);}} is called.

When I run our real broker, it gets past that line, reaches the {{cursorAdd}} method, and
ultimately gets stuck spinning in {{FilePendingMessageCursor.java}} on line 235 at
{{if (systemUsage.getTempUsage().waitForSpace(maxWaitTime))}}.

Our broker is configured for persistent messaging, but the producer is sending non-persistent
messages, which is why the temp store is used.  Next week I'll continue trying to put a test
together, or figure out what's different about my setup.  We are using some custom code, so
the cause could be that, or it could just be our configuration (maybe something in the Queue
Policy that we have configured).  What I do know is that when I apply my pull request, it
fixes the deadlock and the broker no longer gets stuck on line 235 of {{FilePendingMessageCursor.java}}.
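
To illustrate the locking pattern in question, here is a minimal sketch in plain Java.  It is
not ActiveMQ code and not the pull request: the class {{BoundedQueueSketch}}, its methods, and
the use of a {{Semaphore}} as a stand-in for temp-store usage are all assumptions made for
the example.  It only shows why a bounded wait that gives up and releases the shared lock
lets other threads make progress when no space is available.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Illustrative sketch only; every name here is hypothetical, not ActiveMQ code.
final class BoundedQueueSketch {
    // Stands in for the queue-wide lock that consumers and purge also need.
    private final ReentrantLock messagesLock = new ReentrantLock();
    // Stands in for temp-store usage: one permit per free slot.
    private final Semaphore freeSlots;
    private final Deque<String> messages = new ArrayDeque<>();

    BoundedQueueSketch(int capacity) {
        this.freeSlots = new Semaphore(capacity);
    }

    // Waits at most timeoutMillis for space. On timeout it returns false and
    // releases messagesLock, so the caller can retry without starving others.
    public boolean tryAdd(String msg, long timeoutMillis) {
        messagesLock.lock();
        try {
            if (!freeSlots.tryAcquire(timeoutMillis, TimeUnit.MILLISECONDS)) {
                return false;
            }
            messages.addLast(msg);
            return true;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt status
            return false;
        } finally {
            messagesLock.unlock();
        }
    }

    // Consumer side: takes the same lock and frees a slot when it removes a message.
    public String poll() {
        messagesLock.lock();
        try {
            String msg = messages.pollFirst();
            if (msg != null) {
                freeSlots.release();
            }
            return msg;
        } finally {
            messagesLock.unlock();
        }
    }

    // True while any thread holds messagesLock.
    public boolean isLocked() {
        return messagesLock.isLocked();
    }

    public static void main(String[] args) {
        BoundedQueueSketch q = new BoundedQueueSketch(1);
        System.out.println(q.tryAdd("a", 10)); // true: one slot was free
        System.out.println(q.tryAdd("b", 10)); // false: full, gave up after 10 ms
        System.out.println(q.poll());          // a
        System.out.println(q.tryAdd("b", 10)); // true: the poll freed a slot
    }
}
```

In this sketch, an unbounded add would correspond to calling {{freeSlots.acquire()}} while
holding {{messagesLock}}: with the store full, the producer would never return, and a consumer
calling {{poll()}} could never take the lock to free a slot.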

> Broker can deadlock when using queues while producers wait on disk space
> ------------------------------------------------------------------------
>
>                 Key: AMQ-5712
>                 URL: https://issues.apache.org/jira/browse/AMQ-5712
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.11.1
>            Reporter: Christopher L. Shannon
>
> I am experiencing a deadlock when using a Queue with non-persistent messages.  The queue
> has a cursor high memory water mark set (right now at 70%).  When a producer is producing
> messages quickly to the queue and that limit gets hit, the broker can deadlock.  I have tried
> setting producerWindowSize and alwaysSyncSend, which did not seem to help.  When the broker
> hits that limit, I am unable to do things like purge the queue.  Consumers can also
> deadlock.
> Note that this appears to be the same issue as described in AMQ-2475.  The difference is
> that I am using a Queue and not a Topic, and the fix for that ticket appears to have been
> made only for Topics.
> The problem appears to be in the Queue class on line 1852 inside the {{cursorAdd}} method.
> The method being called is {{return messages.addMessageLast(msg);}}, which will block
> indefinitely if there is no space available, which in turn keeps the {{messagesLock}} from
> being acquired by any other thread.  We have seen a deadlock where consumers can't consume
> because they are waiting on this lock.  It looks like in AMQ-2475 part of the fix was to
> replace {{messages.addMessageLast(msg)}} with {{messages.tryAddMessageLast(msg, 10)}}.  I
> also noticed that not all of the message cursors support {{tryAddMessageLast}}, which could
> be a problem.  {{FilePendingMessageCursor}} implements it, but the rest of the cursors
> (notably {{StoreQueueCursor}}) simply delegate back to {{addMessageLast}} in the parent
> class.  So part of this fix may require implementing {{tryAddMessageLast}} across more cursors.
> Here is part of the thread dump showing the stuck producer:
> {code}
> "ActiveMQ Transport: ssl:///192.168.3.142:38589" daemon prio=10 tid=0x00007fb46c006000 nid=0x3b1a runnable [0x00007fb4b8a0d000]
>    java.lang.Thread.State: TIMED_WAITING (parking)
>         at sun.misc.Unsafe.park(Native Method)
>         - parking to wait for  <0x00000000cfb13cd0> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
>         at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2176)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:103)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:90)
>         at org.apache.activemq.usage.Usage.waitForSpace(Usage.java:80)
>         at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.tryAddMessageLast(FilePendingMessageCursor.java:235)
>         - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>         at org.apache.activemq.broker.region.cursors.FilePendingMessageCursor.addMessageLast(FilePendingMessageCursor.java:207)
>         - locked <0x00000000d2015ee0> (a org.apache.activemq.broker.region.cursors.FilePendingMessageCursor)
>         at org.apache.activemq.broker.region.cursors.StoreQueueCursor.addMessageLast(StoreQueueCursor.java:97)
>         - locked <0x00000000d1f20908> (a org.apache.activemq.broker.region.cursors.StoreQueueCursor)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
