activemq-users mailing list archives

From Tim Bain <tb...@alumni.duke.edu>
Subject Re: How to avoid blocking of queue browsing after ActiveMQ checkpoint call
Date Mon, 11 Jan 2016 14:35:52 GMT
I believe you are correct: browsing a persistent queue uses bytes from the
memory store, because those bytes must be read from the persistence store
into the memory store before they can be handed off to browsers or
consumers.  If all available bytes in the memory store are already in use,
the messages can't be paged into the memory store, and so the operation
that required them to be paged in will hang/fail.
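
To make the mechanism concrete, here is a toy model of that gating. It is only a sketch under stated assumptions: PageInSketch and all of its fields are hypothetical names, not ActiveMQ classes, and the real cursor code is more involved. It only shows that once usage reaches the high-water mark, nothing more gets paged in.

```java
import java.util.List;

/** Toy model only: hypothetical names, not ActiveMQ's actual classes. */
final class PageInSketch {
    private final long limitBytes;           // stands in for the memory store limit
    private final int highWaterMarkPercent;  // stands in for the cursor high-water mark
    private long usedBytes;                  // bytes already held in the memory store

    PageInSketch(long limitBytes, long usedBytes, int highWaterMarkPercent) {
        this.limitBytes = limitBytes;
        this.usedBytes = usedBytes;
        this.highWaterMarkPercent = highWaterMarkPercent;
    }

    /** True while usage is below the high-water mark. */
    boolean hasSpace() {
        return usedBytes * 100 < limitBytes * (long) highWaterMarkPercent;
    }

    /** Pages messages in until space runs out; returns how many were paged in. */
    int pageIn(List<Long> messageSizes) {
        int paged = 0;
        for (long size : messageSizes) {
            if (!hasSpace()) {
                break; // memory store is full: the browser or consumer gets nothing more
            }
            usedBytes += size;
            paged++;
        }
        return paged;
    }
}
```

With the store already full (for example new PageInSketch(100, 100, 70)), pageIn returns 0 no matter how many messages are waiting, which matches an empty browse result.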

You can work around the problem by increasing your memory store size via
trial-and-error until the problem goes away.  Note that the broker itself
needs some amount of memory, so you can't give the whole heap over to the
memory store or you'll risk getting OOMs, which means you may need to
increase the heap size as well.  You can estimate how much memory the
broker needs aside from the memory store by subtracting the bytes used for
the memory store (539 MB) from the total heap bytes used as measured via
JConsole or similar tools.  I'd double (or more) that number to be safe, if
it were me; the last thing I want to deal with in a production application
(ActiveMQ or anything else) is running out of memory because I tried to cut
the memory limits too close just to save a little RAM.
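
The arithmetic above can be sketched as follows. This is only an illustration: BrokerHeapEstimate and its method names are hypothetical, and the heap-used figure is a value you would read from JConsole yourself, not something measured in this thread.

```java
/** Hypothetical helper illustrating the estimate; not part of ActiveMQ. */
final class BrokerHeapEstimate {
    static final long MB = 1024L * 1024L;

    /** Heap bytes the broker uses outside the memory store. */
    static long overheadBytes(long totalHeapUsedBytes, long memoryStoreBytes) {
        if (memoryStoreBytes > totalHeapUsedBytes) {
            throw new IllegalArgumentException("memory store cannot exceed total heap used");
        }
        return totalHeapUsedBytes - memoryStoreBytes;
    }

    /**
     * Suggested max heap: the new memory store limit plus the measured
     * overhead multiplied by a safety factor (2 or more, per the advice above).
     */
    static long suggestedMaxHeapBytes(long totalHeapUsedBytes, long memoryStoreBytes,
                                      long newMemoryLimitBytes, int safetyFactor) {
        long overhead = overheadBytes(totalHeapUsedBytes, memoryStoreBytes);
        return Math.addExact(newMemoryLimitBytes, Math.multiplyExact(overhead, (long) safetyFactor));
    }
}
```

For example, if JConsole showed 900 MB of heap in use (an assumed figure) while the memory store held 539 MB, the overhead would be 361 MB; raising the memory store limit to 1024 MB with a safety factor of 2 would then suggest a max heap of about 1746 MB.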

All of that is how to work around the fact that before you try to browse
your queue, something else has already consumed all available bytes in the
memory store.  If you want to dig into why that's happening, we'd need to
try to figure out what those bytes are being used for and whether it's
possible to change configuration values to reduce the usage so it fits into
your current limit.  That will definitely take more effort than simply
increasing the memory limit (and max heap size), but we can try if
you're not able to increase the limits enough to fix the problem.

If you want to go down that path, one thread to pull on is your observation
that you "can browse/consume some Queues  _until_ the #checkpoint call
after 30 seconds."  I assume from your reference to checkpointing that
you're using KahaDB as your persistence store.  Can you post the KahaDB
portion of your config?

Your statements here and in your StackOverflow post (
http://stackoverflow.com/questions/34679854/how-to-avoid-blocking-of-queue-browsing-after-activemq-checkpoint-call)
indicate that you think that the problem is that memory isn't getting
garbage collected after the operation that needed it (i.e. the checkpoint)
completes, but it's also possible that the checkpoint operation isn't
completing because it can't get enough messages read into the memory
store.  Have you confirmed via the thread dump that there is not a
checkpoint operation still in progress?  Also, how large are your journal
files that are getting checkpointed?  If they're large enough that all
messages for one file won't fit into the memory store, you might be able to
prevent the problem by using smaller files.

Tim
On Jan 8, 2016 9:32 AM, "Klaus Pittig" <klaus.pittig@futura4retail.com>
wrote:

> If I increase the JVM max heap size (4GB), the behavior does not change.
> From my point of view, the configured memoryLimit (500 MB) works as
> expected (the heap dump shows the same max. size for the TextMessage
> content, i.e. 55002 byte[] instances containing 539 MB total).
>
> However, trying to browse a queue shows no content, even if there is
> enough heap memory available.
>
> As far as I understand the source code, this is also due to the
> configured memoryLimit, because (I hope this is the answer you expect)
> the calculation of available memory yields hasSpace = false.
>
> I found this here:
>
> AbstractPendingMessageCursor {
>     public boolean hasSpace() {
>         return systemUsage != null
>             ? !systemUsage.getMemoryUsage().isFull(memoryUsageHighWaterMark)
>             : true;
>     }
>
>     public boolean isFull() {
>         return systemUsage != null ? systemUsage.getMemoryUsage().isFull() : false;
>     }
> }
>
>
> In this case #hasSpace is called when clicking on a queue in the
> web console; see the two stacks captured during this workflow:
>
> Daemon Thread [Queue:aaa114] (Suspended (breakpoint at line 107 in
> QueueStorePrefetch))
> owns: QueueStorePrefetch (id=6036)
> owns: StoreQueueCursor (id=6037)
> owns: Object (id=6038)
> QueueStorePrefetch.doFillBatch() line: 107
> QueueStorePrefetch(AbstractStoreCursor).fillBatch() line: 381
> QueueStorePrefetch(AbstractStoreCursor).reset() line: 142
> StoreQueueCursor.reset() line: 159
> Queue.doPageInForDispatch(boolean, boolean) line: 1897
> Queue.pageInMessages(boolean) line: 2119
> Queue.iterate() line: 1596
> DedicatedTaskRunner.runTask() line: 112
> DedicatedTaskRunner$1.run() line: 42
>
> Daemon Thread [ActiveMQ VMTransport: vm://localhost#1] (Suspended
> (breakpoint at line 107 in QueueStorePrefetch))
> owns: QueueStorePrefetch (id=5974)
> owns: StoreQueueCursor (id=5975)
> owns: Object (id=5976)
> owns: Object (id=5977)
> QueueStorePrefetch.doFillBatch() line: 107
> QueueStorePrefetch(AbstractStoreCursor).fillBatch() line: 381
> QueueStorePrefetch(AbstractStoreCursor).reset() line: 142
> StoreQueueCursor.reset() line: 159
> Queue.doPageInForDispatch(boolean, boolean) line: 1897
> Queue.pageInMessages(boolean) line: 2119
> Queue.iterate() line: 1596
> Queue.wakeup() line: 1822
> Queue.addSubscription(ConnectionContext, Subscription) line: 491
> ManagedQueueRegion(AbstractRegion).addConsumer(ConnectionContext, ConsumerInfo) line: 399
> ManagedRegionBroker(RegionBroker).addConsumer(ConnectionContext, ConsumerInfo) line: 427
> ManagedRegionBroker.addConsumer(ConnectionContext, ConsumerInfo) line: 244
> AdvisoryBroker(BrokerFilter).addConsumer(ConnectionContext, ConsumerInfo) line: 102
> AdvisoryBroker.addConsumer(ConnectionContext, ConsumerInfo) line: 104
> CompositeDestinationBroker(BrokerFilter).addConsumer(ConnectionContext, ConsumerInfo) line: 102
> TransactionBroker(BrokerFilter).addConsumer(ConnectionContext, ConsumerInfo) line: 102
> StatisticsBroker(BrokerFilter).addConsumer(ConnectionContext, ConsumerInfo) line: 102
> BrokerService$5(MutableBrokerFilter).addConsumer(ConnectionContext, ConsumerInfo) line: 107
> TransportConnection.processAddConsumer(ConsumerInfo) line: 663
> ConsumerInfo.visit(CommandVisitor) line: 348
> TransportConnection.service(Command) line: 334
> TransportConnection$1.onCommand(Object) line: 188
> ResponseCorrelator.onCommand(Object) line: 116
> MutexTransport.onCommand(Object) line: 50
> VMTransport.iterate() line: 248
> DedicatedTaskRunner.runTask() line: 112
> DedicatedTaskRunner$1.run() line: 42
>
>
>
> Setting queueBrowsePrefetch="1" and queuePrefetch="1" in the
> PolicyEntry for queue=">" also has no effect.
>
>
> On 08.01.16 at 16:32, Tim Bain wrote:
> > If you increase your JVM size (4GB, 8GB, etc., the biggest your OS and
> > hardware will support), does the behavior change?  Does it truly take all
> > available memory, or just all the memory that you've made available to it
> > (which isn't tiny but really isn't all that big)?
> >
> > Also, how do you know that the
> > MessageCursor seems to decide that there is not enough memory and stops
> > delivery of queue content to browsers/consumers?  What symptom tells you
> > that?
> > On Jan 8, 2016 8:25 AM, "Klaus Pittig" <klaus.pittig@futura4retail.com>
> > wrote:
> >
> >> (related issue: https://issues.apache.org/jira/browse/AMQ-6115)
> >>
> >> There's a problem when using ActiveMQ with a large number of persistent
> >> queues (250), each holding 1000 persistent TextMessages of 10 KB each.
> >>
> >> Our scenario requires these messages to remain in the store for a
> >> long time (days) until they are consumed (large amounts of data are
> >> staged for distribution to many consumers that could be offline for
> >> some days).
> >>
> >>
> >> After the Persistence Store is filled with these Messages and after a
> >> broker restart we can browse/consume some Queues  _until_ the
> >> #checkpoint call after 30 seconds.
> >>
> >> This call causes the broker to use all available memory, and it never
> >> releases that memory for other tasks such as queue browse/consume.
> >> Internally, the MessageCursor seems to decide that there is not enough
> >> memory and stops delivery of queue content to browsers/consumers.
> >>
> >>
> >> => Is there a way to avoid this behaviour by configuration or is this a
> >> bug?
> >>
> >> The expectation is that we can consume/browse any queue under all
> >> circumstances.
> >>
> >> The settings below have been in production for some time now and apply
> >> several recommendations found in the ActiveMQ documentation
> >> (destination policies, systemUsage, persistence store options etc.)
> >>
> >>  - Behaviour is tested with ActiveMQ: 5.11.2, 5.13.0 and 5.5.1.
> >>  - Memory Settings: Xmx=1024m
> >>  - Java: 1.8 or 1.7
> >>  - OS: Windows, MacOS, Linux
> >>  - PersistenceAdapter: KahaDB or LevelDB
> >>  - Disc: enough free space (200 GB) and physical memory (16 GB max).
> >>
> >> Besides the above mentioned settings we use the following settings for
> >> the broker (btw: changing the memoryLimit to a lower value like 1mb does
> >> not change the situation):
> >>
> >>     <destinationPolicy>
> >>         <policyMap>
> >>             <policyEntries>
> >>                 <policyEntry queue=">" producerFlowControl="false"
> >>                              optimizedDispatch="true" memoryLimit="128mb"
> >>                              timeBeforeDispatchStarts="1000">
> >>                     <dispatchPolicy>
> >>                         <strictOrderDispatchPolicy />
> >>                     </dispatchPolicy>
> >>                     <pendingQueuePolicy>
> >>                         <storeCursor />
> >>                     </pendingQueuePolicy>
> >>                 </policyEntry>
> >>             </policyEntries>
> >>         </policyMap>
> >>     </destinationPolicy>
> >>     <systemUsage>
> >>         <systemUsage sendFailIfNoSpace="true">
> >>             <memoryUsage>
> >>                 <memoryUsage limit="50 mb" />
> >>             </memoryUsage>
> >>             <storeUsage>
> >>                 <storeUsage limit="80000 mb" />
> >>             </storeUsage>
> >>             <tempUsage>
> >>                 <tempUsage limit="1000 mb" />
> >>             </tempUsage>
> >>         </systemUsage>
> >>     </systemUsage>
> >>
> >> Setting **cursorMemoryHighWaterMark** in the destinationPolicy to a
> >> higher value like **150** or **600** (depending on the difference
> >> between memoryUsage and the available heap space) relieves the
> >> situation a bit as a workaround, but from my point of view this is not
> >> really an option for production systems.
> >>
> >> Screenshot with information from Oracle Mission Control showing those
> >> ActiveMQTextMessage instances that are never released from memory:
> >>
> >> http://goo.gl/EjEixV
> >>
> >>
> >> Cheers
> >> Klaus
> >>
> >
>
