activemq-users mailing list archives

From "art.licis" <>
Subject ActiveMQ 5.11.0: Out Of Memory, caused by PageFile object tree filling up heap
Date Thu, 19 Jul 2018 14:13:21 GMT
Dear community,

We've been evaluating AMQ, based on version 5.11.0. Several tests in
staging environments with a large number of producers and consumers
ended with "OutOfMemoryError: GC overhead limit exceeded".
However, it looks like a single slow consumer is causing this
memory problem (details below).

Initially we ran AMQ with 4 GB of max heap, later with 8 GB, but the
outcome was the same.

Configuration: we have intentionally turned producer flow control off, and
we expect the temp store to fill up before the producer is blocked
(non-persistent topic messages). I can simulate this with a simple,
controlled test involving a very fast producer, several fast consumers, and
some slow consumers. With manual testing we achieve the desired result: when
the temp store reaches its limit, the producer is blocked.
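
For reference, here is a minimal sketch of how the two settings described
above nest in a well-formed activemq.xml. It assumes the standard broker
schema nesting and is not a copy of our actual file. The enclosing <broker>
element and all other settings are omitted, and the values are the ones
quoted in this message.

```xml
<!-- Sketch only: these two elements are children of <broker>. -->

<!-- Producer flow control disabled for all topics -->
<destinationPolicy>
  <policyMap>
    <policyEntries>
      <policyEntry topic=">" producerFlowControl="false"/>
    </policyEntries>
  </policyMap>
</destinationPolicy>

<!-- Broker-wide limits: the temp store is expected to absorb
     non-persistent messages until tempUsage is reached -->
<systemUsage>
  <systemUsage>
    <memoryUsage>
      <memoryUsage limit="300 mb"/>
    </memoryUsage>
    <tempUsage>
      <tempUsage limit="10 gb"/>
    </tempUsage>
  </systemUsage>
</systemUsage>
```

With this layout, the expectation is that the producer only blocks once the
10 gb temp store limit is hit, not when the 300 mb memory limit is reached.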

We analyzed the GC output and the heap dump taken at the crash. Time spent
in GC increases significantly before the crash, and the heap dump shows
most of the memory occupied by the PageFile object tree (lots of
PageFile$PageWrite objects, which indicates that too much data remains in
memory, not yet committed to the temp store; also related
DataFileAppender$WriteBatch objects). This is the PageFile instance used by
PListStoreImpl (i.e. the temp store for non-persistent messages).

During my analysis I also found a bug that was fixed recently,
*AMQ-6815*. It appears to affect the exact class family that could have
caused the memory leak.

I'd be grateful for any help or suggestions. I'm currently preparing a
bigger test environment with many producers/consumers (I'm also going to
run ActiveMQ in Docker to limit I/O speed, although I'm sure I/O wasn't the
issue). The fact is that out of potentially thousands of
producers/consumers there was only one slow consumer. Eventually we will
try running the whole system against the latest ActiveMQ; however, that will
take a lot of arrangements, so first I'd like to see whether I can build a
reproducible, controlled test environment and whether there are pitfalls in
the current setup.

activemq.xml is below (the slow consumer was not the one listening to the
virtual topic). I will post more info if I'm able to solve the issue or
at least get more hints.

Thank you.

        <property name="properties">
            <bean class=""/>
    <broker xmlns=""
brokerName="${broker-name}" dataDirectory="${data}" start="false"
                <policyEntry topic=">" producerFlowControl="false">
                <policyEntry queue=">" producerFlowControl="false"
optimizedDispatch="true" >
            <managementContext createConnector="true" connectorPort="9001"/>
            <defaultIOExceptionHandler ignoreNoSpaceErrors="false"/>
            <kahaDB directory="${data}/kahadb"/>
                    <memoryUsage limit="300 mb"/>
                    <storeUsage limit="100 gb"/>
                    <tempUsage limit="10 gb"/>
            <transportConnector name="openwire"
                    <compositeQueue name="rpc.request.>"
physicalName="rpc.request.virtual.notifications" />
                    <compositeQueue name="rpc.response.>"
physicalName="rpc.response.virtual.notifications" />
