activemq-users mailing list archives

From Mark Schmitt | Intratop
Subject Re: ActiveMQ 5.10.0 queue slowed down, restart helped
Date Tue, 17 Feb 2015 13:06:46 GMT

I work with Piotr on this issue. Let me try to provide some additional
information on our slow-down issue:

Storage is a PostgreSQL 9.3.2 server on Debian Wheezy (kernel 3.2.51-1).

We use JDBC and the PGPoolingDataSource.

This is the persistenceAdapter configuration:
    <jdbcPersistenceAdapter dataDirectory="activemq-data"
                            dataSource="#postgres-ds"
                            lockKeepAlivePeriod="0"
                            createTablesOnStartup="false" />

We have two destination interceptors set up. We also run the demo code
(jetty-demo) because some of our applications use the HTTP/REST
interface it provides. We don't run Camel.

Other than that it's a pretty mundane setup. We also run two instances
at the same time as a sort of fail-over. Because of the JDBC backend,
only one of them is active, and we use the failover protocol on the
client side to reach the active one. We use haproxy to serve the web
interface from the active instance. Both ActiveMQ instances run on the
same Linux box, with different service IP addresses (they use the same
binaries; only the configuration and data directory are separate). The
reason we run two instances is that we had big stability issues before,
with the ActiveMQ process more or less hanging itself up. We could move
away from that setup, because with 5.10 this hasn't happened.
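
For readers unfamiliar with the failover protocol mentioned above, here
is a minimal sketch of what such a client-side broker URI looks like.
The host names and port are placeholders, not our real service
addresses, and FailoverUri is a hypothetical helper that only builds the
string. The randomize=false option is an assumption for illustration; it
makes clients try the brokers in the listed order.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds an ActiveMQ failover transport URI from a
// list of broker URIs. It only assembles the string; it opens no connection.
final class FailoverUri {

    static String build(List<String> brokerUris, boolean randomize) {
        // Syntax of the failover transport: failover:(uri1,uri2,...)?options
        return "failover:(" + String.join(",", brokerUris) + ")?randomize=" + randomize;
    }

    public static void main(String[] args) {
        // Placeholder addresses for the two instances (assumed, not real).
        List<String> brokers = Arrays.asList(
                "tcp://amq-a.example:61616",
                "tcp://amq-b.example:61616");
        System.out.println(build(brokers, false));
    }
}
```

The resulting string would be passed to the client's connection factory
as the broker URL. With a shared JDBC store only the instance holding
the database lock accepts connections, so the client ends up on the
active one.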

Like the database server, the linux box that runs the activemq instance 
is a Debian Wheezy Linux, but with Kernel 3.2.60-1+deb7u1.

Problem description: Once in a while we see 100% CPU load on the
database. We can trace it to SQL statements of this form:

MSGID_PROD='ID:tomcat10-XXX-41356-1422538681150-1:95156:1:1' AND 
MSGID_SEQ='1' AND  CONTAINER='queue://XXX_export'

These SQL statements take more than 500 ms each, and we've had cases
where they took more than 3 seconds to complete. When they took 500 ms,
the total queue size was about 1200 messages across all queues
(concentrated in one queue), with about 2-3 messages produced per second
and about 2 messages consumed per second. IMHO the query time scales
linearly with the queue size.
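
A rough sketch of what that linear scaling would imply, using only the
figures above (about 1200 messages at about 500 ms). This is an
extrapolation, not a measurement: the queue size in the 3-second cases
was not recorded, and QueryTimeEstimate and its method names are
hypothetical.

```java
// Back-of-the-envelope arithmetic for the linear-scaling assumption.
final class QueryTimeEstimate {

    // Single observed data point from the report (both values approximate).
    static final double OBSERVED_QUEUE_SIZE = 1200.0; // messages
    static final double OBSERVED_QUERY_MS = 500.0;    // milliseconds

    // Query time expected at a given queue size, if scaling is linear.
    static double estimateQueryMs(double queueSize) {
        return queueSize * OBSERVED_QUERY_MS / OBSERVED_QUEUE_SIZE;
    }

    // Queue size that would correspond to a given query time.
    static double queueSizeFor(double queryMs) {
        return queryMs * OBSERVED_QUEUE_SIZE / OBSERVED_QUERY_MS;
    }

    // Net backlog growth per hour for given produce/consume rates (msg/s).
    static double backlogGrowthPerHour(double producedPerSec, double consumedPerSec) {
        return (producedPerSec - consumedPerSec) * 3600.0;
    }

    public static void main(String[] args) {
        System.out.println("Queue size implied by 3000 ms: " + queueSizeFor(3000.0));
        System.out.println("Query time at 2400 messages:   " + estimateQueryMs(2400.0));
        System.out.println("Backlog growth/h at 3 vs 2:    " + backlogGrowthPerHour(3.0, 2.0));
    }
}
```

Under these assumptions a 3-second statement would correspond to roughly
7200 queued messages, and producing 3 messages per second against 2
consumed would add about 3600 messages per hour. Slower statements
reduce the consumption rate further, so the backlog would grow faster
once it starts.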

We were able to "resolve" the issue by restarting both ActiveMQ
instances. After that, the load on the database drops dramatically:
instead of 100% CPU usage we see less than 10% on the database and a
very fast recovery. The ActiveMQ processes look fine too.

My first guess was a missing database index, but the indexes look fine.
Besides, restarting the ActiveMQ instances resolves the issue, which
is very weird to me. I don't think it's a database lock either,
because we couldn't see any and, additionally, we see 100% CPU usage for
the process executing the statement (postgres spawns a process per
statement). IMHO (but I'm no database expert) that should not happen
in a lock situation either...

We're at a loss. Do you guys have an idea?

And one more thing: Once every two or three hours a lot of (several
thousand) messages are created. But the problem described above occurs
irregularly, roughly every one or two weeks.

Best regards,
