From dev-return-17469-apmail-activemq-dev-archive=activemq.apache.org@activemq.apache.org Tue Nov 03 09:52:23 2009 Return-Path: Delivered-To: apmail-activemq-dev-archive@www.apache.org Received: (qmail 71474 invoked from network); 3 Nov 2009 09:52:23 -0000 Received: from hermes.apache.org (HELO mail.apache.org) (140.211.11.3) by minotaur.apache.org with SMTP; 3 Nov 2009 09:52:23 -0000 Received: (qmail 22810 invoked by uid 500); 3 Nov 2009 09:52:23 -0000 Delivered-To: apmail-activemq-dev-archive@activemq.apache.org Received: (qmail 22741 invoked by uid 500); 3 Nov 2009 09:52:22 -0000 Mailing-List: contact dev-help@activemq.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: dev@activemq.apache.org Delivered-To: mailing list dev@activemq.apache.org Received: (qmail 22731 invoked by uid 99); 3 Nov 2009 09:52:22 -0000 Received: from athena.apache.org (HELO athena.apache.org) (140.211.11.136) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Nov 2009 09:52:22 +0000 X-ASF-Spam-Status: No, hits=-2.6 required=5.0 tests=AWL,BAYES_00 X-Spam-Check-By: apache.org Received-SPF: pass (athena.apache.org: domain of gary.tully@gmail.com designates 209.85.218.215 as permitted sender) Received: from [209.85.218.215] (HELO mail-bw0-f215.google.com) (209.85.218.215) by apache.org (qpsmtpd/0.29) with ESMTP; Tue, 03 Nov 2009 09:52:19 +0000 Received: by bwz7 with SMTP id 7so7976953bwz.6 for ; Tue, 03 Nov 2009 01:51:58 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=gamma; h=domainkey-signature:mime-version:received:in-reply-to:references :date:message-id:subject:from:to:content-type :content-transfer-encoding; bh=bIUMVTVV4XcbhHwxEEkbfLgtSy8L0sqQmWHuOrvScNk=; b=qPwo0Cfa+npJRfdECC7pAyCGjUZsC3iYBRTS7EUMI0F8XFtQ6RrtjbXXk4+3Y0FECe v93a4V7m6c19IMbs43jRpfGPRgaFJ+cWcwuX4qfWa6x5FulOC6A94cx+DMSUaIRcZgrl 6JwFFoCGOgUKuNtZCqI+boQ60xp9YGC5fjeS8= DomainKey-Signature: a=rsa-sha1; c=nofws; d=gmail.com; s=gamma; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :content-type:content-transfer-encoding; b=YNdiQ+gihQv8rDI3X4VVIk7Y5Qo0HbrBUQHmOdAQESPSAMr8FKOWFflmfE4qKA1nYn IMRXGeaflXhXgn7x5S4sAiFUyP0O8YL09l1uZHaOMRUtqyGAvklkfxRwirY2Nj7BpLzm xqjA00mk2WwlSjs7mD1cI46T584ogBQMWc/+I= MIME-Version: 1.0 Received: by 10.204.15.22 with SMTP id i22mr4820084bka.13.1257241917445; Tue, 03 Nov 2009 01:51:57 -0800 (PST) In-Reply-To: <4AEFDF9D.8040007@tid.es> References: <4AD84B1E.9090701@tid.es> <4AEFDF9D.8040007@tid.es> Date: Tue, 3 Nov 2009 09:51:55 +0000 Message-ID: <3a73c17c0911030151y1ef82d36u2ba5b0a3e48bd881@mail.gmail.com> Subject: Re: Broker get stuck with high amount of persistent messages for a given destination From: Gary Tully To: dev@activemq.apache.org Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: quoted-printable can you verify that you see the problem with the latest snapshot: https://repository.apache.org/content/repositories/snapshots/org/apache/act= ivemq/apache-activemq/5.4-SNAPSHOT/ 2009/11/3 Manuel Teira : > Hello again. > Let me insist on this problem, since it's hitting our production environm= ent > every other day. Anytime that, for whatever reason, one of the system que= ues > goes over ~60.000 pending persistent messages, we are not able to restart= it > properly, suffering the problems I've tried to detail in the former messa= ge. > Could you be so kind to take a look on it? If you need any further > information, please, make me know. > > Best regards. > > > Manuel Teira escribi=F3: >> >> Hello. >> >> We are embedding an activemq 4.2 broker in our application, running on a >> Sun JVM 5, and using an Oracle 10g database as persistent messaging stor= e >> (without journal, since we are using a JDBC master-slave cluster). >> In a pair of occasions, we were unable to get the broker started in our >> production environment. Analyzing the situation, it seems that the cause= was >> the great amount of persistent messages for a given queue (over 75.000), >> what was blocking the attempt to create consumers on that destination. >> >> Trying to reproduce the problem with a more up to date activemq version, >> we set up a 5.2.0 standalone broker to resemble our scenario, and a clie= nt >> that tries to create a set of consumers on the loaded queue. We exported= the >> table with messages from our production environment and filled the test >> scenario database with it. I'm attaching both the activemq configuration= and >> the client used for testing. >> >> What we found was: >> >> The broker started normally. Since no attempt to consume messages from t= he >> "loaded" destination was performed. >> Once the client started, we observed that: >> =A0- We tried to start 10 consumers on the queue. =A0Only one of those >> consumers got the session.createConsumer(), others were blocked (note th= at >> all of them are sharing the same connection) >> =A0- The only started consumer was unable to get any message, it was blo= cked >> in the getMessage() attempt. >> =A0- The broker seems to be trying to load all the messages from the >> database, showing the following =A0stack dump (just the involved threads= ): >> >> "ActiveMQ Transport: tcp:///127.0.0.1:33554" daemon prio=3D4 tid=3D0x00a= c2ef8 >> nid=3D0x3c waiting on condition [0xaae7e000..0xaae7fbf0] >> =A0 =A0 =A0 =A0at sun.misc.Unsafe.park(Native Method) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.LockSupport.park(LockSupport.java:118) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterr= upt(AbstractQueuedSynchronizer.java:681) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(Abst= ractQueuedSynchronizer.java:711) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQu= euedSynchronizer.java:1041) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.= java:184) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:256) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.Queue.addSubscription(Queue.java:217) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.AbstractRegion.addConsumer(AbstractReg= ion.java:275) >> =A0 =A0 =A0 =A0- locked <0xe35bbb40> (a java.lang.Object) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.RegionBroker.addConsumer(RegionBroker.= java:372) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:86= ) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:86= ) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.advisory.AdvisoryBroker.addConsumer(AdvisoryBroker.j= ava:83) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.BrokerFilter.addConsumer(BrokerFilter.java:86= ) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.MutableBrokerFilter.addConsumer(MutableBroker= Filter.java:93) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.TransportConnection.processAddConsumer(Transp= ortConnection.java:541) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.command.ConsumerInfo.visit(ConsumerInfo.java:345) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.TransportConnection.service(TransportConnecti= on.java:305) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.TransportConnection$1.onCommand(TransportConn= ection.java:179) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.TransportFilter.onCommand(TransportFilter.= java:68) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatN= egotiator.java:143) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.InactivityMonitor.onCommand(InactivityMoni= tor.java:206) >> =A0 =A0 =A0 =A0- locked <0xb8a0d5d0> (a >> org.apache.activemq.transport.InactivityMonitor$1) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.TransportSupport.doConsume(TransportSuppor= t.java:84) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:2= 03) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:185= ) >> =A0 =A0 =A0 =A0at java.lang.Thread.run(Thread.java:595) >> >> "QueueThread:queue://TaskManagerQueue" daemon prio=3D10 tid=3D0x005e1bd8 >> nid=3D0x42 runnable [0xaac7e000..0xaac7faf0] >> =A0 =A0 =A0 =A0at oracle.jdbc.driver.T2CStatement.t2cDefineFetch(Native = Method) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.T2CPreparedStatement.doDefineFetch(T2CPreparedStateme= nt.java:827) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.T2CPreparedStatement.executeForRows(T2CPreparedStatem= ent.java:768) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.OracleStatement.executeMaybeDescribe(OracleStatement.= java:1062) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.OracleStatement.doExecuteWithTimeout(OracleStatement.= java:1132) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.OraclePreparedStatement.executeInternal(OraclePrepare= dStatement.java:3285) >> =A0 =A0 =A0 =A0at >> oracle.jdbc.driver.OraclePreparedStatement.executeQuery(OraclePreparedSt= atement.java:3329) >> =A0 =A0 =A0 =A0- locked <0xe37a09a0> (a oracle.jdbc.driver.T2CPreparedSt= atement) >> =A0 =A0 =A0 =A0- locked <0xe379c398> (a oracle.jdbc.driver.T2CConnection= ) >> =A0 =A0 =A0 =A0at >> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(Delegat= ingPreparedStatement.java:91) >> =A0 =A0 =A0 =A0at >> org.apache.commons.dbcp.DelegatingPreparedStatement.executeQuery(Delegat= ingPreparedStatement.java:91) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.store.jdbc.adapter.DefaultJDBCAdapter.doRecoverNextM= essages(DefaultJDBCAdapter.java:709) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.store.jdbc.JDBCMessageStore.recoverNextMessages(JDBC= MessageStore.java:230) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.store.ProxyMessageStore.recoverNextMessages(ProxyMes= sageStore.java:83) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.cursors.QueueStorePrefetch.doFillBatch= (QueueStorePrefetch.java:75) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.cursors.AbstractStoreCursor.fillBatch(= AbstractStoreCursor.java:227) >> =A0 =A0 =A0 =A0- locked <0xb8a12f30> (a >> org.apache.activemq.broker.region.cursors.QueueStorePrefetch) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.cursors.AbstractStoreCursor.reset(Abst= ractStoreCursor.java:100) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.cursors.StoreQueueCursor.reset(StoreQu= eueCursor.java:157) >> =A0 =A0 =A0 =A0- locked <0xb8a13648> (a >> org.apache.activemq.broker.region.cursors.StoreQueueCursor) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.Queue.doPageIn(Queue.java:1179) >> =A0 =A0 =A0 =A0- locked <0xb8a13648> (a >> org.apache.activemq.broker.region.cursors.StoreQueueCursor) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.broker.region.Queue.pageInMessages(Queue.java:1308) >> =A0 =A0 =A0 =A0at org.apache.activemq.broker.region.Queue.iterate(Queue.= java:1011) >> =A0 =A0 =A0 =A0- locked <0xb8a13748> (a org.apache.activemq.broker.regio= n.Queue$2) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.thread.DeterministicTaskRunner.runTask(Deterministic= TaskRunner.java:84) >> =A0 =A0 =A0 =A0at >> org.apache.activemq.thread.DeterministicTaskRunner$1.run(DeterministicTa= skRunner.java:41) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecuto= r.java:650) >> =A0 =A0 =A0 =A0at >> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.ja= va:675) >> =A0 =A0 =A0 =A0at java.lang.Thread.run(Thread.java:595) >> >> We waited for 9 or 10 hours without apparent changes: Only one consumer >> got, the other threads waiting, and no message consumed. >> In the production environment, the only way to recover the situation we >> found, was to move message from ACTIVEMQ_MSGS to another table, start th= e >> broker with ACTIVEMQ_MSGS empty, and create an application to deserializ= e >> messages from the table and inject them to the queue using the JMS API. >> >> We consider that over 10 hours of service unavailability is a serious >> problem. The problem seems to be related with the broker trying to get a= ll >> the messages from the table at once whenever a consumer for a given >> destination is created. Is this a known problem? Is there any way to imp= rove >> the situation? >> >> Best regards. >> >> >> -- >> Manuel. >> >> > > --=20 http://blog.garytully.com Open Source Integration http://fusesource.com