activemq-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Tim Bain <tb...@alumni.duke.edu>
Subject Re: ActiveMQ embedded broker 100% cpu usage
Date Sun, 11 Jan 2015 13:53:00 GMT
If you haven't configured the broker to write a GC log, you can find out if
you're doing full GCs by either attaching JConsole and looking at the
Memory tab or by using the jstat -gc command.  Both of those are only
options if you haven't killed the process, so they probably won't help you
now.

For what it's worth, I don't think Kevin's right about this being caused by
continual full GCs.  It's quite possible that a single full GC was the root
cause of the disconnection that starts it all, but if the broker was doing
continual full GCs like Kevin guessed, I don't think it would have
successfully reconnected nor that you'd have seen all the log lines about
duplicate messages.  I think it's more likely that there's another bug
related to how master/slave pairs handle network disconnection and
reconnection.

The log lines you posted look like they're from either the master or the
slave that gets disconnected; which is it, and is there anything useful in
the other's log?
On Jan 10, 2015 3:35 PM, "Kevin Burton" <burton@spinn3r.com> wrote:

> $100 says it’s spending all its time doing full GC. Look at your GC log..
> I bet it will keep printing “Full GC” constantly.
>
> Kevin
>
> On Sat, Jan 10, 2015 at 7:19 AM, Zoltán Nyári <nyari.zoltan@smartfront.hu>
> wrote:
>
> > Hi,
> >
> > We have an ActiveMQ multibroker network with 4 nodes. There is 1 master
> > node and 3 slave nodes,
> > every message is broadcasted between all nodes using persistent topics
> and
> > durable subscriptions.
> > ActiveMQ (version 5.9.1) is embedded into a Spring based Java Web
> > Application running on Tomcat 7.
> > It works well mostly, but today we had to face a serious issue for the
> > second time.
> >
> > The connection between the master node and a slave node is lost:
> >
> > WARN   2015.01.10 14:00:29,281
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Caught an
> > exception processing local command
> > java.net.SocketException: Connection reset
> >         at
> > java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:118)
> >         at java.net.SocketOutputStream.write(SocketOutputStream.java:159)
> >         at
> >
> >
> org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
> >         at java.io.DataOutputStream.flush(DataOutputStream.java:123)
> >         at
> >
> >
> org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:176)
> >         at
> >
> >
> org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:304)
> >         at
> >
> >
> org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:286)
> >         at
> >
> >
> org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
> >         at
> >
> >
> org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
> >         at
> >
> org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
> >         at
> >
> >
> org.apache.activemq.transport.ResponseCorrelator.asyncRequest(ResponseCorrelator.java:81)
> >         at
> >
> >
> org.apache.activemq.network.DemandForwardingBridgeSupport.serviceLocalCommand(DemandForwardingBridgeSupport.java:968)
> >         at
> >
> >
> org.apache.activemq.network.DemandForwardingBridgeSupport$2.onCommand(DemandForwardingBridgeSupport.java:177)
> >         at
> >
> >
> org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116)
> >         at
> >
> >
> org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
> >         at
> >
> org.apache.activemq.transport.vm.VMTransport.iterate(VMTransport.java:247)
> >         at
> >
> >
> org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:133)
> >         at
> >
> org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:48)
> >         at
> >
> >
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
> >         at
> >
> >
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> >         at java.lang.Thread.run(Thread.java:745)
> > INFO   2015.01.10 14:00:29,335
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > INFO   2015.01.10 14:00:29,346
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.net.SocketException: Connection reset
> > INFO   2015.01.10 14:00:29,346
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > INFO   2015.01.10 14:00:29,346
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > INFO   2015.01.10 14:00:29,346
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > INFO   2015.01.10 14:00:29,347
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > WARN   2015.01.10 14:00:29,346
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Caught an
> > exception processing local command
> > java.io.EOFException
> >         at java.io.DataInputStream.readFully(DataInputStream.java:197)
> >         at java.io.DataInputStream.readFully(DataInputStream.java:169)
> >         at
> >
> >
> org.apache.activemq.openwire.v10.BaseDataStreamMarshaller.tightUnmarshalByteSequence(BaseDataStreamMarshaller.java:440)
> >         at
> >
> >
> org.apache.activemq.openwire.v10.MessageMarshaller.tightUnmarshal(MessageMarshaller.java:72)
> >         at
> >
> >
> org.apache.activemq.openwire.v10.ActiveMQMessageMarshaller.tightUnmarshal(ActiveMQMessageMarshaller.java:66)
> >         at
> >
> >
> org.apache.activemq.openwire.v10.ActiveMQTextMessageMarshaller.tightUnmarshal(ActiveMQTextMessageMarshaller.java:66)
> >         at
> >
> >
> org.apache.activemq.openwire.OpenWireFormat.doUnmarshal(OpenWireFormat.java:371)
> >         at
> >
> >
> org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:285)
> >         at
> >
> >
> org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:221)
> >         at
> >
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:213)
> >         at
> > org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:196)
> >         at java.lang.Thread.run(Thread.java:745)
> > INFO   2015.01.10 14:00:29,347
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#30 and tcp:///10.10.0.58:40819
> > @62000
> > shutdown due to a local error: java.io.EOFException
> > INFO   2015.01.10 14:00:29,383
> > org.apache.activemq.network.DemandForwardingBridgeSupport -
> > PERCMASTERBroker bridge to BALASSIBroker stopped
> >
> >
> > After a few seconds the connection has been reestablished automatically:
> >
> > INFO   2015.01.10 14:00:45,563
> > org.apache.activemq.broker.TransportConnection - Started responder end of
> > duplex bridge clientConnector@ID:balaserver1-45698-1418680167169-1:1
> > INFO   2015.01.10 14:00:45,570
> > org.apache.activemq.network.DemandForwardingBridgeSupport - Network
> > connection between vm://PERCMASTERBroker#42 and tcp:///10.10.0.58:54625
> > @62000
> > (BALASSIBroker) has been established.
> >
> > But after then the cpu usage goes up to 100% (effectively killed the
> > application) and the log full of duplicate message errors:
> >
> > WARN   2015.01.10 14:00:46,599
> > org.apache.activemq.broker.region.cursors.AbstractStoreCursor -
> > org.apache.activemq.broker.region.cursors.QueueStorePrefetch@4ed7430a
> >
> :Consumer.PERCMASTER.VirtualTopic.SFOperationShare,batchResetNeeded=false,storeHasMessages=true,size=0,cacheEnabled=true,maxBatchSize:1,hasSpace:true
> > - cursor got duplicate:
> ID:balaserver1-45698-1418680167169-6:1:2:1:36636, 4
> > WARN   2015.01.10 14:00:46,639 org.apache.activemq.broker.region.Queue -
> > duplicate message from store
> > ID:balaserver1-45698-1418680167169-6:1:2:1:36636, redirecting for dlq
> > processing
> > WARN   2015.01.10 14:00:48,526
> > org.apache.activemq.broker.region.cursors.AbstractStoreCursor -
> > org.apache.activemq.broker.region.cursors.QueueStorePrefetch@4ed7430a
> >
> :Consumer.PERCMASTER.VirtualTopic.SFOperationShare,batchResetNeeded=false,storeHasMessages=true,size=1,cacheEnabled=true,maxBatchSize:2,hasSpace:true
> > - cursor got duplicate:
> ID:balaserver1-45698-1418680167169-6:1:2:1:36637, 4
> > WARN   2015.01.10 14:00:48,528 org.apache.activemq.broker.region.Queue -
> > duplicate message from store
> > ID:balaserver1-45698-1418680167169-6:1:2:1:36637, redirecting for dlq
> > processing
> > WARN   2015.01.10 14:00:48,546
> > org.apache.activemq.broker.region.cursors.AbstractStoreCursor -
> > org.apache.activemq.broker.region.cursors.QueueStorePrefetch@4ed7430a
> >
> :Consumer.PERCMASTER.VirtualTopic.SFOperationShare,batchResetNeeded=false,storeHasMessages=true,size=1,cacheEnabled=true,maxBatchSize:3,hasSpace:true
> > - cursor got duplicate:
> ID:balaserver1-45698-1418680167169-6:1:3:1:37026, 4
> > WARN   2015.01.10 14:00:48,547 org.apache.activemq.broker.region.Queue -
> > duplicate message from store
> > ID:balaserver1-45698-1418680167169-6:1:3:1:37026, redirecting for dlq
> > processing
> > WARN   2015.01.10 14:00:48,560
> > org.apache.activemq.broker.region.cursors.AbstractStoreCursor -
> > org.apache.activemq.broker.region.cursors.QueueStorePrefetch@4ed7430a
> >
> :Consumer.PERCMASTER.VirtualTopic.SFOperationShare,batchResetNeeded=false,storeHasMessages=true,size=1,cacheEnabled=true,maxBatchSize:2,hasSpace:true
> > - cursor got duplicate:
> ID:balaserver1-45698-1418680167169-6:1:2:1:36638, 4
> >
> > The only thing we could do is stop the application (on all nodes), clear
> > the activemq-data folder and restart the application.
> > When we simply tried to restart it, the startup process has stucked with
> no
> > error message, just using 100% of the CPU.
> >
> > Do you have any idea what could have happened here?
> >
> > Thanks,
> >
> > Zoltan Nyari
> >
>
>
>
> --
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> <https://plus.google.com/102718274791889610666/posts>
> <http://spinn3r.com>
>

Mime
  • Unnamed multipart/alternative (inline, None, 0 bytes)
View raw message