activemq-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "James Green (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (AMQ-4141) Semi-stuck queue until server restart
Date Mon, 29 Oct 2012 16:50:12 GMT

    [ https://issues.apache.org/jira/browse/AMQ-4141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13486138#comment-13486138
] 

James Green commented on AMQ-4141:
----------------------------------

One of the dispatch servers is now queueing messages again without dequeueing to the hub.
A grep of the debug logs on the hub shows that bridging information for the hub's account_events
consumer is going to two of the production machines but not to the one where messages are
spooling. This is confirmed by switching off the consumer: on the "working" production servers
the consumer count for account_events drops from one to zero, but on the third production
server the consumer count remains at one (now incorrect).

Log on the incorrect production server shows (zorin is the faulty server, billing is an alias
for our hub):

2012-10-29 16:09:22,611 | WARN  | Caught an exception processing local command | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[zorin] Task-9560
java.net.SocketException: Connection reset
        at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:113)
        at java.net.SocketOutputStream.write(SocketOutputStream.java:153)
        at sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:314)
        at sun.security.ssl.OutputRecord.write(OutputRecord.java:303)
        at sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:763)
        at sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:751)
        at sun.security.ssl.AppOutputStream.write(AppOutputStream.java:78)
        at org.apache.activemq.transport.tcp.TcpBufferedOutputStream.flush(TcpBufferedOutputStream.java:115)
        at java.io.DataOutputStream.flush(DataOutputStream.java:123)
        at org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:184)
        at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:289)
        at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271)
        at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
        at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
        at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
        at org.apache.activemq.network.DemandForwardingBridgeSupport.serviceLocalCommand(DemandForwardingBridgeSupport.java:726)
        at org.apache.activemq.network.DemandForwardingBridgeSupport$1.onCommand(DemandForwardingBridgeSupport.java:167)
        at org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116)
        at org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
        at org.apache.activemq.transport.vm.VMTransport.doDispatch(VMTransport.java:138)
        at org.apache.activemq.transport.vm.VMTransport.dispatch(VMTransport.java:127)
        at org.apache.activemq.transport.vm.VMTransport.oneway(VMTransport.java:104)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
        at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
        at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312)
        at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838)
        at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873)
        at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
        at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
2012-10-29 16:09:22,611 | INFO  | Network connection between vm://zorin#132 and ssl://hub/x.x.x.x:61617
shutdown due to a local error: java.net.SocketException: Connection reset | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[zorin] Task-9560

Then eventually:
2012-10-29 16:10:23,764 | INFO  | Establishing network connection from vm://zorin?async=false&network=true
to ssl://hub:61617 | org.apache.activemq.network.DiscoveryNetworkConnector | ActiveMQ Task-31
2012-10-29 16:10:23,765 | INFO  | Connector vm://zorin Started | org.apache.activemq.broker.TransportConnector
| ActiveMQ Task-31
2012-10-29 16:10:24,002 | INFO  | Network connection between vm://zorin#146 and ssl://hub/x.x.x.x:61617(hub)
has been established. | org.apache.activemq.network.DemandForwardingBridgeSupport | StartLocalBridge:
localBroker=vm://zorin#146

Then, further down:
2012-10-29 16:15:18,417 | WARN  | Transport Connection to: tcp://127.0.0.1:44832 failed: java.net.SocketException:
Broken pipe | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport:
tcp:///127.0.0.1:44832@61612
2012-10-29 16:15:33,982 | INFO  | Network connection between vm://zorin#146 and ssl://hub/x.x.x.x:61617
shutdown due to a local error: org.apache.activemq.transport.InactivityIOException: Channel
was inactive for too (>30000) long: tcp://x.x.x.x:61617 | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ InactivityMonitor Worker
2012-10-29 16:15:33,983 | WARN  | failed to deliver remove command for local subscription,
for remote ID:billing-52716-1351526992068-2:12:-1:1 | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[zorin] Task-9720
org.apache.activemq.transport.TransportDisposedIOException: Peer (vm://zorin#147) disposed.
        at org.apache.activemq.transport.vm.VMTransport.oneway(VMTransport.java:89)
        at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
        at org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
        at org.apache.activemq.network.DemandForwardingBridgeSupport$8.run(DemandForwardingBridgeSupport.java:666)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
        at java.lang.Thread.run(Thread.java:636)
2012-10-29 16:15:33,983 | INFO  | Establishing network connection from vm://zorin?async=false&network=true
to ssl://hub:61617 | org.apache.activemq.network.DiscoveryNetworkConnector | ActiveMQ Task-33
2012-10-29 16:15:34,213 | INFO  | Network connection between vm://zorin#148 and ssl://hub/x.x.x.x:61617(hub)
has been established. | org.apache.activemq.network.DemandForwardingBridgeSupport | StartLocalBridge:
localBroker=vm://zorin#148
2012-10-29 16:15:35,983 | INFO  | zorin bridge to hub stopped | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[zorin] Task-9739
2012-10-29 16:18:44,191 | WARN  | Network connection between vm://zorin#148 and ssl://hub/x.x.x.x:61617
shutdown due to a remote error: org.apache.activemq.transport.InactivityIOException: Channel
was inactive for too (>30000) long: tcp://x.x.x.x:61617 | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ InactivityMonitor Worker
2012-10-29 16:18:44,191 | INFO  | Establishing network connection from vm://zorin?async=false&network=true
to ssl://hub:61617 | org.apache.activemq.network.DiscoveryNetworkConnector | ActiveMQ Task-34
2012-10-29 16:18:46,192 | INFO  | zorin bridge to hub stopped | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[zorin] Task-9720
2012-10-29 16:18:46,487 | INFO  | Network connection between vm://zorin#150 and ssl://hub/x.x.x.x:61617(hub)
has been established. | org.apache.activemq.network.DemandForwardingBridgeSupport | StartLocalBridge:
localBroker=vm://zorin#150

Then this appears:
2012-10-29 16:21:26,338 | WARN  | TopicSubscription: consumer=ID:quarrel-40451-1351260922652-79:2:1:1,
destinations=707, dispatched=1000, delivered=19, matched=1001, discarded=0: has twice its
prefetch limit pending, without an ack; it appears to be slow | org.apache.activemq.broker.region.TopicSubscription
| ActiveMQ Transport: tcp:///127.0.0.1:41331@61612

Following this repeated socket reset errors can be seen intermixed with other normal operations.

                
> Semi-stuck queue until server restart
> -------------------------------------
>
>                 Key: AMQ-4141
>                 URL: https://issues.apache.org/jira/browse/AMQ-4141
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 5.5.0, 5.7.0
>            Reporter: James Green
>            Priority: Critical
>
> We have a single Stomp consumer being invoked via a schedule with ten-second intervals
to consume at most 200 messages with a one second network timeout.
> The consumer subscribes to /queue/account_events on the localhost broker ("hub"). This
broker is connected to from three external brokers which have producers on them ("production
servers").
> What we see is a large built-up of messages on the remote brokers for this queue, with
a small number of messages pending on the hub. The messages are not persistent and tend to
be smallish (<1KB).
> The consumer is literally sat waiting for the next available message. Under "normal"
conditions it spins up and consumes 200 messages at a pace of a few msgs per second. Right
at this moment it goes dozens of seconds without a message then processes two-three inside
2 seconds.
> The hub's web console is showing 91 pending, 1 consumer. If I click to browse this queue,
there are no messages. The counter for 91 has been "stuck" for some time now.
> The hub has been set at debug level but there is only very recent logs recorded. A sample
of them grepping for "account_events" shows activity such as:
> 2012-10-29 15:25:50,016 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:25:50,016 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:25:50,016 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-110
> 2012-10-29 15:25:50,020 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:25:52,134 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:25:52,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:25:52,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-112
> 2012-10-29 15:25:52,134 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:20,016 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:20,016 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:26:20,016 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-110
> 2012-10-29 15:26:20,017 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:22,134 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:22,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:26:22,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-110
> 2012-10-29 15:26:22,134 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:35,687 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=hub,Type=Subscription,persistentMode=Non-Durable,destinationType=Queue,destinationName=account_events,clientId=account_events_receiver,consumerId=ID_billing-35190-1351511867558-2_1285_-1_1
| org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:///127.0.0.1:50560@61612
> 2012-10-29 15:26:35,687 | DEBUG | hub removing consumer: ID:billing-35190-1351511867558-2:1285:-1:1
for destination: queue://account_events | org.apache.activemq.broker.region.AbstractRegion
| ActiveMQ Transport: tcp:///127.0.0.1:50560@61612
> 2012-10-29 15:26:35,688 | DEBUG | queue://account_events remove sub: QueueSubscription:
consumer=ID:billing-35190-1351511867558-2:1285:-1:1, destinations=1, dispatched=0, delivered=0,
pending=0, lastDeliveredSeqId: 0, dequeues: 26950, dispatched: 26950, inflight: 0 | org.apache.activemq.broker.region.Queue
| ActiveMQ Transport: tcp:///127.0.0.1:50560@61612
> 2012-10-29 15:26:35,688 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=hub,Type=Producer,destinationType=Dynamic,clientId=account_events_receiver,producerId=ID_billing-35190-1351511867558-2_1285_-1_1
| org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:///127.0.0.1:50560@61612
> 2012-10-29 15:26:35,688 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=hub,Type=Connection,ConnectorName=stomp,Connection=account_events_receiver
| org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:///127.0.0.1:50560@61612
> 2012-10-29 15:26:35,688 | DEBUG | bridging (hub -> drax) ID:billing-35190-1351511867558-1:1:0:0:80197,
consumer: ID:billing-35190-1351511867558-9:1:1:1, destination topic://ActiveMQ.Advisory.Consumer.Queue.account_events,
brokerPath: [ID:billing-35190-1351511867558-0:1], message: ActiveMQMessage {commandId = 0,
responseRequired = false, messageId = ID:billing-35190-1351511867558-1:1:0:0:80197, originalDestination
= null, originalTransactionId = null, producerId = ID:billing-35190-1351511867558-9:2:1:1,
destination = topic://ActiveMQ.Advisory.Consumer.Queue.account_events, transactionId = null,
expiration = 0, timestamp = 0, arrival = 0, brokerInTime = 1351524395688, brokerOutTime =
1351524395688, correlationId = null, replyTo = null, persistent = false, type = Advisory,
priority = 0, groupID = null, groupSequence = 0, targetConsumerId = null, compressed = false,
userID = null, content = null, marshalledProperties = null, dataStructure = RemoveInfo {commandId
= 0, responseRequired = false, objectId = ID:billing-35190-1351511867558-2:1285:-1:1, lastDeliveredSequenceId
= 0}, redeliveryCounter = 0, size = 1024, properties = {consumerCount=0, originBrokerURL=tcp://0.0.0.0:61616,
originBrokerId=ID:billing-35190-1351511867558-0:1, originBrokerName=hub}, readOnlyProperties
= false, readOnlyBody = false, droppable = false} | org.apache.activemq.network.DemandForwardingBridgeSupport
| ActiveMQ BrokerService[hub] Task-110
> 2012-10-29 15:26:35,689 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-112
> 2012-10-29 15:26:36,847 | DEBUG | Setting up new connection id: ID:billing-35190-1351511867558-2:1304,
address: tcp://127.0.0.1:50601, info: ConnectionInfo {commandId = 0, responseRequired = true,
connectionId = ID:billing-35190-1351511867558-2:1304, clientId = account_events_receiver,
clientIp = null, userName = , password = *****, brokerPath = null, brokerMasterConnector =
false, manageable = false, clientMaster = true, faultTolerant = false, failoverReconnect =
false} | org.apache.activemq.broker.TransportConnection | ActiveMQ Transport: tcp:///127.0.0.1:50601@61612
> 2012-10-29 15:26:36,887 | DEBUG | hub adding consumer: ID:billing-35190-1351511867558-2:1304:-1:1
for destination: queue://account_events | org.apache.activemq.broker.region.AbstractRegion
| ActiveMQ Transport: tcp:///127.0.0.1:50601@61612
> 2012-10-29 15:26:36,888 | DEBUG | queue://account_events add sub: QueueSubscription:
consumer=ID:billing-35190-1351511867558-2:1304:-1:1, destinations=0, dispatched=0, delivered=0,
pending=0, dequeues: 26950, dispatched: 26950, inflight: 0 | org.apache.activemq.broker.region.Queue
| ActiveMQ Transport: tcp:///127.0.0.1:50601@61612
> 2012-10-29 15:26:36,888 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-112
> 2012-10-29 15:26:36,889 | DEBUG | bridging (hub -> drax) ID:billing-35190-1351511867558-1:1:0:0:80200,
consumer: ID:billing-35190-1351511867558-9:1:1:1, destination topic://ActiveMQ.Advisory.Consumer.Queue.account_events,
brokerPath: [ID:billing-35190-1351511867558-0:1], message: ActiveMQMessage {commandId = 0,
responseRequired = false, messageId = ID:billing-35190-1351511867558-1:1:0:0:80200, originalDestination
= null, originalTransactionId = null, producerId = ID:billing-35190-1351511867558-9:2:1:1,
destination = topic://ActiveMQ.Advisory.Consumer.Queue.account_events, transactionId = null,
expiration = 0, timestamp = 0, arrival = 0, brokerInTime = 1351524396889, brokerOutTime =
1351524396889, correlationId = null, replyTo = null, persistent = false, type = Advisory,
priority = 0, groupID = null, groupSequence = 0, targetConsumerId = null, compressed = false,
userID = null, content = null, marshalledProperties = null, dataStructure = ConsumerInfo {commandId
= 3, responseRequired = false, consumerId = ID:billing-35190-1351511867558-2:1304:-1:1, destination
= queue://account_events, prefetchSize = 200, maximumPendingMessageLimit = 0, browser = false,
dispatchAsync = true, selector = null, subscriptionName = null, noLocal = false, exclusive
= false, retroactive = false, priority = 0, brokerPath = null, optimizedAcknowledge = false,
noRangeAcks = false, additionalPredicate = null}, redeliveryCounter = 0, size = 1024, properties
= {consumerCount=1, originBrokerURL=tcp://0.0.0.0:61616, originBrokerId=ID:billing-35190-1351511867558-0:1,
originBrokerName=hub}, readOnlyProperties = false, readOnlyBody = false, droppable = false}
| org.apache.activemq.network.DemandForwardingBridgeSupport | ActiveMQ BrokerService[hub]
Task-112
> 2012-10-29 15:26:50,017 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:50,017 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:26:50,017 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-110
> 2012-10-29 15:26:50,017 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:52,134 | DEBUG | queue://account_events expiring messages .. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> 2012-10-29 15:26:52,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
Broker[hub] Scheduler
> 2012-10-29 15:26:52,134 | DEBUG | account_events toPageIn: 91, Inflight: 0, pagedInMessages.size
0, enqueueCount: 27041, dequeueCount: 26950 | org.apache.activemq.broker.region.Queue | ActiveMQ
BrokerService[hub] Task-110
> 2012-10-29 15:26:52,134 | DEBUG | queue://account_events expiring messages done. | org.apache.activemq.broker.region.Queue
| ActiveMQ Broker[hub] Scheduler
> Around the time (cannot be certain, was not looking) of "breakage" a 10MB message was
produced on this hub for consumption by each of the "production servers" via a topic. The
heap memory rose from 100-200MB (normal) to 300-450MB at this point before levelling off around
200MB. The number of threads actually halved before rising to normal. It is unknown if this
is a factor.
> Each "production server" correctly shows one consumer when it is connected from the hub.
> Continuing observations - restarting AMQ on the hub - a flood (7,000+) of messages have
now arrived according to the web console and the consumer is racing through them. The production
servers have cleared through their backlog having handed over to the hub.
> Some help isolating this with narrower debugging would be appreciated!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Mime
View raw message