activemq-users mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Gary Tully <gary.tu...@gmail.com>
Subject Re: Network bridge is stuck, repeatedly requires broker restart
Date Thu, 01 Nov 2012 11:55:02 GMT
actually soWriteTimeout does not help with ssl, because the ssl stream
has an exclusive lock that blocks any call to close.

you will have to deal with the blocked write at the OS level via the
tcp stack timeout configuration.

But the real root cause is understanding why the connection drops in
the first place.

Do you see the thread blocked on a write eventually timing out when
the OS detects the half closed socket? It should do once a remote
close has been initiated.

Or is the problem that the hub/spoke does not really close the
connection when it sees a problem?

It is odd that it is a 24hr thing,  what happens at that time.

On 1 November 2012 11:27, James Green <james.mk.green@gmail.com> wrote:
> Having implemented queue and destination filtering between spokes and the
> hub, the connections lasted just over 24 hours.
>
> It's just died. Same traces on the hub as before. On the spoke with which
> comms have died:
>
> Name: ActiveMQ Transport: ssl://hub/x.x.x.x:61617
> State: WAITING on
> java.util.concurrent.locks.ReentrantLock$NonfairSync@4864ae35 owned by:
> ActiveMQ BrokerService[zorin] Task-3350
> Total blocked: 68  Total waited: 1,862
>
> Stack trace:
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:838)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:871)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1201)
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:214)
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:290)
> org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:66)
> org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
> org.apache.activemq.network.DemandForwardingBridgeSupport.serviceRemoteCommand(DemandForwardingBridgeSupport.java:467)
> org.apache.activemq.network.DemandForwardingBridgeSupport$2.onCommand(DemandForwardingBridgeSupport.java:179)
> org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116)
> org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
> org.apache.activemq.transport.WireFormatNegotiator.onCommand(WireFormatNegotiator.java:113)
> org.apache.activemq.transport.AbstractInactivityMonitor.onCommand(AbstractInactivityMonitor.java:256)
> org.apache.activemq.transport.TransportSupport.doConsume(TransportSupport.java:83)
> org.apache.activemq.transport.tcp.SslTransport.doConsume(SslTransport.java:91)
> org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:222)
> org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204)
> java.lang.Thread.run(Thread.java:636)
>
> Name: ActiveMQ BrokerService[zorin] Task-3350
> State: RUNNABLE
> Total blocked: 0  Total waited: 668
>
> Stack trace:
> java.net.SocketOutputStream.socketWrite0(Native Method)
> java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:109)
> java.net.SocketOutputStream.write(SocketOutputStream.java:153)
> sun.security.ssl.OutputRecord.writeBuffer(OutputRecord.java:314)
> sun.security.ssl.OutputRecord.write(OutputRecord.java:303)
> sun.security.ssl.SSLSocketImpl.writeRecordInternal(SSLSocketImpl.java:763)
> sun.security.ssl.SSLSocketImpl.writeRecord(SSLSocketImpl.java:751)
> sun.security.ssl.AppOutputStream.write(AppOutputStream.java:78)
>    - locked sun.security.ssl.AppOutputStream@3109e59b
> org.apache.activemq.transport.tcp.TcpBufferedOutputStream.write(TcpBufferedOutputStream.java:96)
> java.io.DataOutputStream.write(DataOutputStream.java:107)
>    - locked java.io.DataOutputStream@c13c44
> org.apache.activemq.openwire.v9.BaseDataStreamMarshaller.tightMarshalByteSequence2(BaseDataStreamMarshaller.java:431)
> org.apache.activemq.openwire.v9.MessageMarshaller.tightMarshal2(MessageMarshaller.java:180)
> org.apache.activemq.openwire.v9.ActiveMQMessageMarshaller.tightMarshal2(ActiveMQMessageMarshaller.java:89)
> org.apache.activemq.openwire.v9.ActiveMQTextMessageMarshaller.tightMarshal2(ActiveMQTextMessageMarshaller.java:89)
> org.apache.activemq.openwire.OpenWireFormat.marshal(OpenWireFormat.java:243)
>    - locked org.apache.activemq.openwire.OpenWireFormat@e5b94bf
> org.apache.activemq.transport.tcp.TcpTransport.oneway(TcpTransport.java:183)
> org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:289)
> org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271)
> org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85)
> org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104)
> org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
> org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
> org.apache.activemq.network.DemandForwardingBridgeSupport.serviceLocalCommand(DemandForwardingBridgeSupport.java:726)
> org.apache.activemq.network.DemandForwardingBridgeSupport$1.onCommand(DemandForwardingBridgeSupport.java:167)
> org.apache.activemq.transport.ResponseCorrelator.onCommand(ResponseCorrelator.java:116)
> org.apache.activemq.transport.MutexTransport.onCommand(MutexTransport.java:50)
> org.apache.activemq.transport.vm.VMTransport.doDispatch(VMTransport.java:138)
> org.apache.activemq.transport.vm.VMTransport.dispatch(VMTransport.java:127)
>    - locked java.util.concurrent.atomic.AtomicBoolean@1e828ef6
> org.apache.activemq.transport.vm.VMTransport.oneway(VMTransport.java:104)
> org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68)
> org.apache.activemq.transport.ResponseCorrelator.oneway(ResponseCorrelator.java:60)
> org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312)
> org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838)
> org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873)
> org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129)
> org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47)
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
> java.lang.Thread.run(Thread.java:636)
>
> Is this something that should be reported in JIRA? And should we be
> changing the writeTimeout you speak of? If so, to what?
>
> Thanks,
>
> James
>
> On 31 October 2012 16:57, Gary Tully <gary.tully@gmail.com> wrote:
>
>> from a quick look at one of those stack traces, it looks like a thread
>> is blocked on a socket write, it may be that the socket is half closed
>> and that the write needs to timeout at the tcp level. that can take
>> some time.
>>
>> There is a writeTimeout filter that can help at the activemq level.
>>
>> Have a peek at https://issues.apache.org/jira/browse/AMQ-1993
>>
>>
>> On 30 October 2012 21:31, James Green <james.mk.green@gmail.com> wrote:
>> > This is happening repeatedly causing memory consumption to balloon
>> locally
>> > and message queues to build remotely.
>> >
>> > Googling around isn't revealing much. Can anyone take a look and tell me
>> if
>> > this stuff is normal or not? Need to narrow down the investigations.
>> >
>> > Thanks,
>> >
>> > James
>> >
>> > On 30 October 2012 11:32, James Green <james.mk.green@gmail.com> wrote:
>> >
>> >> Hub with three spokes. The network between the hub and one of these
>> spokes
>> >> seems to be dying irregularly yet frequently.
>> >>
>> >> I managed to get a thread dump of the hub's ssl connector to this spoke
>> >> along with the thread it was waiting on:
>> >>
>> >> http://pastebin.com/ZjeSxDK7
>> >>
>> >> I also captched a full thread dump on the hub:
>> >>
>> >> http://pastebin.com/Qu113JhQ
>> >>
>> >> On the spoke the thread dump button was disabled but I captured the
>> >> equivalent individual threads:
>> >>
>> >> http://pastebin.com/1hwb1BDp
>> >>
>> >> FWIW we just restarted the spoke and the network connector appears to be
>> >> stuck once again. Not sure what we can actually do now...
>> >>
>> >> Can anyone offer advice, perhaps a JIRA issue? This is with 5.7.0.
>> >>
>> >> James
>> >>
>> >>
>>
>>
>>
>> --
>> http://redhat.com
>> http://blog.garytully.com
>>



-- 
http://redhat.com
http://blog.garytully.com

Mime
View raw message