activemq-issues mailing list archives

From "ASF subversion and git services (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARTEMIS-1506) Synchronization issue during failover in ClientSessionImpl
Date Thu, 09 Nov 2017 16:58:00 GMT

    [ https://issues.apache.org/jira/browse/ARTEMIS-1506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16246019#comment-16246019 ]

ASF subversion and git services commented on ARTEMIS-1506:
----------------------------------------------------------

Commit 5cc8faedd8fd2e80975c2aa12f8f30c1724b9626 in activemq-artemis's branch refs/heads/master
from [~dudae]
[ https://git-wip-us.apache.org/repos/asf?p=activemq-artemis.git;h=5cc8fae ]

ARTEMIS-1506 Synchronization issue during failover in ClientSessionImpl

The temporary deadlock is avoided by removing 'synchronized' from the
ClientSessionImpl::getCredits method. Because the method uses only
the producerCreditManager, only that object needs to be guarded
against parallel access.
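
To illustrate the change described in the commit message, here is a minimal sketch. It is not the actual Artemis code: SessionSketch and CreditManagerSketch are hypothetical stand-ins for ClientSessionImpl and the producer credit manager, and the String return value replaces ClientProducerCredits. It assumes, as the stack trace in the issue suggests, that the manager synchronizes on itself. The field sessionMonitorHeldDuringLookup exists only to make the locking visible.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the producer credit manager; not Artemis code.
class CreditManagerSketch {
    private final Map<String, String> creditsByAddress = new HashMap<>();

    // Records whether the caller still held the session monitor (for illustration only).
    boolean sessionMonitorHeldDuringLookup;

    // The manager guards its own state with its own monitor.
    synchronized String getCredits(String address, Object session) {
        sessionMonitorHeldDuringLookup = Thread.holdsLock(session);
        return creditsByAddress.computeIfAbsent(address, a -> "credits:" + a);
    }
}

// Hypothetical stand-in for the session; not Artemis code.
class SessionSketch {
    final CreditManagerSketch producerCreditManager = new CreditManagerSketch();

    // No 'synchronized' here: the session monitor stays free while the
    // manager does its (possibly blocking) work.
    String getCredits(String address) {
        return producerCreditManager.getCredits(address, this);
    }
}
```

With this shape, a thread inside getCredits holds only the manager's monitor, so a failover handler that synchronizes on the session is not blocked by it.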


> Synchronization issue during failover in ClientSessionImpl
> ----------------------------------------------------------
>
>                 Key: ARTEMIS-1506
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1506
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 2.3.0
>            Reporter: Erich Duda
>
> This issue was hit in the test {{MultiThreadRandomReattachTest}}. Several client threads do some work while a connection failure is simulated. The test expects all threads to finish without exceptions.
> Because of this issue, some client threads occasionally fail with the exception {{AMQ119014: Timed out after waiting 30,000 ms for response when sending packet XXX}}.
> I found that this exception is caused by a temporary deadlock during failover on the client side. The following two threads block each other.
> {code}
> "Thread-7" Id=29 TIMED_WAITING on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1d03220
> 	at sun.misc.Unsafe.park(Native Method)
> 	-  waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1d03220
> 	at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
> 	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2163)
> 	at org.apache.activemq.artemis.utils.ConcurrentUtil.await(ConcurrentUtil.java:37)
> 	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.waitForFailOver(ChannelImpl.java:256)
> 	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.send(ChannelImpl.java:283)
> 	-  locked java.lang.Object@938196
> 	at org.apache.activemq.artemis.core.protocol.core.impl.ChannelImpl.send(ChannelImpl.java:229)
> 	at org.apache.activemq.artemis.core.protocol.core.impl.ActiveMQSessionContext.sendProducerCreditsMessage(ActiveMQSessionContext.java:421)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.sendProducerCreditsMessage(ClientSessionImpl.java:1342)
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.requestCredits(ClientProducerCreditsImpl.java:209)
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.checkCredits(ClientProducerCreditsImpl.java:204)
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditsImpl.init(ClientProducerCreditsImpl.java:71)
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerCreditManagerImpl.getCredits(ClientProducerCreditManagerImpl.java:79)
> 	-  locked org.apache.activemq.artemis.core.client.impl.ClientProducerCreditManagerImpl@f7a5dc
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.getCredits(ClientSessionImpl.java:1347)
> 	-  locked org.apache.activemq.artemis.core.client.impl.ClientSessionImpl@10867c8
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.<init>(ClientProducerImpl.java:102)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.internalCreateProducer(ClientSessionImpl.java:1817)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.createProducer(ClientSessionImpl.java:740)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.createProducer(ClientSessionImpl.java:730)
> 	at org.apache.activemq.artemis.tests.integration.cluster.reattach.MultiThreadRandomReattachTestBase.doTestB(MultiThreadRandomReattachTestBase.java:398)
> 	at org.apache.activemq.artemis.tests.integration.cluster.reattach.MultiThreadRandomReattachTestBase$2.run(MultiThreadRandomReattachTestBase.java:84)
> 	at org.apache.activemq.artemis.tests.integration.cluster.reattach.MultiThreadReattachSupportTestBase$1Runner.run(MultiThreadReattachSupportTestBase.java:104)
> {code}
> {code}
> "Timer-0" Id=9 BLOCKED on org.apache.activemq.artemis.core.client.impl.ClientSessionImpl@10867c8 owned by "Thread-7" Id=29
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionImpl.handleFailover(ClientSessionImpl.java:1206)
> 	-  blocked on org.apache.activemq.artemis.core.client.impl.ClientSessionImpl@10867c8
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.reconnectSessions(ClientSessionFactoryImpl.java:771)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.failoverOrReconnect(ClientSessionFactoryImpl.java:614)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.handleConnectionFailure(ClientSessionFactoryImpl.java:504)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl.access$600(ClientSessionFactoryImpl.java:72)
> 	at org.apache.activemq.artemis.core.client.impl.ClientSessionFactoryImpl$DelegatingFailureListener.connectionFailed(ClientSessionFactoryImpl.java:1175)
> 	at org.apache.activemq.artemis.spi.core.protocol.AbstractRemotingConnection.callFailureListeners(AbstractRemotingConnection.java:70)
> 	at org.apache.activemq.artemis.core.protocol.core.impl.RemotingConnectionImpl.fail(RemotingConnectionImpl.java:209)
> 	at org.apache.activemq.artemis.spi.core.protocol.AbstractRemotingConnection.fail(AbstractRemotingConnection.java:213)
> 	at org.apache.activemq.artemis.tests.integration.cluster.reattach.MultiThreadReattachSupportTestBase$Failer.run(MultiThreadReattachSupportTestBase.java:220)
> 	-  locked org.apache.activemq.artemis.tests.integration.cluster.reattach.MultiThreadReattachSupportTestBase$Failer@16d859
> 	at java.util.TimerThread.mainLoop(Timer.java:555)
> 	at java.util.TimerThread.run(Timer.java:505)
> 	Number of locked synchronizers = 1
> 	- java.util.concurrent.locks.ReentrantLock$NonfairSync@1ac8dee
> {code}
> The first thread holds the {{ClientSessionImpl}} lock in the method {{getCredits}}, which tries to send a packet and therefore waits until the connection is re-established or failover to the backup completes.
> {code:java}
> public final class ClientSessionImpl implements ClientSessionInternal, FailureListener {
>    @Override
>    public synchronized ClientProducerCredits getCredits(final SimpleString address, final boolean anon) {
>       ClientProducerCredits credits = producerCreditManager.getCredits(address, anon, sessionContext);
>       return credits;
>    }
> }
> {code}
> The second thread is responsible for handling the connection failure and performing the reconnection or failover. However, it is blocked by the first thread because it requires the {{ClientSessionImpl}} lock.
> {code:java}
> public final class ClientSessionImpl implements ClientSessionInternal, FailureListener {
>    @Override
>    public void handleFailover(final RemotingConnection backupConnection, ActiveMQException cause) {
>       synchronized (this) {
>          if (closed) {
>             return;
>          }
>          ...
>    }
> }
> {code}
> This situation lasts until some other thread throws an exception because it does not receive a response to its blocking packet, since the connection was never re-established.
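
The interaction described above can be reproduced in isolation. The following is a simplified sketch under stated assumptions, not Artemis code: TemporaryDeadlockSketch is a hypothetical class whose monitor plays the role of the {{ClientSessionImpl}} lock, and a ReentrantLock with a Condition stands in for the wait inside {{ChannelImpl.waitForFailOver}}. The sender waits (with a timeout) for failover to finish, while the failover thread needs the monitor the sender may still be holding.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical reproduction of the lock interaction; not Artemis code.
class TemporaryDeadlockSketch {
    private final ReentrantLock channelLock = new ReentrantLock();
    private final Condition failoverDone = channelLock.newCondition();
    private final CountDownLatch senderStarted = new CountDownLatch(1);
    private boolean failingOver = true;

    // Same shape as the synchronized getCredits(): waits while holding the monitor.
    synchronized boolean sendHoldingMonitor(long timeoutMillis) throws InterruptedException {
        senderStarted.countDown();
        return awaitFailover(timeoutMillis);
    }

    // Same wait, but the monitor is not held.
    boolean sendWithoutMonitor(long timeoutMillis) throws InterruptedException {
        senderStarted.countDown();
        return awaitFailover(timeoutMillis);
    }

    // Returns true if failover completed in time, false on timeout.
    private boolean awaitFailover(long timeoutMillis) throws InterruptedException {
        long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        channelLock.lock();
        try {
            while (failingOver) {
                if (nanos <= 0) {
                    return false;
                }
                nanos = failoverDone.awaitNanos(nanos);
            }
            return true;
        } finally {
            channelLock.unlock();
        }
    }

    // Same shape as handleFailover(): needs the monitor before it can finish.
    void handleFailover() {
        synchronized (this) {
            channelLock.lock();
            try {
                failingOver = false;
                failoverDone.signalAll();
            } finally {
                channelLock.unlock();
            }
        }
    }

    // Runs one sender and one failover thread; returns the sender's result.
    static boolean run(boolean holdMonitor, long timeoutMillis) {
        TemporaryDeadlockSketch session = new TemporaryDeadlockSketch();
        Thread failover = new Thread(() -> {
            try {
                session.senderStarted.await();
                session.handleFailover();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "failover-sketch");
        failover.start();
        try {
            boolean sent = holdMonitor
                ? session.sendHoldingMonitor(timeoutMillis)
                : session.sendWithoutMonitor(timeoutMillis);
            failover.join();
            return sent;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

In this sketch, run(true, 200) returns false: the sender times out because the failover thread cannot enter the monitor until the sender gives up. run(false, 5000) returns true, because failover can complete while the sender waits. The deadlock is temporary in both the sketch and the report because the wait is bounded by a timeout.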



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
