Date: Wed, 13 Apr 2016 11:44:25 +0000 (UTC)
From: "ASF GitHub Bot (JIRA)"
To: issues@activemq.apache.org
Reply-To: dev@activemq.apache.org
Subject: [jira] [Commented] (ARTEMIS-480) [Artemis Testsuite] BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to racing condition

    [ https://issues.apache.org/jira/browse/ARTEMIS-480?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239102#comment-15239102 ]

ASF GitHub Bot commented on ARTEMIS-480:
----------------------------------------

GitHub user iweiss opened a pull request:

    https://github.com/apache/activemq-artemis/pull/459

    ARTEMIS-480 BridgeReconnectTest.testDeliveringCountOnBridgeConnection…

    …Failure
    fails due to racing condition

    Signed-off-by: Ingo Weiss

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/iweiss/activemq-artemis master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/activemq-artemis/pull/459.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #459

----
commit 231ef0b8c17a345301ae0a3112e1e50d035eac5e
Author: Ingo Weiss
Date:   2016-04-13T11:42:38Z

    ARTEMIS-480 BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to racing condition

    Signed-off-by: Ingo Weiss

----

> [Artemis Testsuite] BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure fails due to racing condition
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: ARTEMIS-480
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-480
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 1.1.0, 1.2.0, 1.3.0
>            Reporter: Ingo Weiss
>
> {code}
> java.lang.AssertionError: Delivering count of a source queue should be zero on connection failure expected:<0> but was:<1>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.activemq.artemis.tests.integration.cluster.bridge.BridgeReconnectTest.testDeliveringCountOnBridgeConnectionFailure(BridgeReconnectTest.java:688)
> {code}
> {code}
> 18:25:43,722 WARN [org.apache.activemq.artemis.core.server] AMQ222094: Bridge unable to send message Reference[22]:NON-RELIABLE:ServerMessage[messageID=22,durable=false,userID=null,priority=4, bodySize=79, timestamp=Tue Feb 02 18:25:43 EST 2016,expiration=0, durable=false, address=testAddress,properties=TypedProperties[propkey=18,_AMQ_BRIDGE_DUP=[47A6 779A CA04 11E5 9C91 A169 FCCB 5522 0000 0000 0000 0016)]]@1861263739, will try again once bridge reconnects: ActiveMQObjectClosedException[errorType=OBJECT_CLOSED message=AMQ119018: Producer is closed]
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.checkClosed(ClientProducerImpl.java:298) [artemis-core-client-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.client.impl.ClientProducerImpl.send(ClientProducerImpl.java:122) [artemis-core-client-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.deliverStandardMessage(BridgeImpl.java:698) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.cluster.impl.BridgeImpl.handle(BridgeImpl.java:574) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.impl.QueueImpl.handle(QueueImpl.java:2410) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.impl.QueueImpl.deliver(QueueImpl.java:1813) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.impl.QueueImpl.access$1400(QueueImpl.java:97) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.core.server.impl.QueueImpl$DeliverRunner.run(QueueImpl.java:2581) [artemis-server-1.1.0.wildfly-012.jar:]
> 	at org.apache.activemq.artemis.utils.OrderedExecutorFactory$OrderedExecutor$ExecutorTask.run(OrderedExecutorFactory.java:100) [artemis-core-client-1.1.0.wildfly-012.jar:]
> 	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1153) [rt.jar:1.8.0-internal]
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [rt.jar:1.8.0-internal]
> 	at java.lang.Thread.run(Thread.java:785) [vm.jar:1.8.0-internal]
> {code}
> I investigated this issue and found the race condition that causes the failure above. The problem lies in \[1\]. When the bridge detects a problem, it calls the {{connectionFailed}} method, which calls {{Queue.cancel(ref, timeBase)}} for every message in {{refs}}. {{Queue.cancel}} decrements the delivering count for each canceled message. Before this step, however, the {{refs.remove(message.getMessageID())}} call in the catch block removes the reference to the current message from {{refs}}, so the delivering count is not decremented for that message. Removing the reference is correct, because we return {{HandleStatus.BUSY}} for this message. I think the problem is in the {{QueueImpl#deliver}} method: if the bridge returns {{HandleStatus.BUSY}}, we should decrement the delivering count. So I think that instead of \[2\], there should be \[3\].
> \[1\]
> {code:language=java|linenumbers=true}
> // BridgeImpl
> private HandleStatus deliverStandardMessage(SimpleString dest, final MessageReference ref, ServerMessage message) {
>    // if we failover during send then there is a chance that
>    // this will throw a disconnect, we need to remove the message
>    // from the acks so it will get resent, duplicate detection will cope
>    // with any messages resent
>    if (ActiveMQServerLogger.LOGGER.isTraceEnabled()) {
>       ActiveMQServerLogger.LOGGER.trace("going to send message: " + message + " from " + this.getQueue());
>    }
>    try {
>       producer.send(dest, message);
>    }
>    catch (final ActiveMQException e) {
>       ActiveMQServerLogger.LOGGER.bridgeUnableToSendMessage(e, ref);
>       synchronized (refs) {
>          // We remove this reference as we are returning busy which means the reference will never leave the Queue.
>          // because of this we have to remove the reference here
>          refs.remove(message.getMessageID());
>       }
>       connectionFailed(e, false);
>       return HandleStatus.BUSY;
>    }
>    return HandleStatus.HANDLED;
> }
> {code}
> \[2\]
> {code}
> else if (status == HandleStatus.BUSY) {
>    holder.iter.repeat();
>    noDelivery++;
> }
> {code}
> \[3\]
> {code}
> else if (status == HandleStatus.BUSY) {
>    decDelivering();
>    holder.iter.repeat();
>    noDelivery++;
> }
> {code}
> Steps to reproduce:
> 1. {{cd tests}}
> 2. {{while true; do mvn -Dtest=BridgeReconnectTest#testDeliveringCountOnBridgeConnectionFailure -Ptests -DfailIfNoTests=false test; done}}
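
The counter imbalance described in the quoted issue can be replayed with a small toy model. This is only a sketch under stated assumptions, not the Artemis implementation: `ToyQueue`, `ToyBridge`, and `deliveringAfterFailure` are hypothetical names, the queue is assumed to count a message as "delivering" before it hands the message to the bridge, and the interleaving is replayed deterministically on a single thread rather than reproduced as a real race.

```java
import java.util.HashSet;
import java.util.Set;

// Toy model only: every class here is hypothetical and merely mimics the
// bookkeeping described in the issue. It is not Artemis code.
class DeliveringCountSketch {

    enum HandleStatus { HANDLED, BUSY }

    /** Stand-in for the source queue; it only tracks the delivering count. */
    static final class ToyQueue {
        int delivering;
        final boolean decrementOnBusy;

        ToyQueue(boolean decrementOnBusy) {
            this.decrementOnBusy = decrementOnBusy;
        }

        void deliver(long messageId, ToyBridge bridge) {
            delivering++; // assumption: counted before the consumer is called
            HandleStatus status = bridge.handle(messageId, this);
            if (status == HandleStatus.BUSY && decrementOnBusy) {
                delivering--; // models the decDelivering() call proposed in [3]
            }
        }

        void cancel(long messageId) {
            delivering--; // a canceled reference is no longer in delivery
        }
    }

    /** Stand-in for the bridge; refs holds messages that were sent but not acked. */
    static final class ToyBridge {
        final Set<Long> refs = new HashSet<>();
        boolean producerClosed;

        HandleStatus handle(long messageId, ToyQueue queue) {
            refs.add(messageId);
            if (producerClosed) {
                refs.remove(messageId);  // the failed message leaves refs first ...
                connectionFailed(queue); // ... so this cancels only the other refs
                return HandleStatus.BUSY;
            }
            return HandleStatus.HANDLED;
        }

        void connectionFailed(ToyQueue queue) {
            for (long id : refs) {
                queue.cancel(id);
            }
            refs.clear();
        }
    }

    /** Replays the scenario and returns the delivering count after the failure. */
    static int deliveringAfterFailure(boolean decrementOnBusy) {
        ToyQueue queue = new ToyQueue(decrementOnBusy);
        ToyBridge bridge = new ToyBridge();
        queue.deliver(1L, bridge);    // sent successfully, still waiting for an ack
        bridge.producerClosed = true; // the connection drops
        queue.deliver(2L, bridge);    // send fails, bridge answers BUSY
        return queue.delivering;
    }

    public static void main(String[] args) {
        System.out.println("without decrement on BUSY: " + deliveringAfterFailure(false)); // prints 1
        System.out.println("with decrement on BUSY: " + deliveringAfterFailure(true));     // prints 0
    }
}
```

In this model the run without the decrement ends with a delivering count of 1, which mirrors the `expected:<0> but was:<1>` assertion failure in the quoted stack trace, while the run with the decrement ends at 0.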