activemq-issues mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From "ASF GitHub Bot (JIRA)" <j...@apache.org>
Subject [jira] [Commented] (ARTEMIS-1368) Artemis gets to state when it doesn't respond to producer
Date Wed, 23 Aug 2017 12:34:01 GMT

    [ https://issues.apache.org/jira/browse/ARTEMIS-1368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16138287#comment-16138287
] 

ASF GitHub Bot commented on ARTEMIS-1368:
-----------------------------------------

Github user mtaylor commented on the issue:

    https://github.com/apache/activemq-artemis/pull/1486
  
    @michaelandrepearce I have noticed those errors before.  I suspected environment issues
in the Apache CI infra. But we should try and get them resolved.  Having intermittent failures
makes it harder to spot new ones.  Thanks for the nudge.


> Artemis gets to state when it doesn't respond to producer
> ---------------------------------------------------------
>
>                 Key: ARTEMIS-1368
>                 URL: https://issues.apache.org/jira/browse/ARTEMIS-1368
>             Project: ActiveMQ Artemis
>          Issue Type: Bug
>          Components: Broker
>    Affects Versions: 1.5.5, 2.2.0
>            Reporter: Erich Duda
>
> *Scenario*
> * There are two servers configured in colocated replicated HA
> * There are two producers. Each one sends messages on different server to InQueue.
> * There is MDB on server 2 which resends messages from InQueue to OutQueue
> * During the resending of messages, server 2 is restarted.
> * After all messages are resent, receiver is connected to server 1 and it receives all
messages.
> *Expectation:* All messages sent by producers to InQueue are received by receiver from
OutQueue.
> *Reality:* After the restart of server 2, the server 1 gets into the state when it stops
to respond to the producer. Producer sends a bulk of messages which are marked as duplicates
by server, but the exception packet is not sent to producer. See below for more detailed description
what is happening on the server.
> *Customer impact:* Artemis may get into the state when it is not able to work properly.
This can lead to unavailability of service.
> This is *regression* against *7.0.z*.
> The issue wasn't reported earlier, because the test failed due to JBEAP-7968 before 7.1.0.ER3.
> *Detail description of what happened on server*
> I looked at the server traces to see what happened when server received session commit
packet from producer. Based on the traces, the server behaved correctly and it even scheduled
to send response packet with the duplication exception.
> {code}
> 06:17:28,206 TRACE [org.apache.activemq.artemis.core.protocol.core.ServerSessionPacketHandler]
(Thread-6 (ActiveMQ-server-org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl$3@37386953))
ServerSessio
> nPacketHandler::scheduling response::PACKET(ActiveMQExceptionMessage)[type=20, channelID=0,
packetObject=ActiveMQExceptionMessage, exception= ActiveMQDuplicateIdException[errorType=DUPLICATE_ID_REJECTED
message=
> Duplicate message detected - message will not be routed. Message information:LargeServerMessage[messageID=7034,durable=true,userID=f3b4e359-7834-11e7-bd40-001b217d6dc3,priority=4,
timestamp=Thu Aug 03 06:17:28 E
> DT 2017,expiration=0, durable=true, address=jms.queue.InQueue,properties=TypedProperties[__AMQ_CID=829bf1a6-7834-11e7-bd40-001b217d6dc3,count=1699,counter=1700,_AMQ_DUPL_ID=1bfb61bf-2eca-4f33-9723-5998dcd84ed515
> 01755380929,_AMQ_LARGE_SIZE=409615,color=RED]]@971671687]]
> {code}
> The problem is that after this event, I cannot find message which would say that the
packet was sent. As you can see in code snippet below, the "scheduling response" says that
it was registered IOCallback, which will send the packet once it is triggered.
> {code:java}
> private void sendResponse(final Packet confirmPacket,
>                              final Packet response,
>                              final boolean flush,
>                              final boolean closeChannel) {
>       if (logger.isTraceEnabled()) {
>          logger.trace("ServerSessionPacketHandler::scheduling response::" + response);
>       }
>       storageManager.afterCompleteOperations(new IOCallback() {
>          @Override
>          public void onError(final int errorCode, final String errorMessage) {
>             ActiveMQServerLogger.LOGGER.errorProcessingIOCallback(errorCode, errorMessage);
>             ActiveMQExceptionMessage exceptionMessage = new ActiveMQExceptionMessage(ActiveMQExceptionType.createException(errorCode,
errorMessage));
>             doConfirmAndResponse(confirmPacket, exceptionMessage, flush, closeChannel);
>             if (logger.isTraceEnabled()) {
>                logger.trace("ServerSessionPacketHandler::exception response sent::" +
exceptionMessage);
>             }
>          }
>          @Override
>          public void done() {
>             if (logger.isTraceEnabled()) {
>                logger.trace("ServerSessionPacketHandler::regular response sent::" + response);
>             }
>             doConfirmAndResponse(confirmPacket, response, flush, closeChannel);
>          }
>       });
>    }
> {code}
> It is odd that the callback wasn't never triggered. This assumption confirms the warning
printed at stopping of the server. In the OperationContext, there were still some callbacks
which weren't triggered.
> {code}
> 06:22:41,377 WARN  [org.apache.activemq.artemis.core.server] (ServerService Thread Pool
-- 124) AMQ222105: Could not finish context execution in 10 seconds: java.lang.Exception:
warning
>         at org.apache.activemq.artemis.core.server.impl.ServerSessionImpl.waitContextCompletion(ServerSessionImpl.java:1141)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.closeAllServerSessions(ActiveMQServerImpl.java:1103)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:888)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:793)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.internalStop(ActiveMQServerImpl.java:688)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.core.server.impl.ActiveMQServerImpl.stop(ActiveMQServerImpl.java:681)
[artemis-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.apache.activemq.artemis.jms.server.impl.JMSServerManagerImpl.stop(JMSServerManagerImpl.java:433)
[artemis-jms-server-1.5.5.006-redhat-1.jar:1.5.5.006-redhat-1]
>         at org.wildfly.extension.messaging.activemq.jms.JMSService.doStop(JMSService.java:217)
[wildfly-messaging-activemq-7.1.0.GA-redhat-4.jar:7.1.0.GA-redhat-4]
>         at org.wildfly.extension.messaging.activemq.jms.JMSService.access$100(JMSService.java:64)
[wildfly-messaging-activemq-7.1.0.GA-redhat-4.jar:7.1.0.GA-redhat-4]
>         at org.wildfly.extension.messaging.activemq.jms.JMSService$2.run(JMSService.java:121)
[wildfly-messaging-activemq-7.1.0.GA-redhat-4.jar:7.1.0.GA-redhat-4]
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [rt.jar:1.8.0_131]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266) [rt.jar:1.8.0_131]
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
[rt.jar:1.8.0_131]
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
[rt.jar:1.8.0_131]
>         at java.lang.Thread.run(Thread.java:748) [rt.jar:1.8.0_131]
>         at org.jboss.threads.JBossThread.run(JBossThread.java:320) [jboss-threads-2.2.1.Final-redhat-1.jar:2.2.1.Final-redhat-1]
> {code}
> In the attachment you can find also thread dumps from the server. I didn't find there
any deadlocks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

Mime
View raw message