From: "Martin Serrano (JIRA)"
To: dev@activemq.apache.org
Reply-To: dev@activemq.apache.org
Date: Fri, 2 Nov 2012 19:17:13 +0000 (UTC)
Message-ID: <1591291192.62423.1351883833268.JavaMail.jiratomcat@arcas>
In-Reply-To: <1458641652.62376.1351883235390.JavaMail.jiratomcat@arcas>
Subject: [jira] [Updated] (AMQ-4157) KahaDBTransactionStore.removeAyncMessage may cancel addMessage when in transaction leading to unpersisted messages

     [ https://issues.apache.org/jira/browse/AMQ-4157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Martin Serrano updated AMQ-4157:
--------------------------------

    Description:
This was very difficult to track down. It rarely occurs because a certain set of events must occur to trigger the bug.
I have marked it a Blocker because when it does occur, it is silent and leads to a message not being persisted in the MessageStore.

*Description*
The crux of the bug is that when a rollback on a session occurs, the resulting MessageAck can overlap with the async store of the message in KahaDB. When this occurs, the message is never persisted. Additionally, the resultant {{CancellationException}} is ignored in o.a.a.broker.region.Queue:796. The steps:
# A StoreQueueTask is created to add a message X. This is put on the async task queue.
# Meanwhile, this message is dispatched via a prefetch subscription to a transacted consumer.
# The transacted consumer calls session.rollback.
# This leads to acknowledgement of the dispatched messages.
# As a result, destination.removeAsyncMessage is called.
# If the original add has not yet executed, it will be cancelled, leading to the message never being persisted! (occurs at KahaDBStore:401)
# The Queue.send method uses the result future to make sure the persist happens in the store, but it ignores cancellation, so control can return to the sender without an error even though nothing has been persisted.

I have not been able to reproduce this in a small activemq-only test, but I can reproduce it in my environment.

*Proposed Solutions*
I'm really unsure of the solution here. Should {{KahaDBStore.removeAsyncMessage}} (line 393) check the context and only cancel tasks if it is not in a transaction context? But what would that mean in the log? Would there be a removeMessage prior to the addMessage?

*Workaround*
* Turn off caching for the destination (see [dest policies|http://activemq.apache.org/per-destination-policies.html]). This will cause messages to be added to the store synchronously, so they will not be subject to the async cancellation.

  was:
This was very difficult to track down. It rarely occurs because a certain set of events must occur to trigger the bug.
I have marked it a Blocker because when it does occur, it is silent and leads to a message not being persisted in the MessageStore.

*Description*
The crux of the bug is that when a rollback on a session occurs, the resulting MessageAck can overlap with the async store of the message in KahaDB. When this occurs, the message is never persisted. Additionally, the resultant {{CancellationException}} is ignored in o.a.a.broker.region.Queue:796. The steps:
# A StoreQueueTask is created to add a message X. This is put on the async task queue.
# Meanwhile, this message is dispatched via a prefetch subscription to a transacted consumer.
# The transacted consumer calls session.rollback.
# This leads to acknowledgement of the dispatched messages.
# As a result, destination.removeAsyncMessage is called.
# If the original add has not yet executed, it will be cancelled, leading to the message never being persisted! (occurs at KahaDBStore:401)

I have not been able to reproduce this in a small activemq-only test, but I can reproduce it in my environment.

*Proposed Solutions*
I think the issue lies either with:
* the check at KahaDBTransactionStore:477, should it be calling {{theStore.isConcurrentStoreAndDispatchQueues()}} as

*Workaround*
* Turn off caching for the destination (see [dest policies|http://activemq.apache.org/per-destination-policies.html]).
  This will cause messages to be added to the store synchronously, so they will not be subject to the async cancellation.

> KahaDBTransactionStore.removeAyncMessage may cancel addMessage when in transaction leading to unpersisted messages
> ------------------------------------------------------------------------------------------------------------------
>
>                 Key: AMQ-4157
>                 URL: https://issues.apache.org/jira/browse/AMQ-4157
>             Project: ActiveMQ
>          Issue Type: Bug
>          Components: Message Store
>    Affects Versions: 5.7.0
>         Environment: linux 64-bit, kahadb, persisted messages, cached dest, transacted
>            Reporter: Martin Serrano
>            Priority: Blocker
>
> This was very difficult to track down. It rarely occurs because a certain set of events must occur to trigger the bug. I have marked it a Blocker because when it does occur, it is silent and leads to a message not being persisted in the MessageStore.
> *Description*
> The crux of the bug is that when a rollback on a session occurs, the resulting MessageAck can overlap with the async store of the message in KahaDB. When this occurs, the message is never persisted. Additionally, the resultant {{CancellationException}} is ignored in o.a.a.broker.region.Queue:796. The steps:
> # A StoreQueueTask is created to add a message X. This is put on the async task queue.
> # Meanwhile, this message is dispatched via a prefetch subscription to a transacted consumer.
> # The transacted consumer calls session.rollback.
> # This leads to acknowledgement of the dispatched messages.
> # As a result, destination.removeAsyncMessage is called.
> # If the original add has not yet executed, it will be cancelled, leading to the message never being persisted! (occurs at KahaDBStore:401)
> # The Queue.send method uses the result future to make sure the persist happens in the store, but it ignores cancellation, so control can return to the sender without an error even though nothing has been persisted.
> I have not been able to reproduce this in a small activemq-only test, but I can reproduce it in my environment.
> *Proposed Solutions*
> I'm really unsure of the solution here. Should {{KahaDBStore.removeAsyncMessage}} (line 393) check the context and only cancel tasks if it is not in a transaction context? But what would that mean in the log? Would there be a removeMessage prior to the addMessage?
> *Workaround*
> * Turn off caching for the destination (see [dest policies|http://activemq.apache.org/per-destination-policies.html]). This will cause messages to be added to the store synchronously, so they will not be subject to the async cancellation.
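
The failure mode in the reported steps (an async add is cancelled before it runs, and the waiter swallows the cancellation) can be imitated with plain java.util.concurrent types. What follows is a minimal sketch under stated assumptions, not ActiveMQ code: FakeStore, newAddTask, waitIgnoringCancellation and waitCheckingCancellation are hypothetical names invented for illustration, and the interleaving is forced on a single thread instead of depending on real timing. It only shows why ignoring CancellationException lets the caller believe the store happened.

```java
import java.util.concurrent.CancellationException;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.FutureTask;
import java.util.concurrent.atomic.AtomicBoolean;

class CancelledStoreSketch {

    /** Result of waiting on the pending add. */
    enum Outcome { PERSISTED, CANCELLED }

    /** Hypothetical stand-in for the message store: remembers whether message X was written. */
    static final class FakeStore {
        final AtomicBoolean persisted = new AtomicBoolean(false);
    }

    /** Hypothetical stand-in for the queued async add of message X. */
    static FutureTask<Void> newAddTask(FakeStore store) {
        return new FutureTask<Void>(() -> {
            store.persisted.set(true);
            return null;
        });
    }

    /** Lenient wait: swallows cancellation, so the caller cannot tell that nothing was stored. */
    static void waitIgnoringCancellation(FutureTask<Void> add)
            throws InterruptedException, ExecutionException {
        try {
            add.get();
        } catch (CancellationException ignored) {
            // Silently dropped: this is the behaviour the report warns about.
        }
    }

    /** Strict wait: reports cancellation so the caller can fail or retry the send. */
    static Outcome waitCheckingCancellation(FutureTask<Void> add)
            throws InterruptedException, ExecutionException {
        try {
            add.get();
            return Outcome.PERSISTED;
        } catch (CancellationException e) {
            return Outcome.CANCELLED;
        }
    }

    public static void main(String[] args) throws Exception {
        FakeStore store = new FakeStore();
        FutureTask<Void> add = newAddTask(store);

        // The remove triggered by the rollback arrives before the async executor runs the add.
        add.cancel(false);
        // The executor later picks the task up; running a cancelled FutureTask does nothing.
        add.run();

        waitIgnoringCancellation(add); // returns normally, as if the send had succeeded
        System.out.println("persisted after lenient wait: " + store.persisted.get());
        System.out.println("strict wait outcome: " + waitCheckingCancellation(add));
    }
}
```

In this sketch the lenient wait returns normally while the store flag stays false, whereas the strict wait surfaces the cancellation. Whether the real fix belongs in the waiting code or in the cancellation decision itself is exactly the open question raised under *Proposed Solutions*.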