From users-return-50577-archive-asf-public=cust-asf.ponee.io@activemq.apache.org Sun Oct 14 18:41:02 2018 Return-Path: X-Original-To: archive-asf-public@cust-asf.ponee.io Delivered-To: archive-asf-public@cust-asf.ponee.io Received: from mail.apache.org (hermes.apache.org [140.211.11.3]) by mx-eu-01.ponee.io (Postfix) with SMTP id E0E41180670 for ; Sun, 14 Oct 2018 18:41:01 +0200 (CEST) Received: (qmail 97940 invoked by uid 500); 14 Oct 2018 16:41:00 -0000 Mailing-List: contact users-help@activemq.apache.org; run by ezmlm Precedence: bulk List-Help: List-Unsubscribe: List-Post: List-Id: Reply-To: users@activemq.apache.org Delivered-To: mailing list users@activemq.apache.org Received: (qmail 97928 invoked by uid 99); 14 Oct 2018 16:41:00 -0000 Received: from pnap-us-west-generic-nat.apache.org (HELO spamd4-us-west.apache.org) (209.188.14.142) by apache.org (qpsmtpd/0.29) with ESMTP; Sun, 14 Oct 2018 16:41:00 +0000 Received: from localhost (localhost [127.0.0.1]) by spamd4-us-west.apache.org (ASF Mail Server at spamd4-us-west.apache.org) with ESMTP id A5D11C0158 for ; Sun, 14 Oct 2018 16:40:59 +0000 (UTC) X-Virus-Scanned: Debian amavisd-new at spamd4-us-west.apache.org X-Spam-Flag: NO X-Spam-Score: 2.869 X-Spam-Level: ** X-Spam-Status: No, score=2.869 tagged_above=-999 required=6.31 tests=[DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, HTML_MESSAGE=2, KAM_LIVE=1, RCVD_IN_DNSWL_NONE=-0.0001, RCVD_IN_MSPIKE_H3=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_PASS=-0.001, T_DKIMWL_WL_MED=-0.01] autolearn=disabled Authentication-Results: spamd4-us-west.apache.org (amavisd-new); dkim=pass (2048-bit key) header.d=gmail.com Received: from mx1-lw-eu.apache.org ([10.40.0.8]) by localhost (spamd4-us-west.apache.org [10.40.0.11]) (amavisd-new, port 10024) with ESMTP id KHppxOQ-I9_r for ; Sun, 14 Oct 2018 16:40:56 +0000 (UTC) Received: from mail-pf1-f181.google.com (mail-pf1-f181.google.com [209.85.210.181]) by mx1-lw-eu.apache.org (ASF Mail Server at mx1-lw-eu.apache.org) with ESMTPS id 6B1705F1D4 for ; Sun, 14 Oct 2018 16:40:55 +0000 (UTC) Received: by mail-pf1-f181.google.com with SMTP id l81-v6so8478265pfg.3 for ; Sun, 14 Oct 2018 09:40:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to; bh=MgAwOAZ8QU90hFD9dxATpidpP56u7DNo68NWMxhWziE=; b=DPsTQ2LkmkAa+23jflho66RCXUDDjL8bRQCNcfKxW/2+wpkReeuwxQCeP8oP20IA2n 7i55fB6fcYWlt8kuIQvBwIqzBYSIRKNuvim6Izt9nLv6qye7HJsUhbO5iD887ed6uV/9 tWeU4YvceJycOrr8fmnkOZOp7EvDDml30B75XIqnlUx4DGqCuH6MhZ4oN7mKJIMrlk97 IU9CxETjNtVQFHD25unwHtFtfcln13295/CRVHeUcRNukyCzai+tNY+UK5ix4lL10DKx wu52J6IvpVo06osBgPvkl9asYB8cqSEvx6xJ1/Up8XKdvFDs6lE8ewhtgdZgtiiu3tWe LJrg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to; bh=MgAwOAZ8QU90hFD9dxATpidpP56u7DNo68NWMxhWziE=; b=F9bZEgyelp4pE6drcDiklXh1XoTWSqI+Qh1pO/N8uRVCZQhnvuedfKein0DWZUA0LE Y/b5XT5kk9i0Ea3kC87zJTMNsEhOBSnjAEQSpvNmSDNi6jvmw+GpL91KyRf4bB+RSZG5 FH+NfPeyTfVHFfmi+e3SM8AChEKAED/gmHaHM2OzB/8ZpM1GWaWeKWBvYi9tCWtw6bnj rjTHnQc1qsxPYLSxptrHbvufUIqtcMP7t1Kri++mEY++oAXAQAEo9y1dXvKJixGcrzSG t49o/nVKCNmj3uB89F6Gm3UZfjWuGguovgYi1ERN3gEUkCARM5g/VyJkgg+joMg0heaT ih5Q== X-Gm-Message-State: ABuFfogo7wIKo4Yp+mvM2eiGvzvPpbe4RiaR+Fe8IxJXBpyp+L6Y3xUP ww7sIol0H4d21TIZEsDL+5ahjKqXmJtorTyjWR0TPw== X-Google-Smtp-Source: ACcGV60t0SWGhiJJoKxGdm1TUjYX6D2bseJmjPIOjjJXf4mL+7V628EGrTkkNfoHG8M7hYL8L58Pd9MTQCFEwhztkKc= X-Received: by 2002:a63:d52:: with SMTP id 18-v6mr13109056pgn.107.1539535247831; Sun, 14 Oct 2018 09:40:47 -0700 (PDT) MIME-Version: 1.0 References: <60d63486-8bc4-46c6-6119-8b2eec2f5400@gmail.com> <1c5e093d-a030-cfbc-e48f-0ce9d981ec06@gmail.com> In-Reply-To: From: Dan Langford Date: Sun, 14 Oct 2018 10:40:36 -0600 Message-ID: Subject: Re: [Artemis 2.6.2] AMQP messages going to expiry queue, redelivery off that queue is failing To: users@activemq.apache.org Content-Type: multipart/alternative; boundary="00000000000090f832057832fb16" --00000000000090f832057832fb16 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Just for the record: even though we could easily recreate this problem using transactions, and our initial =E2=80=9Cworkaround=E2=80=9D involved n= ot using transactions, we have recently seen this issue when no transactions where involved. Reading through the Jira description it looks like this is not specifically about transactions so I am hopeful that the fix will address even our current issues. Since our users are now focusing on this again I will probably build from master to test the fix On Tue, Sep 11, 2018 at 4:46 PM Dan Langford wrote: > Thank you guys so much! > On Tue, Sep 11, 2018 at 2:50 PM Timothy Bish wrote: > >> On 08/30/2018 07:10 PM, Dan Langford wrote: >> > thanks for looking into this. what is the proper way to force for >> testing a >> > redelivery that goes back to the broker without transactions? its >> probably >> > like killing the session or connection. that would be if we wanted to >> test >> > if non-transacted redeliveries were getting corrupted >> >> The issue has been identified and fixed, should appear in the next >> release. >> https://issues.apache.org/jira/browse/ARTEMIS-2082 >> >> > >> > we have a lot of people using spring-jms and as i look in the spring >> code... >> > >> https://github.com/spring-projects/spring-framework/blob/master/spring-j= ms/src/main/java/org/springframework/jms/listener/AbstractMessageListenerCo= ntainer.java >> > looking at doExecuteListener and rollbackOnExceptionIfNecessary >> > it looks like if its transacted they try to rollback() and if its >> > client_ack they try recover(). they probably dont handle auto_ack >> because >> > that was acked immediately? >> > >> > speaking of this what is an appropriate way to NACK a message when usi= ng >> > Client Acknowledgemnt and a JMS messageHandler? is it simply to reach >> the >> > end of the method execution without having called message.acknowledge(= ) >> or >> > would it be appropriate to throw a RuntimeException (since i cannot >> throw a >> > checked Exception out of an implementation of >> javax.jms.MessageListener) ? >> > >> > On Thu, Aug 30, 2018 at 8:38 AM Robbie Gemmell < >> robbie.gemmell@gmail.com> >> > wrote: >> > >> >> Tim and I had an initial look at this, and can see generally where th= e >> >> broker is internally corrupting things on send, though not yet the >> >> full picture how it gets there or what to do about it. The expiration >> >> is likely to be key, one difference with the non-transacted case is >> >> actually going to be because its using recover() which the client >> >> performs locally. >> >> >> >> Robbie >> >> >> >> On Wed, 29 Aug 2018 at 06:23, Dan Langford >> wrote: >> >>> ok i wrote 3 test files. I don't know the best way to get them to yo= u >> >>> easily. hopefully a Gist is ok. >> >>> https://gist.github.com/danlangford/071e738225ec0c68dd470816b977499b >> >>> >> >>> you can copy those 3 files straight to >> >>> >> >> >> ./tests/integration-tests/src/test/java/org/apache/activemq/artemis/test= s/integration/amqp >> >>> The test JMSTransactedRedeliveryTest::testAMQPProducerAMQPConsumer >> proves >> >>> that a transacted client can .rollback() a handful of times and stil= l >> be >> >>> able to consume the redelivered message later on. >> >>> >> >>> The test JMSTransactedRedeliveryBugTest::testAMQPProducerAMQPConsume= r >> >> shows >> >>> that if a message had been expired and now a transacted client is >> >>> attempting to consume it the client only has 2 chances before the >> broker >> >>> starts sending the message in a way that will not parse correctly >> >>> >> >>> The test >> JMSNonTransactedRedeliveryBugTest::testAMQPProducerAMQPConsumer >> >> shows >> >>> that if a message had been expired a non-transacted client has no >> >> troubles >> >>> reliably accessing the redelivered message from broker >> >>> >> >>> >> >>> as you can tell i am mostly concerned about AMQP->AMQP for my use >> case. >> >>> some of those other combos are failing some of these tests in other >> ways. >> >>> naturally you can address those as you see fit but for my client the >> >>> AMQP->AMQP is a roadblocker. >> >>> >> >>> >> >>> let me know if you can determine why the broker is sending an extra >> null >> >>> character in the payload on the third time the messages attempts >> >> delivery. >> >>> maybe we are doing something incorrectly. >> >>> >> >>> >> >>> This has been more of an issue than i thought due to the fact that >> Spring >> >>> default to enabling transactions. in all of my initial tests i >> couldn't >> >>> reproduce it because i prefer the straight simplified jms api from 2= .0 >> >> and >> >>> that defaults to sessions not being transacted. that being said near= ly >> >> all >> >>> of my clients prefer using Spring Boot autoconfigurer and other spri= ng >> >>> pieces which happen to default to transacted sessions. i can now ha= ve >> >> some >> >>> of them workaround but others of them are requiring the transaction. >> >>> >> >>> >> >>> also as a reminder and for context here is a link to the initial >> >>> conversation i had with the Qpid Jms Client devs who pointed out to = me >> >> the >> >>> erroneous null character in the message transfer from the broker: >> >>> https://lists.apache.org/ >> >>> thread.html/b1fd9c09a1f66f5529601a8651fbb96585c011b22bbd84e07c4f23b1= @ >> % >> >> 3Cusers.qpid.apache.org%3E >> >>> >> >>> thank you so much for your time >> >>> >> >>> On Tue, Aug 14, 2018 at 1:19 PM Timothy Bish >> >> wrote: >> >>>> On 08/13/2018 07:12 PM, Dan Langford wrote: >> >>>>> some of my users are attempting a pattern to deduplicate messages >> >> based >> >>>> on >> >>>>> a time window instead of a fixed amount of space (a duplicate id >> >> cache) >> >>>>> so far the concept has been working very well. So they send their >> >> AMQP >> >>>>> messages (qpid-jms-client) into a Last Value Queue with an >> >> appropriate >> >>>>> identifier in the _AMQ_LVQ_NAME. They also set a TimeToLive on the >> >>>> message >> >>>>> that is essentially the lag they will allow as they want to wait f= or >> >>>>> possible duplicates. If any duplicates come in the Last Value Queu= e >> >>>>> behavior is replacing the older message with the newer message unt= il >> >> the >> >>>>> expiration. expired messages are delivered to the preconfigured >> >> expiry >> >>>>> queue where their application is listening. This is not perfect bu= t >> >> its >> >>>> not >> >>>>> intended to be. Its just intended to reduce additional unnecessary >> >>>>> processing and they understand this is not a guarantee. It really >> >> helps >> >>>>> with a system that produces messages in a way that has flurries of >> >>>>> "notifications" about the same assetID over and over again. >> >>>>> >> >>>>> BUT where we are seeing is a problem is when we are consuming from >> >> the >> >>>>> queue used to hold expired messages and we toss some exception and >> >> the >> >>>>> message needs to be redelivered. the first time or two the message >> is >> >>>>> redelivered it is delivered OK. But when the JMSXDeliveryCount is >> >> about 3 >> >>>>> or 4 (we use redelivery delay and multipliers to spread these out) >> >> our >> >>>>> qpid-jms-client stops being able to read the messages. >> >>>>> >> >>>>> we were only able to reproduce this when an AMQP message expired >> >> onto the >> >>>>> queue. (expired from a LVQ in case that is relevant). if we place >> the >> >>>>> message directly on a queue and test different exception and >> >> redelivery >> >>>>> scenarios we cannot reproduce this behavior. >> >>>>> >> >>>>> i enable the qpid-jms-client frame logging (via env variable >> >>>>> PN_TRACE_FRM=3Dtrue) and i saw that in the situation when the clie= nt >> >> code >> >>>>> cannot access the payload, even though the broker WAS still sendin= g >> >> the >> >>>>> payload. so i thought it was some odd issue with the client. The >> >> Apache >> >>>>> Qpid team responded that the issue seems to be that the broker >> >> starts to >> >>>>> send some ill formed payloads in this scenario. i dont want to >> >> repeat the >> >>>>> stack traces and their response, you can read those here >> >>>>> >> >>>>> >> >> >> https://lists.apache.org/thread.html/b1fd9c09a1f66f5529601a8651fbb96585c= 011b22bbd84e07c4f23b1@%3Cusers.qpid.apache.org%3E >> >>>>> would it be helpful if i tested that this happens if there is not = a >> >> LVQ >> >>>>> involved? i could have a message in a non-LVQ expire to another >> >> queue and >> >>>>> see if redeliveries over their get messed up after a few attempts. >> >> For >> >>>> the >> >>>>> record this is AMQP for producing and consuming. i do notice the >> >> messages >> >>>>> waiting in the expiry queue have much more headers messages sent >> >> directly >> >>>>> to a queue from client code. they seem to be headers full of >> >> information >> >>>>> about the message as it left the previous queue. I tried to send a >> >>>> message >> >>>>> directly to the expiry queue with all these headers to determine i= f >> >> it >> >>>> was >> >>>>> the existence of one of these specifically that trigger the >> malformed >> >>>> frame >> >>>>> but was not able to fully set all those headers. the JMSDeliverCou= nt >> >>>> (type >> >>>>> Long) was the one that the client would not let me set and as a >> >> result i >> >>>>> could not test. for clarity thought i dont know that the issue >> >> exists due >> >>>>> to a header that is just what i saw as a difference between messag= es >> >> be >> >>>>> delivered to the queue by client code versus messages expiring fro= m >> >> one >> >>>>> queue to another. >> >>>>> >> >>>>> please look over the linked thread on the qpid list and let me kno= w >> >> if >> >>>> you >> >>>>> know why a message transfer fram would become malformed after a fe= w >> >>>> failed >> >>>>> deliveries only if the message expired onto the current queue. >> >>>>> >> >>>>> thanks so much >> >>>>> >> >>>> A great place to start is to create a unit test that reproduces the >> >>>> issue. You can look at the Artemis unit tests for AMQP to get some >> >>>> inspiration on how to set one up. Then try and create the smallest >> >>>> possible test that can reproduce the issue to make it easier to >> narrow >> >>>> in on where the issue might be. >> >>>> >> >>>> The AMQP tests in Artemis are located here: >> >>>> >> >>>> >> >>>> >> >> >> https://github.com/apache/activemq-artemis/tree/master/tests/integration= -tests/src/test/java/org/apache/activemq/artemis/tests/integration/amqp >> >>>> >> >>>> >> >>>> -- >> >>>> Tim Bish >> >>>> >> >>>> >> >> -- >> Tim Bish >> twitter: @tabish121 >> blog: http://timbish.blogspot.com/ >> >> --00000000000090f832057832fb16--