From: Adel Boutros
To: "users@qpid.apache.org"
Subject: Re: Testing failover on dispatcher/java-broker cluster
Date: Thu, 29 Sep 2016 14:47:47 +0000

They seem fair enough and quite related.

As a side note, I have a bug with the dispatch router 0.6.1, but I haven't submitted it yet because I haven't reduced the test case.

In summary: I connect 2 dispatchers (inter-router) and then delete the connector/listener of "inter-router". If I then delete and recreate a mobile address which has received a message on one of the dispatchers, the "in" and "out" stats do not reset to 0 when doing "qdstat -a" but remain at the old values. However, they reset correctly on the other router.

Have you encountered something similar? Once I have a reduced test case, I will post it in a different thread of course.

Regards,
Adel

________________________________
From: Ted Ross
Sent: Thursday, September 29, 2016 4:38:26 PM
To: users@qpid.apache.org
Subject: Re: Testing failover on dispatcher/java-broker cluster

Sorry, those Jira numbers and descriptions are mismatched. Here's the correct list:

- DISPATCH-496 - Activation of an autolink does not result in issuing credit to a blocked sender
- DISPATCH-505 - Eventual loss of credit on inter-router control links when the topology changes
- DISPATCH-523 - Topology changes can cause in-flight deliveries to be stuck in the ingress router

On 09/29/2016 10:35 AM, Ted Ross wrote:
>
> On 09/24/2016 05:32 AM, Adel Boutros wrote:
>> We are indeed in favor of a minor release as long as the latest
>> version is still 0.6.x and we are willing to re-launch our tests and
>> give feedback on the release candidate once provided (It shouldn't
>> take us more than a day to compile and test).
>> Do you have a list of fixes in mind?
>
> I've identified three fixes that look like good candidates for 0.6.2:
>
> - DISPATCH-496 - Topology changes can cause in-flight deliveries to
>   be stuck in the ingress router
> - DISPATCH-505 - Eventual loss of credit on inter-router control
>   links when the topology changes
> - DISPATCH-523 - Activation of an autolink does not result in issuing
>   credit to a blocked sender
>
> These are all stability-related issues.
>
> Thoughts?
>
> -Ted
>
>> Regards, Adel
>>
>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>> To: users@qpid.apache.org
>>> From: tross@redhat.com
>>> Date: Fri, 23 Sep 2016 17:23:57 -0400
>>>
>>> Hi Adel,
>>>
>>> A minor release is always possible. It's up to us, the community, to
>>> decide whether and when to produce one. I'm in favor of releasing an
>>> 0.6.2 with some small backports to fix bugs for users that want to stay
>>> on Proton 0.12.
>>>
>>> -Ted
>>>
>>> On 09/23/2016 09:44 AM, Adel Boutros wrote:
>>>> Hello Ted,
>>>> Did you happen to have the time to check if a minor release is
>>>> possible?
>>>> Regards, Adel
>>>>
>>>>> From: adelboutros@live.com
>>>>> To: users@qpid.apache.org
>>>>> Subject: RE: Testing failover on dispatcher/java-broker cluster
>>>>> Date: Tue, 20 Sep 2016 15:13:03 +0200
>>>>>
>>>>> Hello Ted,
>>>>>
>>>>> I confirm the fix solved the issue.
>>>>>
>>>>> Would it be possible to do a 0.6.2 release? We cannot compile newer
>>>>> versions of Proton (We currently use 0.12.2) due to lack of
>>>>> resources from our side and we really need this fix for our tests.
>>>>>
>>>>> Regards,
>>>>> Adel
>>>>>
>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>> To: users@qpid.apache.org
>>>>>> From: tross@redhat.com
>>>>>> Date: Mon, 19 Sep 2016 12:18:23 -0400
>>>>>>
>>>>>> Hi Adel,
>>>>>>
>>>>>> It's a one-liner and it applies cleanly to the 0.6.x branch.
>>>>>>
>>>>>> https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407
>>>>>>
>>>>>> -Ted
>>>>>>
>>>>>>
>>>>>> On 09/19/2016 11:41 AM, Adel Boutros wrote:
>>>>>>> Hello Ted,
>>>>>>>
>>>>>>> Antoine is on vacation so I will be taking over this task.
>>>>>>>
>>>>>>> Does this fix have any dependencies? We would like to apply it on
>>>>>>> 0.6.1 without other fixes because it seems the master branch
>>>>>>> requires proton 0.13.0 minimum whereas we have currently 0.12.2
>>>>>>> and we cannot upgrade at the time being.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adel
>>>>>>>
>>>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>>>> To: users@qpid.apache.org
>>>>>>>> From: tross@redhat.com
>>>>>>>> Date: Fri, 16 Sep 2016 16:53:05 -0400
>>>>>>>>
>>>>>>>> Antoine,
>>>>>>>>
>>>>>>>> I think I know what that problem is. I believe you've stumbled upon
>>>>>>>> this issue:
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/DISPATCH-496
>>>>>>>>
>>>>>>>> Your second delivery, the one resulting in a timeout, is causing the
>>>>>>>> inbound link to be blocked (i.e. it has undelivered messages). When the
>>>>>>>> broker reattaches, the blocked links are supposed to become unblocked
>>>>>>>> but they don't in the case of auto-links.
>>>>>>>>
>>>>>>>> This has been fixed on the master branch if you'd like to try applying
>>>>>>>> the patch.
>>>>>>>>
>>>>>>>> -Ted
>>>>>>>>
>>>>>>>> On 09/15/2016 04:56 AM, Antoine Chevin wrote:
>>>>>>>>> Hi Ted,
>>>>>>>>>
>>>>>>>>> You're right, the connection close looked strange before stopping of the
>>>>>>>>> broker. I manually added the annotation (# stopping the broker) and was
>>>>>>>>> wrong about the position of this one. I replayed the test and the
>>>>>>>>> connection close happens *after* the broker stop. I assume it is the broker
>>>>>>>>> that initiates it.
>>>>>>>>>
>>>>>>>>> I found something interesting.
>>>>>>>>> In my test, I always sent a message when the
>>>>>>>>> broker is down, expecting to get a JmsSendTimedOutException (waiting for
>>>>>>>>> the disposition frame). I assumed this was harmless, but it turns out it
>>>>>>>>> is not. When I don't do that, I can send a message after the broker
>>>>>>>>> restart. So to sum up the experiment I did:
>>>>>>>>>
>>>>>>>>> * I use Wireshark between the JMS client and the dispatcher. *
>>>>>>>>>
>>>>>>>>> 1) Using JMS I establish a connection to the dispatcher and create a
>>>>>>>>>    message producer (Wireshark: connection open -> attach)
>>>>>>>>> 2) I'm able to send a message to the broker through the dispatcher
>>>>>>>>>    (Wireshark: transfer -> disposition)
>>>>>>>>> 3) I stop the broker
>>>>>>>>> 4) With the same link, I send a message and I get a
>>>>>>>>>    JmsSendTimedOutException (waiting for the disposition frame)
>>>>>>>>>    (Wireshark: transfer)
>>>>>>>>> 5) I restart the broker
>>>>>>>>> 6) With the same link, I try to send a message and I get a
>>>>>>>>>    JmsSendTimedOutException for the same reason (waiting for the
>>>>>>>>>    disposition frame) (Wireshark: transfer)
>>>>>>>>>
>>>>>>>>> If I skip step (4), I cannot reproduce step (6) and my messages arrive
>>>>>>>>> (Wireshark: transfer -> disposition) to the restarted broker.
>>>>>>>>>
>>>>>>>>> I hope it makes it clearer for you. Sorry for my rookie mistakes :-).
>>>>>>>>>
>>>>>>>>> Note: My colleague and I ran a small experiment to identify if the problem
>>>>>>>>> comes from JMS or the AMQP protocol. He changed the code of the java broker
>>>>>>>>> to not send the disposition frame one time out of two.
>>>>>>>>>
>>>>>>>>> We got these results:
>>>>>>>>>
>>>>>>>>> * I use Wireshark between the JMS client and the patched broker. *
>>>>>>>>>
>>>>>>>>> 1) Using JMS I establish a connection to the patched broker and create a
>>>>>>>>>    message producer (Wireshark: connection open -> attach)
>>>>>>>>> 2) I send a message to the broker and it replies with the disposition
>>>>>>>>>    frame (Wireshark: transfer -> disposition)
>>>>>>>>> 3) I send a message to the broker which drops the disposition frame. I get
>>>>>>>>>    a send timeout in JMS (Wireshark: transfer)
>>>>>>>>> 4) I send a message to the broker and it replies with the disposition frame
>>>>>>>>>    (Wireshark: transfer -> disposition). It works fine.
>>>>>>>>>
>>>>>>>>> We assume that there is something going on in the dispatcher.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Antoine

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
For additional commands, e-mail: users-help@qpid.apache.org
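
The six-step sequence Antoine reports, together with Ted's explanation of DISPATCH-496 (a send while the broker is down leaves the inbound link blocked, and the link is not unblocked when the broker reattaches), can be mimicked with a small toy model. This is only a sketch under stated assumptions: the class, its methods, and the `unblock_on_reattach` flag are hypothetical names invented for illustration. It does not use or reproduce any qpid-dispatch, Proton, or JMS code, and it models nothing beyond the behaviour described in the thread.

```python
"""Toy model of the blocked-link behaviour described for DISPATCH-496.

All names here are hypothetical; this is not qpid-dispatch code.
"""


class ToyRouterLink:
    """One sender link through a router to a single broker."""

    def __init__(self, unblock_on_reattach):
        self.broker_up = True
        # True once a delivery has been left undelivered on the link.
        self.blocked = False
        # Assumption: models whether a reattach unblocks a blocked link
        # (False mimics the reported behaviour, True the expected one).
        self.unblock_on_reattach = unblock_on_reattach

    def stop_broker(self):
        self.broker_up = False

    def restart_broker(self):
        self.broker_up = True
        if self.unblock_on_reattach:
            self.blocked = False

    def send(self):
        """Return 'disposition' if the send settles, else 'timeout'."""
        if self.blocked:
            return "timeout"
        if not self.broker_up:
            # The delivery cannot reach the broker, so the link blocks.
            self.blocked = True
            return "timeout"
        return "disposition"


def run_steps(link, skip_step_4=False):
    """Replay steps 2 to 6 from the thread and collect the send outcomes."""
    results = [link.send()]          # step 2: send while the broker is up
    link.stop_broker()               # step 3
    if not skip_step_4:
        results.append(link.send())  # step 4: send while the broker is down
    link.restart_broker()            # step 5
    results.append(link.send())      # step 6: send after the restart
    return results


if __name__ == "__main__":
    print(run_steps(ToyRouterLink(unblock_on_reattach=False)))
    print(run_steps(ToyRouterLink(unblock_on_reattach=False), skip_step_4=True))
    print(run_steps(ToyRouterLink(unblock_on_reattach=True)))
```

With `unblock_on_reattach=False` the model times out at step 6 only when step 4 was performed, which is the pattern reported in the thread. With the flag set to `True` the send after the restart settles, which is how the thread describes the intended behaviour.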