qpid-users mailing list archives

From Ted Ross <tr...@redhat.com>
Subject Re: Testing failover on dispatcher/java-broker cluster
Date Thu, 29 Sep 2016 14:38:26 GMT
Sorry, those Jira numbers and descriptions are mismatched.  Here's the 
correct list:

    - DISPATCH-496 - Activation of an autolink does not result in issuing
                     credit to a blocked sender
    - DISPATCH-505 - Eventual loss of credit on inter-router control
                     links when the topology changes
    - DISPATCH-523 - Topology changes can cause in-flight deliveries to
                     be stuck in the ingress router


On 09/29/2016 10:35 AM, Ted Ross wrote:
>
> On 09/24/2016 05:32 AM, Adel Boutros wrote:
>> We are indeed in favor of a minor release as long as the latest
>> version is still 0.6.x and we are willing to re-launch our tests and
>> give feedback on the release candidate once provided (It shouldn't
>> take us more than a day to compile and test).
>> Do you have a list of fixes in mind?
>
> I've identified three fixes that look like good candidates for 0.6.2:
>
>   - DISPATCH-496 - Topology changes can cause in-flight deliveries to
>                    be stuck in the ingress router
>   - DISPATCH-505 - Eventual loss of credit on inter-router control
>                    links when the topology changes
>   - DISPATCH-523 - Activation of an autolink does not result in issuing
>                    credit to a blocked sender
>
> These are all stability-related issues.
>
> Thoughts?
>
> -Ted
>
>> Regards, Adel
>>
>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>> To: users@qpid.apache.org
>>> From: tross@redhat.com
>>> Date: Fri, 23 Sep 2016 17:23:57 -0400
>>>
>>> Hi Adel,
>>>
>>> A minor release is always possible.  It's up to us, the community, to
>>> decide whether and when to produce one.  I'm in favor of releasing an
>>> 0.6.2 with some small backports to fix bugs for users that want to stay
>>> on Proton 0.12.
>>>
>>> -Ted
>>>
>>> On 09/23/2016 09:44 AM, Adel Boutros wrote:
>>>> Hello Ted,
>>>> Did you happen to have the time to check if a minor release is
>>>> possible?
>>>> Regards, Adel
>>>>
>>>>> From: adelboutros@live.com
>>>>> To: users@qpid.apache.org
>>>>> Subject: RE: Testing failover on dispatcher/java-broker cluster
>>>>> Date: Tue, 20 Sep 2016 15:13:03 +0200
>>>>>
>>>>> Hello Ted,
>>>>>
>>>>> I confirm the fix solved the issue.
>>>>>
>>>>> Would it be possible to do a 0.6.2 release? We cannot compile newer
>>>>> versions of Proton (We currently use 0.12.2) due to lack of
>>>>> resources from our side and we really need this fix for our tests.
>>>>>
>>>>> Regards,
>>>>> Adel
>>>>>
>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>> To: users@qpid.apache.org
>>>>>> From: tross@redhat.com
>>>>>> Date: Mon, 19 Sep 2016 12:18:23 -0400
>>>>>>
>>>>>> Hi Adel,
>>>>>>
>>>>>> It's a one-liner and it applies cleanly to the 0.6.x branch.
>>>>>>
>>>>>> https://git-wip-us.apache.org/repos/asf?p=qpid-dispatch.git;h=41b7407
>>>>>>
>>>>>> -Ted
>>>>>>
>>>>>>
>>>>>> On 09/19/2016 11:41 AM, Adel Boutros wrote:
>>>>>>> Hello Ted,
>>>>>>>
>>>>>>> Antoine is on vacation so I will be taking over this task.
>>>>>>>
>>>>>>> Does this fix have any dependencies? We would like to apply it on
>>>>>>> 0.6.1 without other fixes because it seems the master branch
>>>>>>> requires proton 0.13.0 minimum whereas we have currently 0.12.2
>>>>>>> and we cannot upgrade at the time being.
>>>>>>>
>>>>>>> Regards,
>>>>>>> Adel
>>>>>>>
>>>>>>>> Subject: Re: Testing failover on dispatcher/java-broker cluster
>>>>>>>> To: users@qpid.apache.org
>>>>>>>> From: tross@redhat.com
>>>>>>>> Date: Fri, 16 Sep 2016 16:53:05 -0400
>>>>>>>>
>>>>>>>> Antoine,
>>>>>>>>
>>>>>>>> I think I know what that problem is.  I believe you've stumbled
>>>>>>>> upon this issue:
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/DISPATCH-496
>>>>>>>>
>>>>>>>> Your second delivery, the one resulting in a timeout, is causing
>>>>>>>> the inbound link to be blocked (i.e. it has undelivered messages).
>>>>>>>> When the broker reattaches, the blocked links are supposed to
>>>>>>>> become unblocked but they don't in the case of auto-links.
>>>>>>>>
>>>>>>>> This has been fixed on the master branch if you'd like to try
>>>>>>>> applying the patch.
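
The behaviour described above can be sketched with a small toy model. This is only an illustration under stated assumptions, not Dispatch Router code: the class `ToyRouterLink`, the flag `unblock_on_reattach`, and the helper `run_scenario` are hypothetical names, and "blocked" is modelled simply as "the link holds undelivered messages". The scenario follows the numbered steps reported later in the thread (send, stop broker, send, restart broker, send).

```python
class ToyRouterLink:
    """Toy model of one inbound link from a client into the router.

    Hypothetical illustration only; it does not reflect the router's
    actual data structures or the actual patch.
    """

    def __init__(self, unblock_on_reattach):
        # Assumption: True stands in for the fixed behaviour,
        # False for the behaviour reported in the thread.
        self.unblock_on_reattach = unblock_on_reattach
        self.broker_attached = True
        self.undelivered = []

    @property
    def blocked(self):
        # A link is treated as blocked while it holds undelivered messages.
        return bool(self.undelivered)

    def send(self, message):
        """Return 'disposition' if the message is settled, else 'timeout'."""
        if self.blocked or not self.broker_attached:
            self.undelivered.append(message)
            return "timeout"
        return "disposition"

    def detach_broker(self):
        self.broker_attached = False

    def reattach_broker(self):
        self.broker_attached = True
        if self.unblock_on_reattach:
            # Assumed fixed behaviour: the backlog is forwarded on reattach,
            # so the link is no longer blocked.
            self.undelivered.clear()


def run_scenario(unblock_on_reattach, send_while_down):
    """Replay the reported steps and collect what the client observes."""
    link = ToyRouterLink(unblock_on_reattach)
    results = [link.send("m1")]           # step 2: broker is up
    link.detach_broker()                  # step 3: stop the broker
    if send_while_down:
        results.append(link.send("m2"))   # step 4: send while broker is down
    link.reattach_broker()                # step 5: restart the broker
    results.append(link.send("m3"))       # step 6: send after restart
    return results


if __name__ == "__main__":
    print(run_scenario(unblock_on_reattach=False, send_while_down=True))
    print(run_scenario(unblock_on_reattach=False, send_while_down=False))
    print(run_scenario(unblock_on_reattach=True, send_while_down=True))
```

In this model, the send after the restart only times out when a message was sent while the broker was down and the link is not unblocked on reattach, which matches the observation that skipping step (4) makes the problem disappear.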
>>>>>>>>
>>>>>>>> -Ted
>>>>>>>>
>>>>>>>> On 09/15/2016 04:56 AM, Antoine Chevin wrote:
>>>>>>>>> Hi Ted,
>>>>>>>>>
>>>>>>>>> You’re right, the connection close looked strange before
>>>>>>>>> stopping of the broker. I manually added the annotation
>>>>>>>>> (# stopping the broker) and was wrong about the position of
>>>>>>>>> this one. I replayed the test and the connection close happens
>>>>>>>>> *after* the broker stop. I assume it is the broker that
>>>>>>>>> initiates it.
>>>>>>>>>
>>>>>>>>> I found something interesting. In my test, I always sent a
>>>>>>>>> message when the broker is down, expecting to get a
>>>>>>>>> JmsSendTimedOutException (waiting for the disposition frame).
>>>>>>>>> I assumed this was harmless. But it turns out this is not.
>>>>>>>>> When I don’t do that, I can send a message after the broker
>>>>>>>>> restart. So to sum up the experiment I did:
>>>>>>>>>
>>>>>>>>> *I use Wireshark between the JMS client and the dispatcher.*
>>>>>>>>>
>>>>>>>>> 1)      Using JMS I establish a connection to the dispatcher
>>>>>>>>>         and create a message producer (Wireshark: connection
>>>>>>>>>         open -> attach)
>>>>>>>>> 2)      I’m able to send a message to the broker through the
>>>>>>>>>         dispatcher (Wireshark: transfer -> disposition)
>>>>>>>>> 3)      I stop the broker
>>>>>>>>> 4)      With the same link, I send a message and I get a
>>>>>>>>>         JmsSendTimedOutException (waiting for the disposition
>>>>>>>>>         frame) (Wireshark: transfer)
>>>>>>>>> 5)      I restart the broker
>>>>>>>>> 6)      With the same link, I try to send a message and I get a
>>>>>>>>>         JmsSendTimedOutException for the same reason (waiting
>>>>>>>>>         for the disposition frame) (Wireshark: transfer)
>>>>>>>>>
>>>>>>>>> If I skip step (4), I cannot reproduce step (6) and my messages
>>>>>>>>> arrive (Wireshark: transfer -> disposition) to the restarted
>>>>>>>>> broker.
>>>>>>>>>
>>>>>>>>> I hope it makes it clearer for you. Sorry for my rookie
>>>>>>>>> mistakes :-).
>>>>>>>>>
>>>>>>>>> Note: My colleague and I ran a small experiment to identify if
>>>>>>>>> the problem comes from JMS or the AMQP protocol. He changed the
>>>>>>>>> code of the java broker to not send the disposition frame one
>>>>>>>>> time out of two.
>>>>>>>>>
>>>>>>>>> We got these results:
>>>>>>>>>
>>>>>>>>> *I use Wireshark between the JMS client and the patched broker.*
>>>>>>>>>
>>>>>>>>> 1) Using JMS I establish a connection to the patched broker and
>>>>>>>>>    create a message producer (Wireshark: connection open ->
>>>>>>>>>    attach)
>>>>>>>>> 2) I send a message to the broker and it replies with the
>>>>>>>>>    disposition frame (Wireshark: transfer -> disposition)
>>>>>>>>> 3) I send a message to the broker which drops the disposition
>>>>>>>>>    frame. I get a send timeout in JMS (Wireshark: transfer)
>>>>>>>>> 4) I send a message to the broker and it replies with the
>>>>>>>>>    disposition frame (Wireshark: transfer -> disposition). It
>>>>>>>>>    works fine.
>>>>>>>>>
>>>>>>>>> We assume that there is something going on in the dispatcher.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>> Antoine
>>>>>>>>>
>>>>>>>>
>>>>>>>> ---------------------------------------------------------------------
>>>>>>>>
>>>>>>>> To unsubscribe, e-mail: users-unsubscribe@qpid.apache.org
>>>>>>>> For additional commands, e-mail: users-help@qpid.apache.org

