qpid-users mailing list archives

From Mark Moseley <moseleym...@gmail.com>
Subject Re: qpid-tool knocks out cluster partner
Date Wed, 19 Jan 2011 19:05:52 GMT
On Wed, Jan 19, 2011 at 6:17 AM, Alan Conway <aconway@redhat.com> wrote:
> On 01/18/2011 08:04 PM, Mark Moseley wrote:
>>
>> On Tue, Jan 18, 2011 at 12:53 PM, Alan Conway<aconway@redhat.com>  wrote:
>>>
>>> On 01/10/2011 09:12 AM, Alan Conway wrote:
>>>>
>>>> On 01/07/2011 07:55 PM, Mark Moseley wrote:
>>>>>
>>>>> On Thu, Jan 6, 2011 at 12:47 PM, Alan Conway<aconway@redhat.com>
>>>>>  wrote:
>>>>>>
>>>>>> On 12/29/2010 02:11 PM, Mark Moseley wrote:
>>>>>>>
>>>>>>> This might be the same as
>>>>>>> https://issues.apache.org/jira/browse/QPID-2982 but in case it's not,
>>>>>>> I'm dropping this email. If I connect to qpid-tool on member A of a
>>>>>>> cluster and do just about anything, e.g. list binding, list exchange,
>>>>>>> etc, the other node, B, blows up. In the logs below, exp01==A and
>>>>>>> exp02==B.
>>>>>>> [snip]
>>>>>
>>>>> I've commented on that JIRA. I hope my info is useful. It's getting
>>>>> kind of convoluted :)
>>>>
>>>> Thanks, I'll try it out and see if I can reproduce it. It will be very
>>>> helpful if I can.
>>>>
>>>
>>> I believe I've fixed https://issues.apache.org/jira/browse/QPID-2982 on
>>> trunk r1060568. Can you give it a spin and let me know how it goes?
>>
>> Just started testing a little while ago but so far I haven't seen a
>> single crash yet using the same steps I posted in the JIRA, so it
>> looks pretty good so far. I'll post again if I see any crashes.
>>
>
> That's good. Can you also re-test 2992 and 2993? I think they may also be
> fixed by this patch.

No dice on 2992 and 2993. Both still show the same issue, and 2993 can
still kill off a cluster node. In the 2993 case, if I restart B1/B2 and
the federated route is gone when they come back up, adding it back on
B1 fairly regularly kills B1 with this:


2011-01-19 13:58:36 debug cluster(201.0.0.0:7701 READY) replicated connection HOSTA1:5672(202.0.0.0:18335-1 shadow)
2011-01-19 13:58:38 debug Exception constructed: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 error Channel exception: not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) channel error 710 on HOSTA1:5672(202.0.0.0:18335-1 shadow) must be resolved with: 201.0.0.0:7701 202.0.0.0:18335 : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) error 710 resolved with 201.0.0.0:7701
2011-01-19 13:58:38 debug cluster(201.0.0.0:7701 READY/error) error 710 must be resolved with 202.0.0.0:18335
2011-01-19 13:58:38 critical cluster(201.0.0.0:7701 READY/error) local error 710 did not occur on member 202.0.0.0:18335: not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 debug Exception constructed: local error did not occur on all cluster members : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
2011-01-19 13:58:38 critical Error delivering frames: local error did not occur on all cluster members : not-attached: Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39) (qpid/cluster/ErrorCheck.cpp:89)
2011-01-19 13:58:38 notice cluster(201.0.0.0:7701 LEFT/error) leaving cluster bosclust
2011-01-19 13:58:38 debug SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
2011-01-19 13:58:38 debug DISCONNECTED [10.1.58.3:41680]
2011-01-19 13:58:38 debug SEND raiseEvent (v1) class=org.apache.qpid.broker.clientDisconnect
2011-01-19 13:58:38 debug Shutting down CPG
2011-01-19 13:58:38 notice Shut down
2011-01-19 13:58:38 debug Journal "bosmyq1": Destroyed
2011-01-19 13:58:38 debug Journal "TplStore": Destroyed


For 2992, the route doesn't reappear, but I haven't seen it kill a
cluster node yet; that has only happened in the 2993 case.
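
Broker logs pasted into mail tend to get hard-wrapped by the mail
client, which makes it awkward to grep for the lines that matter. The
sketch below is one way to rejoin wrapped entries and pull out the
error/critical ones. It is not part of Qpid; the helper names
(join_wrapped, by_level) are made up for illustration, and it assumes
every entry starts with a "YYYY-MM-DD HH:MM:SS level" prefix as in the
log in this message.

```python
import re

# Assumption: each log entry begins with "YYYY-MM-DD HH:MM:SS <level> ".
# Any line without that prefix is treated as a wrapped continuation.
ENTRY_START = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$")


def join_wrapped(lines):
    """Return (timestamp, level, message) tuples with wrapped lines rejoined."""
    entries = []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_START.match(line)
        if match:
            entries.append([match.group(1), match.group(2), match.group(3)])
        elif entries:
            # Continuation of the previous entry: append with a single space.
            entries[-1][2] += " " + line
    return [tuple(entry) for entry in entries]


def by_level(entries, levels=("error", "critical")):
    """Keep only the entries whose level is in `levels`."""
    return [entry for entry in entries if entry[1] in levels]


# Two entries from the log in this message, hard-wrapped as a mail client
# might deliver them.
SAMPLE = """\
2011-01-19 13:58:38 critical cluster(201.0.0.0:7701 READY/error) local
error 710 did not occur on member 202.0.0.0:18335: not-attached:
Channel 1 is not attached (qpid/amqp_0_10/SessionHandler.cpp:39)
2011-01-19 13:58:38 notice cluster(201.0.0.0:7701 LEFT/error) leaving
cluster bosclust
"""

if __name__ == "__main__":
    for timestamp, level, message in by_level(join_wrapped(SAMPLE.splitlines())):
        print(timestamp, level, message)
```

A continuation line that itself happens to start with a timestamp and a
word would be misread as a new entry, so treat the output as a reading
aid rather than a faithful reconstruction.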

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org

