Date: Mon, 23 Aug 2010 22:43:37 +0000 (GMT)
From: MHT
Subject: Re: Clustering not working
To: users@qpid.apache.org

Hi,

I followed the "starting/running a cluster" instructions when building this
new cluster, as I had done for the others.

Some output from a working pair of clustered servers:

cat /proc/net/igmp

Idx  Device    : Count Querier  Group     Users  Timer       Reporter
1    lo        :     0      V3
                                010000E0      1  0:00000000  0
2    eth0      :     2      V2
                                0A016FEF      1  0:00000000  1
                                010000E0      1  0:00000000  0

netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      ALL-SYSTEMS.MCAST.NET
eth0            1      239.111.1.10
eth0            1      ALL-SYSTEMS.MCAST.NET

tcpdump -i eth0 dst port 5405
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
23:25:37.952006 IP [server1].5149 > [server2].netsupport: UDP, length 70
23:25:37.952481 IP [server2].5149 > [server1].netsupport: UDP, length 70
23:25:38.141940 IP [server1].5149 > [server2].netsupport: UDP, length 70
23:25:38.142372 IP [server2].5149 > [server1].netsupport: UDP, length 70
23:25:38.161920 IP [server1].5149 > 239.111.1.10.netsupport: UDP, length 82
23:25:38.331888 IP [server1].5149 > [server2].netsupport: UDP, length 70
23:25:38.332256 IP [server2].5149 > [server1].netsupport: UDP, length 70
23:25:38.521795 IP [server1].5149 > [server2].netsupport: UDP, length 70
23:25:38.522190 IP [server2].5149 > [server1].netsupport: UDP, length 70

For the pair that isn't working I get:

cat /proc/net/igmp

Idx  Device    : Count Querier  Group     Users  Timer       Reporter
1    lo        :     0      V3
                                010000E0      1  0:00000000  0
2    eth0      :     2      V3
                                04016FEF      1  0:00000000  0
                                010000E0      1  0:00000000  0

netstat -g
IPv6/IPv4 Group Memberships
Interface       RefCnt Group
--------------- ------ ---------------------
lo              1      ALL-SYSTEMS.MCAST.NET
eth0            1      239.111.1.4
eth0            1      ALL-SYSTEMS.MCAST.NET

tcpdump -i eth0 dst port 5405
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 96 bytes
23:31:13.663055 IP [server1].5149 > 239.111.1.4.netsupport: UDP, length 82
23:31:14.042033 IP [server1].5149 > 239.111.1.4.netsupport: UDP, length 82
23:31:14.421028 IP [server1].5149 > 239.111.1.4.netsupport: UDP, length 82
23:31:14.799024 IP [server1].5149 > 239.111.1.4.netsupport: UDP, length 82

So I presume the tcpdump shows that the packets get as far as the multicast
address but nothing is receiving them? Sorry, I'm a numpty when it comes to
networks. Also, for this pair the Reporter column in /proc/net/igmp shows all
zeros. The servers that host the nodes where there's an issue are on the same
VLAN with no firewall between them, and SELinux is disabled.

----- Original Message ----
From: Alan Conway
To: users@qpid.apache.org
Cc: MHT
Sent: Mon, 23 August, 2010 14:50:29
Subject: Re: Clustering not working

On 08/20/2010 11:56 AM, MHT wrote:
> Hi,
>
> On a working cluster I see the expected node joins in logs for both boxes:
> [TOTEM] entering OPERATIONAL state.
> [CLM ] got nodejoin message
> [CLM ] got nodejoin message
>
> But on this problem one I only see the local instance on both boxes:
> [TOTEM] entering OPERATIONAL state.
> [CLM ] got nodejoin message
>
> I've got logging on the brokers set to trace, but so far I'm still not seeing
> any obvious errors in the mass (other than the missing node join). A diff on
> the config file on each box shows only cluster-url is different, as expected
> because it starts with the local broker address:port.
>

There are some troubleshooting tips for configuring openais and qpidd at
https://cwiki.apache.org/qpid/starting-a-cluster.html.
If you're not seeing all the expected nodejoin messages then it sounds like a
problem with the openais configuration.

---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:users-subscribe@qpid.apache.org
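
A note for anyone comparing the /proc/net/igmp listings quoted in this thread with the netstat -g output: the Group column is the IPv4 multicast address printed as a hexadecimal integer, which on a little-endian host (x86, for example) reads byte-reversed. The Python sketch below decodes it. The names igmp_group_to_ip, parse_igmp and SAMPLE are made up for illustration, and little-endian byte order is an assumption about the machines involved, not something stated in the thread.

```python
import socket
import struct

# Sample copied from the non-working pair quoted in this thread.
SAMPLE = """\
Idx Device : Count Querier Group Users Timer Reporter
1 lo : 0 V3
 010000E0 1 0:00000000 0
2 eth0 : 2 V3
 04016FEF 1 0:00000000 0
 010000E0 1 0:00000000 0
"""


def igmp_group_to_ip(hex_group):
    """Convert a Group value from /proc/net/igmp to dotted-quad form.

    Assumption: the host is little-endian, so the hex digits show the
    address bytes in reverse order.
    """
    return socket.inet_ntoa(struct.pack("<I", int(hex_group, 16)))


def parse_igmp(text):
    """Return {device: [(group_ip, users, reporter), ...]} for an igmp listing."""
    memberships = {}
    device = None
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "Idx":
            continue  # skip blank lines and the header row
        if len(fields[0]) == 8 and len(fields) >= 4:
            # Group row: Group, Users, Timer, Reporter
            group, users, _timer, reporter = fields[:4]
            memberships[device].append(
                (igmp_group_to_ip(group), int(users), int(reporter))
            )
        else:
            # Device row: Idx, Device, ':', Count, Querier
            device = fields[1].rstrip(":")
            memberships[device] = []
    return memberships


for dev, groups in parse_igmp(SAMPLE).items():
    for ip, users, reporter in groups:
        print(dev, ip, "users=%d" % users, "reporter=%d" % reporter)
```

Decoded this way, 04016FEF is 239.111.1.4 and 0A016FEF is 239.111.1.10, which matches the groups netstat -g reports for the two pairs, and 010000E0 is 224.0.0.1 (ALL-SYSTEMS.MCAST.NET). On a live Linux box the same function could be fed the contents of /proc/net/igmp instead of SAMPLE.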