tomcat-users mailing list archives

From Filip Hanik - Dev Lists <devli...@hanik.com>
Subject Re: Basic Tribes Questions
Date Tue, 07 Oct 2008 21:13:03 GMT
Hi Mike, that's great. Yes, the TCP failure detector can produce multiple
"DISAPPEARED" messages; that is something I'm about to fix.

Filip
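
Until a fix is available, one possible workaround is to make the application's own membership handling idempotent, so that a second DISAPPEARED for the same member is ignored. The sketch below is only an illustration and is not part of Tribes: the class name MembershipDeduper and the string member key are hypothetical. In a real listener, the calls would presumably be made from the memberAdded/memberDisappeared callbacks of a Tribes MembershipListener, using some stable key derived from the Member (an assumption).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/**
 * Hypothetical helper (not a Tribes class): tracks which members are
 * currently known and reports each ADDED/DISAPPEARED transition only once.
 */
final class MembershipDeduper {
    // Keys of the members currently considered present.
    private final Set<String> present = new HashSet<>();
    // Transitions that were actually reported, in order.
    private final List<String> events = new ArrayList<>();

    /** Returns true only if the member was not already known. */
    synchronized boolean memberAdded(String memberKey) {
        if (!present.add(memberKey)) {
            return false; // duplicate ADDED, ignore
        }
        events.add("ADDED " + memberKey);
        return true;
    }

    /** Returns true only for the first DISAPPEARED of a known member. */
    synchronized boolean memberDisappeared(String memberKey) {
        if (!present.remove(memberKey)) {
            return false; // already gone (or never seen), ignore
        }
        events.add("DISAPPEARED " + memberKey);
        return true;
    }

    /** Snapshot of the transitions reported so far. */
    synchronized List<String> events() {
        return new ArrayList<>(events);
    }
}
```

With this in place, application logic would react only when memberDisappeared returns true, so two interceptors reporting the same failure result in a single reaction. One consequence of this design is that a later notification for a member that was already dropped is also ignored.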

Mike Wannamaker wrote:
> Hi Filip,
>
> I think I am seeing the message, it was just hidden amongst other log messages I guess I missed it.
>
> However I do see something else when I added the TcpFailureDetector to the interceptor list, I see two DISAPPEARED messages?
>
> Without TcpFailureDetector:
>
> 	1) Start Server #1, then #2
> 	2) Unplug #2 network
> 	3) On #1 - #2 DISAPPEARED, on #2 - #1 DISAPPEARED
> 	4) Reconnect #2 to network, on #1 - #2 SHUTDOWN;#2 ADDED, on #2 - #1 ADDED
>
> Add TcpFailureDetector
>
> 	1) Start Server #1, #2
> 	2) Unplug #2 network
> 	3) On #1 - #2 DISAPPEARED;#2 DISAPPEARED, on #2 - #1 DISAPPEARED;#1 DISAPPEARED
> 	4) Reconnect #2 to network, on #1 - #2 SHUTDOWN;#2 ADDED, on #2 - #1 ADDED
>
> I take it I get the 2 DISAPPEARED messages because I have another interceptor, but is this the correct behaviour?
>
> TIA
> Mike
>
>
>
> -----Original Message-----
> From: Filip Hanik - Dev Lists [mailto:devlists@hanik.com] 
> Sent: October 6, 2008 11:28 AM
> To: Tomcat Users List
> Subject: Re: Basic Tribes Questions
>
> there are getters and setters for everything
> and they are all documented here
> http://tomcat.apache.org/tomcat-6.0-doc/config/cluster-channel.html
>
> each component has getters/setters, for example, the multicast address
>
> setAddress
> getAddress
>
> breakpoints might not work very well, since you are stopping one thread, 
> and not really emulating a real scenario.
>
> again, sounds like you have a simple test case, if you can share that, I 
> can get more understanding, and help you further.
>
> Filip
>
> Mike Wannamaker wrote:
>   
>> Hi Filip
>>
>> Thanks for the info.  However, I don't see the documentation for the setters/getters you mention below?
>> Also I'm having issues while debugging.  When I hit a breakpoint in my code and while stepping thru code, I get DISAPPEARED/ADDED messages over and over on the other server?  I would think the heartbeat is running in a separate thread for both send/receive?  How to solve this, bump the heartbeat timeout?
>>
>> TIA
>> Mike
>>
>> -----Original Message-----
>> From: Filip Hanik - Dev Lists [mailto:devlists@hanik.com] 
>> Sent: October 3, 2008 2:51 PM
>> To: Tomcat Users List
>> Subject: Re: Basic Tribes Questions
>>
>> answers inline
>>
>> Mike Wannamaker wrote:
>>   
>>     
>>> Hi, I am currently trying to use Tribes as the clustering layer on our server.
>>>
>>> My startup code looks like this.
>>>
>>>         if(_tribesChannel == null)
>>>         { // nothing to do if already running
>>>             try
>>>             {
>>>                 _tribesChannel = new GroupChannel();
>>>                 // must be done before start:
>>>   
>>>     
>>>       
>> no need to use any properties, there are getters and setters for everything
>> and they are all documented here
>> http://tomcat.apache.org/tomcat-6.0-doc/config/cluster-channel.html
>>   
>>     
>>>                 _tribesChannel.getMembershipService().getProperties().put("mcastPort", String.valueOf(_mainPort));
>>>                 _tribesChannel.getMembershipService().getProperties().put("mcastAddress", _multicastIPAddr);
>>>   
>>>     
>>>       
>> not sure what you are trying to do in the code below. if you wanna set 
>> the port, then simply do it.
>> the membership will pick it up automatically
>>   
>>     
>>>                 if(_ancillaryPort > 0)
>>>                 {
>>>                     _tribesChannel.getMembershipService().getProperties().put("tcpListenPort", String.valueOf(_ancillaryPort));
>>>                     // hack alert: Default Tribes instantiation (Tomcat 6.0.16) does not read value for "tcpListenPort" from properties.
>>>                     // Therefore, set it directly
>>>                     ChannelReceiver receiver = _tribesChannel.getChannelReceiver();
>>>                     if(receiver.getPort() != _ancillaryPort)
>>>                     {
>>>                         if(receiver instanceof ReceiverBase)
>>>                         {
>>>                             ((ReceiverBase)receiver).setPort(_ancillaryPort);
>>>                         }
>>>                     }
>>>                 }
>>>
>>>                 _tribesChannel.addMembershipListener(_tribesMembershipListener);
>>>                 _tribesChannel.addChannelListener(_tribesChannelListener);
>>>                 _tribesChannel.start(CHANNEL_COMPONENTS);
>>>             }
>>>             catch(ChannelException ex)
>>>             {
>>>                 try { _tribesChannel.stop(CHANNEL_COMPONENTS); } catch(Throwable t) { /*gulp*/}
>>>                 _tribesChannel = null;
>>>                 throw new RuntimeException(ex); // todo, exception handling?
>>>             }
>>>         }
>>>
>>> My Question is that when I start Server #1, then Server #2, then unplug Server #2 network cable, Server #1 gets the DISAPPEARED message but Server #2 just keeps logging the message below.  As I write this it's at attempt #120.  How do I get this to notify on Server #2 that Server #1 has DISAPPEARED or can I set the number of attempts to a maximum number before notifying?
>>>   
>>>     
>>>       
>> the message you are getting is because the membership service keeps trying to recover.
>> you can limit this by calling
>> setRecoveryEnabled(true|false);
>> setRecoveryCounter(nr-of-times-to-try-recover)
>>
>> also, you haven't added in the TCP failure detector, which adds one more 
>> layer of protection
>> see it in this default configuration
>> http://tomcat.apache.org/tomcat-6.0-doc/cluster-howto.html
>>
>> you should still get the member disappeared error (eventually after the 
>> timeout), if you can supply a simple test case, I can try it out over here.
>>
>> best
>> Filip
>>
>>   
>>     
>>> Also is there any other documentation for tribes, other than the limited docs on the apache site?
>>>
>>> 	INFO: Done sleeping, membership established, start level:8
>>> 	Oct 3, 2008 1:12:44 PM org.apache.catalina.tribes.transport.nio.NioReplicationTask run
>>> 	WARNING: IOException in replication worker, unable to drain channel. Probable cause: Keep alive socket closed[An existing connection was forcibly closed by the remote host].
>>> 	Oct 3, 2008 1:12:44 PM org.apache.catalina.tribes.membership.McastServiceImpl$SenderThread run
>>> 	WARNING: Unable to send mcast message.
>>> 	java.net.NoRouteToHostException: No route to host: Datagram send failed
>>> 		at java.net.PlainDatagramSocketImpl.send(Native Method)
>>> 		at java.net.DatagramSocket.send(DatagramSocket.java:612)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl.send(McastServiceImpl.java:385)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl$SenderThread.run(McastServiceImpl.java:445)
>>> 	Oct 3, 2008 1:12:49 PM org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread run
>>> 	INFO: Tribes membership, running recovery thread, multicasting is not functional.
>>> 	Oct 3, 2008 1:12:49 PM org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread stopService
>>> 	WARNING: Recovery thread failed to stop membership service.
>>> 	java.net.NoRouteToHostException: No route to host: Datagram send failed
>>> 		at java.net.PlainDatagramSocketImpl.send(Native Method)
>>> 		at java.net.DatagramSocket.send(DatagramSocket.java:612)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl.send(McastServiceImpl.java:385)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl.stop(McastServiceImpl.java:299)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread.stopService(McastServiceImpl.java:480)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread.run(McastServiceImpl.java:504)
>>> 	Oct 3, 2008 1:12:49 PM org.apache.catalina.tribes.membership.McastServiceImpl setupSocket
>>> 	INFO: Setting cluster mcast soTimeout to 500
>>> 	Oct 3, 2008 1:12:49 PM org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread startService
>>> 	WARNING: Recovery thread failed to start membership service.
>>> 	java.net.SocketException: error setting options
>>> 		at java.net.PlainDatagramSocketImpl.join(Native Method)
>>> 		at java.net.PlainDatagramSocketImpl.join(PlainDatagramSocketImpl.java:172)
>>> 		at java.net.MulticastSocket.joinGroup(MulticastSocket.java:276)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl.start(McastServiceImpl.java:233)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread.startService(McastServiceImpl.java:490)
>>> 		at org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread.run(McastServiceImpl.java:504)
>>> 	Oct 3, 2008 1:12:49 PM org.apache.catalina.tribes.membership.McastServiceImpl$RecoveryThread run
>>> 	INFO: Recovery attempt 1 failed, trying again in 5000 seconds
>>> 	
>>>   
>>>     
>>>       
>> ---------------------------------------------------------------------
>> To start a new topic, e-mail: users@tomcat.apache.org
>> To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>> For additional commands, e-mail: users-help@tomcat.apache.org
>>
>>   
>>     
>
>



