geronimo-dev mailing list archives

From <>
Subject RE: Clustering - JGroups issues and others
Date Wed, 19 Oct 2005 09:40:18 GMT
Hello there... 

Answers/comments are inline below...


>-----Original Message-----
>From: ext Jules Gosnell [] 
>Sent: 18 October, 2005 17:41
>Subject: Re: Clustering - JGroups issues and others
> wrote:
>>Here is my 5 cents... I have some comments regarding clustering based
>>on J-Groups. We were trying to use this technology and came to certain
>>points that render it unusable in our case.
>>Many of the cluster caches/replicators assume that all the information
>>is propagated to all the nodes in the cluster. Some of the solutions
>>propagate only keys, however. In any case this solution cannot be used
>>in sufficiently large clusters, as the rate of updates would eat all
>>the node capacity, making it unusable.
>This is the dreaded 1->all replication that is a popular 
>implementation at the moment. See my previous mail about 
>wadi's avoidance of this giving it a significant advantage 
>over such solutions, in terms of scalability.
>>Regarding J-Groups itself. Probably that is specific to the cluster
>>facilities in JBoss, but generally J-Groups organizes a list of nodes,
>>and every node checks the state of the next one in the chain.
>I wasn't sure how it worked... interesting ...
>We should look into how membership is tracked by ActiveCluster.
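
If membership really is tracked as a ring, each node watching its
successor, the monitoring assignment can be sketched like this (my own
illustration of the scheme described above, not actual JGroups code):

```java
import java.util.*;

// Sketch of ring-style failure detection as described above: members
// are ordered, and each node monitors only the next one in the chain.
// Class and method names here are illustrative, not the JGroups API.
class RingMonitor {
    // Return the member that 'self' is responsible for monitoring:
    // the next entry in the sorted membership list, wrapping around.
    static String successor(List<String> members, String self) {
        List<String> sorted = new ArrayList<>(members);
        Collections.sort(sorted);
        int i = sorted.indexOf(self);
        return sorted.get((i + 1) % sorted.size());
    }

    public static void main(String[] args) {
        List<String> view = Arrays.asList("nodeA", "nodeB", "nodeC");
        System.out.println("nodeA monitors " + successor(view, "nodeA")); // nodeB
        System.out.println("nodeC monitors " + successor(view, "nodeC")); // wraps to nodeA
    }
}
```

The point of the scheme is that each node probes exactly one peer, so
monitoring traffic stays constant per node - but, as noted below, chains
of failures are then detected one link at a time.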
>> The
>>problem is that in many cases servers may fail/disconnect in groups,
>>which causes two problems: the segmentation of the cluster 
>cluster segmentation is a really tricky issue :-( - do all the nodes
>then try to arrange themselves into smaller clusters, shifting loads of
>state around, or is jgroups smart enough to put all the pieces back
>together before passing control back to the application ?

It is the problem of the "homogeneous" environment assumption. Blade
servers are naturally organized in chassis called enclosures (HP's term)
or bladecenters (IBM's). All those chassis are interconnected with each
other using one or more external switches. Because larger solutions tend
to use multiple VLANs, each of them can fail independently of the
others. So it can happen that a group of nodes loses connectivity to all
the rest of the cluster for some time (HA implied), but the nodes still
see each other, and also other backend services like the database.

That effectively leads to a situation where, instead of a single
cluster, we get two (three, four) smaller ones. If applications
distribute services automatically, the final result is a mess. Imagine a
service that has to run on a single node, but starts on two or more...
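
A common guard against exactly this is a majority quorum: a partition
starts singleton services only if it can see more than half of the
configured cluster, so at most one partition can ever qualify. A minimal
sketch (my own illustration, not something the stock setup gives you):

```java
// Sketch of a majority-quorum guard against the split-brain problem
// described above: a partition may start singleton services only if it
// sees more than half of the configured cluster size. Since two
// disjoint partitions cannot both hold a majority, the singleton can
// start in at most one of them. Illustrative only.
class QuorumGuard {
    static boolean hasQuorum(int visibleMembers, int configuredClusterSize) {
        return visibleMembers > configuredClusterSize / 2;
    }

    public static void main(String[] args) {
        // A 14-node cluster split into partitions of 10 and 4:
        System.out.println(hasQuorum(10, 14)); // true  -> may run the singleton
        System.out.println(hasQuorum(4, 14));  // false -> must not
    }
}
```

The price is that a minority partition runs no singleton at all, which
is usually preferable to running two copies.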

JGroups merges the groups back together after network recovery (note
that JBoss sometimes doesn't - quite buggy), but by then the harm has
been done.

>>extremely high failure report time, as for architectures based on
>>blade technology servers shut down in large packs
>do these 'packs' correspond to racks ? I have plans (NYI) for 
>algorithms that will allow WADI to choose e.g. nodes in other 
>racks, on 
>other power sources, in other buildings etc as replication partners, 
>otherwise you will lose state in a situation like this, if you 
>happen to 
>have yours backed up on to the node next to you in the same rack...

Blade chassis (enclosures, bladecenters, etc.). Also see above. The
issue is that network failover technologies have a certain reaction
time: from a few seconds to minutes, depending on the technology used.

>> and it really takes time to
>>detect several sequentially disconnected servers.
>What sort of lag are we talking about - a few seconds, or a 
>few tens of 
>seconds ?

Up to a few minutes. For a cluster of 14 machines, when 10 of them were
powered down (through the management interface) it can take several
minutes. The default transport there is TCP, which adds its own
problems, as TCP timeouts are huge (especially in a wireless environment
using wTCP settings ;-) ).
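
The linear blow-up is easy to see: with neighbour-based detection, the
next dead node in the chain is only probed after the previous failure
has been confirmed, so the timeouts accumulate. A back-of-envelope
sketch (the timeout value is an assumption for illustration, not a
JGroups default):

```java
// Back-of-envelope sketch of why sequential failures are slow to
// detect in a ring scheme: each dead node's successor is only probed
// after the previous failure has been confirmed and the ring repaired,
// so per-node timeouts add up linearly. Timeout value is illustrative.
class DetectionLag {
    static long worstCaseMillis(int consecutiveFailures, long perNodeTimeoutMillis) {
        return consecutiveFailures * perNodeTimeoutMillis;
    }

    public static void main(String[] args) {
        // e.g. 10 of 14 nodes powered down, 15 s timeout per node:
        long lag = worstCaseMillis(10, 15_000);
        System.out.println(lag / 1000 + " s to report the last failure"); // 150 s
    }
}
```

With TCP-level timeouts in the tens of seconds instead, "several
minutes" for 10 sequential failures follows directly.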

>>To overcome the problems we ended up with the "star" architecture,
>>where the central node is responsible for maintaining the list of
>>other nodes.
>>The availability of the central node itself could be provided with
>>facilities like Red Hat Cluster Suite or similar (service failover,
>>floating IPs, etc).
>Hmmm.. - I understand why you went for this architecture, but I would
>prefer to find one that is homogeneous - i.e. we don't need a special,
>non-standard configuration for the central node. Deployment is much
>easier if every node has the same configuration. Still, this is good
>input and has got me thinking in a direction which I had not really

I have no intention of forcing you to use our solution at all. Just
some points for the cases when such a solution is not applicable.
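
For completeness, the star scheme can be sketched as a central registry
of last-heartbeat timestamps: each node is judged independently against
a fixed timeout, so a mass failure is detected within one timeout rather
than sequentially. (Illustrative code only; as noted above, the central
node itself would be made HA externally, e.g. with a floating IP.)

```java
import java.util.*;

// Sketch of the "star" membership scheme described above: one central
// node keeps a last-heartbeat timestamp per member and declares any
// member dead once it has been silent longer than the timeout,
// independently of all other failures. Names are illustrative.
class StarRegistry {
    private final Map<String, Long> lastHeartbeat = new HashMap<>();
    private final long timeoutMillis;

    StarRegistry(long timeoutMillis) { this.timeoutMillis = timeoutMillis; }

    void heartbeat(String node, long nowMillis) {
        lastHeartbeat.put(node, nowMillis);
    }

    // All nodes whose last heartbeat is older than the timeout.
    List<String> failedNodes(long nowMillis) {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, Long> e : lastHeartbeat.entrySet())
            if (nowMillis - e.getValue() > timeoutMillis) failed.add(e.getKey());
        Collections.sort(failed);
        return failed;
    }

    public static void main(String[] args) {
        StarRegistry reg = new StarRegistry(5_000);
        reg.heartbeat("node1", 0);
        reg.heartbeat("node2", 0);
        reg.heartbeat("node1", 4_000);              // node1 keeps reporting in
        System.out.println(reg.failedNodes(6_000)); // [node2]
    }
}
```

The trade-off is exactly the one discussed above: detection time no
longer depends on how many nodes died at once, but the centre is a
special node that needs its own failover machinery.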

>Thanks, Valeri,
>"Open Source is a self-assembling organism. You dangle a piece of
>string into a super-saturated solution and a whole operating-system
>crystallises out around it."
> * Jules Gosnell
> * Partner
> * Core Developers Network (Europe)
> *
> *
> *
> * Open Source Training & Support.
> **********************************/
