geronimo-dev mailing list archives

From Andy Piper <an...@bea.com>
Subject Re: Replication using totem protocol
Date Thu, 02 Feb 2006 12:41:28 GMT
At 09:25 AM 1/18/2006, Jules Gosnell wrote:
>I haven't been able to convince myself to take the quorum approach because...
>
>shared-something approach:
>- the shared something is a Single Point of Failure (SPoF) - 
>although you could use an HA something.

That's how WAS and WLS do it. Use an HA database, a SAN or dual-ported 
SCSI. Dual-ported SCSI is cheap. An HA database or SAN is probably 
already available to customers if they really care about availability.

>- If the node holding the lock 'goes crazy', but does not die, the 
>rest of the

This is generally why you use leases. Then the crazy node's claim to 
the lock is only believed for a fixed amount of time.

>cluster becomes a fragment - so it becomes an SPoF as well.
>- used in isolation, it does not take into account that the lock may 
>be held by the smallest cluster fragment

Again, you generally solve this with leases, i.e. a lock that is valid 
only for a fixed period and must be renewed to be kept.
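
To make the lease idea concrete, here is a minimal sketch. It is not how 
WAS, WLS or any particular product implements it; the class LeaseLock, 
its method names and the millisecond timings are all hypothetical. The 
current time is passed in explicitly so the behaviour is easy to follow. 
A real version would keep this record in the shared HA store and would 
have to allow for clock skew between nodes.

```java
import java.util.Objects;

/** Hypothetical lease-based lock record; all times are in milliseconds. */
final class LeaseLock {
    private String holder;        // node currently believed to hold the lock
    private long expiresAtMillis; // the holder's claim is ignored from this instant on

    /**
     * Grants the lease if it is free or expired, or renews it if nodeId
     * already holds it. Returns false if another node holds a live lease.
     */
    synchronized boolean tryAcquire(String nodeId, long nowMillis, long leaseMillis) {
        Objects.requireNonNull(nodeId);
        boolean free = holder == null || nowMillis >= expiresAtMillis;
        if (free || holder.equals(nodeId)) {
            holder = nodeId;
            expiresAtMillis = nowMillis + leaseMillis;
            return true;
        }
        return false;
    }

    /** True only while nodeId's lease has not yet expired. */
    synchronized boolean isHeldBy(String nodeId, long nowMillis) {
        return nodeId.equals(holder) && nowMillis < expiresAtMillis;
    }
}
```

A node that goes crazy and stops renewing loses the lock once 
leaseMillis has elapsed, and another node can then take over without 
any explicit release from the old holder.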

>shared-nothing approach:

Nice in theory but tricky to implement well. Consensus works well here.

>- I prefer this approach, but, as you have stated, if the two halves 
>are equally sized...
>- What if there are two concurrent fractures (does this happen?)
>- ActiveCluster notifies you of one membership change at a time - so 
>you would have to decide on an algorithm for 'chunking' node loss, 
>so that you could decide when a fragmentation had occurred...

If you really want to do this reliably you have to assume that AC 
(ActiveCluster) will send you bogus notifications. Ideally you want to 
achieve a consensus on membership to avoid this. It sounds like Totem 
solves some of these issues.
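
As a rough illustration of not trusting a single notification, the 
sketch below only accepts a membership view once a strict majority of 
the configured cluster reports the same view. This is a simple vote 
tally, not Totem and not a full consensus protocol, and it does not use 
the ActiveCluster API. The class MembershipQuorum and its methods are 
hypothetical, and it assumes the total cluster size is known up front.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical majority check over membership views reported by nodes. */
final class MembershipQuorum {
    private final int clusterSize; // configured number of nodes (assumed known)

    MembershipQuorum(int clusterSize) {
        if (clusterSize <= 0) throw new IllegalArgumentException("clusterSize must be positive");
        this.clusterSize = clusterSize;
    }

    /** Strict majority: with two equally sized halves, neither half has quorum. */
    boolean hasQuorum(int nodeCount) {
        return nodeCount > clusterSize / 2;
    }

    /** Returns the view that a strict majority of the cluster agrees on, if any. */
    Optional<Set<String>> agreedView(Collection<Set<String>> reportedViews) {
        Map<Set<String>, Integer> votes = new HashMap<>();
        for (Set<String> view : reportedViews) {
            votes.merge(view, 1, Integer::sum); // one vote per reporting node
        }
        for (Map.Entry<Set<String>, Integer> entry : votes.entrySet()) {
            if (hasQuorum(entry.getValue())) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty(); // no majority, so keep the previous view
    }
}
```

Because at most one view can have a strict majority, a single bogus 
notification from one node cannot change the agreed membership on its 
own, and an even split leaves every fragment without quorum.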

andy 

