geronimo-dev mailing list archives

From Jules Gosnell <ju...@coredevelopers.net>
Subject Re: Replication using totem protocol
Date Tue, 17 Jan 2006 00:11:23 GMT
lichtner wrote:

>On Mon, 16 Jan 2006, Jules Gosnell wrote:
>
>>>2. When an HTTP request arrives, if the node which received it does not
>>>have R copies of the session then it blocks (it waits until there are).
>>>This should work in data centers, because partitions there are likely
>>>to be very short-lived (aka virtual partitions, which are due to
>>>congestion, not to any hardware issue).
>>>
>>Interesting. I was intending to actively repopulate the cluster
>>fragment as soon as the split was detected. I figure that:
>>- the longer sessions spend without their full complement of backups,
>>the more likely it is that a further failure will result in data loss.
>>- the split is an exceptional circumstance, at which you would expect
>>to pay an exceptional cost (regenerating missing primaries from backups
>>and vice versa).
>>
>>By waiting for a request to arrive for a session before ensuring it has
>>its correct complement of backups, you extend the time during which it
>>is 'at risk'. By doing this 'lazily', you also have to perform an
>>additional check on every request arrival, which you would not have to
>>do if you had regenerated the missing state at the point you noticed
>>the split.
>>
>
>Actually, I didn't mean to say that you should do it lazily. You should
>most definitely do it aggressively, but I would not try to do _all_ the
>state transfer ASAP, because this can kill availability.
>
Ah - OK, my misunderstanding - so you do it aggressively, but there is
still the possibility of a request arriving before you have finished
regenerating, and you handle that by holding the request up - got you. I
agree.
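To make the 'hold the request up' idea concrete, here is a minimal sketch
of the kind of gate I have in mind - all the names here are illustrative,
not WADI's actual API. The regeneration task restores copies aggressively
in the background; a request arriving in the meantime simply blocks until
the session is back at its minimum of R copies, or a timeout expires:

// Hypothetical sketch only: block an arriving request until the session's
// replica count has been restored to the minimum R discussed above.
public class ReplicaGate {

    private final Object lock = new Object();
    private int replicaCount;      // current number of live copies
    private final int minReplicas; // the 'R' from the discussion above

    public ReplicaGate(int initialCount, int minReplicas) {
        this.replicaCount = initialCount;
        this.minReplicas = minReplicas;
    }

    /** Called by the aggressive regeneration task as each copy is rebuilt. */
    public void replicaRestored() {
        synchronized (lock) {
            replicaCount++;
            lock.notifyAll();
        }
    }

    /** Called on request arrival; blocks until R copies exist, or times out. */
    public boolean awaitQuorum(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        synchronized (lock) {
            while (replicaCount < minReplicas) {
                long remaining = deadline - System.currentTimeMillis();
                if (remaining <= 0) {
                    return false; // the partition outlived our patience
                }
                lock.wait(remaining);
            }
            return true;
        }
    }
}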

>If I had to do the state transfer using totem I would use priority
>queues, so that you know that while the system is doing state transfer
>it is still operating at, say, 80% efficiency.
>
>It was not about lazy vs. greedy.
>
>I believe that if you put some spare capacity in your cluster you will
>get good availability. For example, if your minimum R is 2 and the
>normal operating value is 4, then when a node fails you will not be
>frantically doing state transfer.
>
OK - so your system is a little more relaxed about the exact number of
replicas. You specify upper and lower bounds rather than an absolute
number, then move towards the upper bound when you have the capacity?
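If I have understood it, the policy might look something like this sketch
(the names and thresholds are illustrative assumptions, not anyone's
actual implementation): only the transfer needed to climb back above the
lower bound is urgent; topping up towards the upper bound runs at low
priority, so requests keep most of the cluster's capacity.

// Hypothetical sketch of the min/target replica policy described above.
public class ReplicationPolicy {

    public enum Urgency { URGENT, BACKGROUND, NONE }

    private final int minReplicas;    // e.g. 2 - below this, transfer is urgent
    private final int targetReplicas; // e.g. 4 - the normal operating value

    public ReplicationPolicy(int minReplicas, int targetReplicas) {
        this.minReplicas = minReplicas;
        this.targetReplicas = targetReplicas;
    }

    /** Decide how aggressively to re-replicate after a membership change. */
    public Urgency assess(int currentReplicas) {
        if (currentReplicas < minReplicas) {
            return Urgency.URGENT;     // schedule on the high-priority queue
        }
        if (currentReplicas < targetReplicas) {
            return Urgency.BACKGROUND; // top up as spare capacity allows
        }
        return Urgency.NONE;
    }
}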

>
>>>3. If at any time an HTTP request reaches a server which does not
>>>itself have a replica of the session, it sends a client redirect to a
>>>node which does.
>>>
>>WADI can relocate the request to the session, as you suggest (via
>>redirect or proxy), or the session to the request, by migration.
>>Relocating the request should scale better, since requests are
>>generally smaller and, in the web tier, may run concurrently through
>>the same session, whereas sessions are generally larger and may only be
>>migrated serially (since only one copy at a time may be 'active').
>>
>
>I would also just send a redirect. I don't think it's worth relocating
>a session.
>
If you can communicate the session's location to the load-balancer, then 
I agree, but some load-balancers are pretty dumb :-)
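For the simple case where the balancer can't be taught, the redirect
itself is straightforward. A hypothetical servlet-filter sketch - the
SessionRegistry lookup is an assumed, illustrative interface, not WADI's
or anyone else's real API:

import java.io.IOException;
import javax.servlet.*;
import javax.servlet.http.*;

// Assumed cluster-wide session lookup; illustrative only.
interface SessionRegistry {
    boolean isLocal(String sessionId);
    String nodeUrlFor(String sessionId); // e.g. "http://node2:8080"
}

public class SessionLocalityFilter implements Filter {

    private SessionRegistry registry;

    public void doFilter(ServletRequest req, ServletResponse res,
                         FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest request = (HttpServletRequest) req;
        HttpServletResponse response = (HttpServletResponse) res;

        String sessionId = request.getRequestedSessionId();
        if (sessionId != null && !registry.isLocal(sessionId)) {
            // Send the client to a node which holds a replica of its session.
            String target = registry.nodeUrlFor(sessionId)
                          + request.getRequestURI();
            response.sendRedirect(response.encodeRedirectURL(target));
            return;
        }
        chain.doFilter(req, res); // session is local (or new); handle it here
    }

    public void init(FilterConfig config) { /* registry wiring omitted */ }
    public void destroy() { }
}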

>
>>>and possibly migration of some sessions for proper load balancing.
>>>
>>Forcing the balancing of state around the cluster is something that I
>>have considered with WADI, but not yet tried to implement. The type of
>>load-balancer in use has a big impact here. If you cannot communicate a
>>change of session location satisfactorily to the HTTP load-balancer,
>>then you just have to go with wherever it decides a session is
>>located... With SFSBs we should have much more control on the client
>>side, so there this becomes a real option.
>>
>
>In my opinion, load balancing is not something that a cluster API can
>address effectively. Half the problem is evaluating how busy the system
>is in the first place.
>
Agreed.

>>All in all, though, it sounds like we see pretty much eye to eye :-)
>>
>
>Better than the other way...
>
>>The lazy partition regeneration is an interesting idea, and this is the
>>second time it has been suggested to me, so I will give it some serious
>>thought.
>>
>
>Again, I wasn't advocating lazy state transfer. But perhaps it has
>applications somewhere.
>
Understood - and I think a hybrid approach would probably just incur the
costs of both of the other approaches - but I may still kick it around.


Jules

>>Thanks for taking the time to share your thoughts,
>>
>
>No problem.
>


-- 
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *    www.coredevelopers.net
 *
 * Open Source Training & Support.
 **********************************/

