geronimo-dev mailing list archives

From James Strachan <>
Subject Re: Web State Replication... (long)
Date Thu, 30 Oct 2003 17:59:40 GMT
On Thursday, October 30, 2003, at 05:43  pm, Jules Gosnell wrote:
> James Strachan wrote:
>> On Thursday, October 30, 2003, at 12:19  pm, gianny DAMOUR wrote:
>>> Hello,
>>> Just a couple of questions regarding this design:
>>> - Is it possible to configure the weight of a node? If yes, is the 
>>> same auto-partitioning policy applicable? My concern is that a 
>>> "clockwise" policy may add a significant load on nodes hosted by low 
>>> spec hosts.
>> This is partly a problem for the sticky load balancer to deal with 
>> i.e. it should load requests to primary machines based on spec/power.
>> If we partitioned the session data into buckets (rather than one big 
>> lump), then the buckets of session data can be distributed evenly 
>> around the cluster so that each session bucket has N buddies 
>> (replicas) but that a load-balancing algorithm could be used to 
>> distribute the buckets based on (say) a host spec weighting or 
>> whatnot. e.g. nodes in the cluster could limit how many buckets to 
>> accept due to their lack of resources etc.
>> Imagine having 1 massive box and 2 small ones in a cluster - you'd 
>> probably want to give the big box more buckets than the smaller ones. 
>> The previous model Jules described still holds (that was a view of 1 
>> session bucket) - it's just that the total session state for a machine 
>> might be spread over many buckets.
>> Having multiple buckets could also help spread the load of recovering 
>> from a node failure in larger clusters.
> James, I have given this quite a bit of thought... and whilst it was 
> initially appealing and seemed a sensible extension of my train of 
> thought, I have not been able to find any advantage in splitting one 
> node's state into multiple buckets....

What if 1 box has 1M sessions and another box has 1K sessions?

> If a node joins or leaves, you still have exactly the same amount of 
> state to shift around the cluster.

Sure. Though you can spread the load onto more boxes if you go the 
bucket route. i.e. rather than 1 node sending 1M sessions, 10 nodes 
could send 100K each & spread the load. It's no biggie or anything, just 
an optimisation some folks might want.

> If you back up your sessions off-node, then whether these are all on 
> one backup node, or spread over 10 makes no difference, since in the 
> first case if you lose the backup node you have to shift 100% x 1 
> node's state. In the second case you have to shift 10% x 10 nodes' state 
> (since the backup node will be carrying 10% of the state of another 9 
> nodes as well as your own). Initially it looks more resilient but...

Sure - the same amount of state needs to move. The difference is how 
you hit the other *working* nodes when a failure occurs. Sometimes in 
large clusters, getting every member to do a little bit is better than 
a small number of nodes having to migrate mucho sessions. For small 
clusters of 2-4 nodes there's little point in buckets (unless the boxes 
are of very different sizes).
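
To make that concrete, here is a rough sketch of weighted bucket placement and of what the surviving nodes pick up after a failure. Everything in it is an assumption for illustration: the class and method names are hypothetical, the greedy "lowest buckets-per-weight" rule is just one possible policy, and none of it is existing Geronimo code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: names and policy are illustrative, not a Geronimo API.
final class BucketAssignment {

    /** Gives each bucket id (0..bucketCount-1) to the node with the lowest buckets/weight ratio. */
    static Map<String, List<Integer>> assign(Map<String, Integer> weights, int bucketCount) {
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String node : weights.keySet()) {
            owned.put(node, new ArrayList<>());
        }
        for (int bucket = 0; bucket < bucketCount; bucket++) {
            owned.get(leastLoaded(weights, owned)).add(bucket);
        }
        return owned;
    }

    /** Re-homes the failed node's buckets on the survivors; returns how many each survivor received. */
    static Map<String, Integer> failover(Map<String, Integer> weights,
                                         Map<String, List<Integer>> owned,
                                         String failed) {
        Map<String, Integer> survivors = new LinkedHashMap<>(weights);
        survivors.remove(failed);
        List<Integer> orphans = owned.remove(failed);
        Map<String, Integer> received = new LinkedHashMap<>();
        for (String node : survivors.keySet()) {
            received.put(node, 0);
        }
        for (int bucket : orphans) {
            String target = leastLoaded(survivors, owned);
            owned.get(target).add(bucket);
            received.merge(target, 1, Integer::sum);
        }
        return received;
    }

    /** Picks the node holding the fewest buckets relative to its weight (first one wins ties). */
    private static String leastLoaded(Map<String, Integer> weights,
                                      Map<String, List<Integer>> owned) {
        String best = null;
        double bestRatio = Double.MAX_VALUE;
        for (Map.Entry<String, Integer> e : weights.entrySet()) {
            double ratio = (double) owned.get(e.getKey()).size() / e.getValue();
            if (ratio < bestRatio) {
                bestRatio = ratio;
                best = e.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        Map<String, Integer> weights = new LinkedHashMap<>();
        weights.put("big", 4);      // weights are made-up host-spec ratings
        weights.put("small1", 1);
        weights.put("small2", 1);
        Map<String, List<Integer>> owned = assign(weights, 12);
        System.out.println("owned: " + owned);
        System.out.println("received after small1 fails: " + failover(weights, owned, "small1"));
    }
}
```

With weights of 4:1:1 and 12 buckets, the big box ends up with 8 buckets and each small box with 2. In a cluster of 10 equal nodes holding 100 buckets, losing one node leaves each of the 9 survivors picking up only 1 or 2 buckets, which is the "everyone does a little bit" effect.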

> So I am sticking, by virtue of Occam's razor, to the simpler approach 
> for the moment, until someone can draw attention to a situation where 
> the extra complexity of a higher granularity replication strategy is 
> worth the gain.

Sure - KISS. Let's get it working first :)

> Thinking about it, my current design is probably hybrid - since whilst 
> a node's state is all held in a single bucket, individual sessions may 
> be migrated out of that bucket and into another one on another node. 
> So it is replication granularity that is set to node-level, but 
> migration granularity is at session level. I guess you are suggesting 
> that a bucket is somewhere between the two of these and is the level 
> at which both are replicated and migrated ? I'll give it some more 
> thought :-)

Sure - that's all I was saying really. When 1M session objects have to 
move, it'd be better to move 100 buckets in one go than move 1M session 
objects one by one. (Though from the load balancer you might have to 
move every single session - from the session cluster in Geronimo-land 
you can move buckets of sessions in one go).
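
A minimal sketch of what I mean by a bucket being the unit of migration, assuming sessions are hashed to a fixed number of buckets by session id. The class name, the String-valued session state and the hashing scheme are all hypothetical, picked only to keep the example small.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch only: a per-node session store split into a fixed number of buckets.
final class SessionBuckets {
    private final List<Map<String, String>> buckets = new ArrayList<>();

    SessionBuckets(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new HashMap<>());
        }
    }

    /** Maps a session id to a bucket index; every node must use the same rule and bucket count. */
    int bucketFor(String sessionId) {
        return Math.floorMod(sessionId.hashCode(), buckets.size());
    }

    void put(String sessionId, String state) {
        buckets.get(bucketFor(sessionId)).put(sessionId, state);
    }

    String get(String sessionId) {
        return buckets.get(bucketFor(sessionId)).get(sessionId);
    }

    /** Removes a whole bucket and returns it so it can be shipped as a single unit. */
    Map<String, String> detach(int index) {
        Map<String, String> moved = buckets.get(index);
        buckets.set(index, new HashMap<>());
        return moved;
    }

    /** Takes ownership of a bucket shipped from another node. */
    void attach(int index, Map<String, String> bucket) {
        buckets.get(index).putAll(bucket);
    }

    int sessionCount() {
        int total = 0;
        for (Map<String, String> bucket : buckets) {
            total += bucket.size();
        }
        return total;
    }

    public static void main(String[] args) {
        SessionBuckets from = new SessionBuckets(100);
        SessionBuckets to = new SessionBuckets(100);
        for (int i = 0; i < 10_000; i++) {
            from.put("session-" + i, "state-" + i);
        }
        int transfers = 0;
        for (int b = 0; b < 100; b++) {
            to.attach(b, from.detach(b));   // one transfer per bucket, not per session
            transfers++;
        }
        System.out.println(transfers + " transfers moved " + to.sessionCount() + " sessions");
    }
}
```

The number of transfers is fixed by the bucket count rather than the session count, so the same 100 moves would cover 1M sessions.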

