geronimo-dev mailing list archives

Site index · List index
Message view « Date » · « Thread »
Top « Date » · « Thread »
From Andy Piper <>
Subject Re: Clustering (long)
Date Thu, 21 Jul 2005 14:31:16 GMT
Hi Jules

It sounds like you've been working hard!

I think you might find you run into reliability issues with a singleton 
coordinator. This is one of those well known Hard Problems and for session 
replication its not really necessary. In essence the coordinator is a 
single point of failure and making it reliable is provably non trivial.

Cluster re-balancing is also a hairy issue in that its easy to run into 
cascading failures if you pro-actively move sessions when a server leaves 
the cluster.

I'm happy to talk more about these issues off-line if you want.



At 02:31 PM 6/30/2005, Jules Gosnell wrote:
>I thought that it was high time that I brought you up to date with my
>efforts in building a clustering layer for Geronimo.
>The project,, started as an effort to build a
>scalable clustered HttpSession implementation, but in doing this, I
>have built components that should be useful in clustering the state
>held in any tier of Geronimo e.g. OpenEJB SFSBs etc.
>WADI (Web Application Distribution Infrastructure) has two main
>elements - the vertical and the horizontal.
>Vertically, WADI comprises a stack of pluggable stores. Each store has
>a pluggable Evicter responsible for demoting aging Sessions
>downwards. Requests arriving at the container are fed into the top of
>the stack and progress downwards, until their corresponding Session is
>found and promoted to the top, where the request is correctly rendered
>in its presence.
>Typically the top-level store is in Memory. Aging Sessions are demoted
>downwards onto exclusively owned LocalDisc. The bottom-most store is a
>database shared between all nodes in the Cluster. The first node
>joining the Cluster promotes all Sessions from the database into
>exclusively-owned store - e.g. LocalDisc. The last node to leave the
>Cluster demotes all Sessions down back into the database.
>Horizontally, all nodes in a WADI Cluster are connected (p2p) via a
>Clustered Store component within this stack. This typically sits at
>the boundary between exclusive and shared Stores. As requests fall
>through the stack, looking for their corresponding Session they arrive
>at the Clustered store, where, if the Session is present anywhere in
>the Cluster, its location may be learnt. At this point, the Session
>may be migrated in, underneath the incoming request, or, if its
>current location is considered advantageous, the request may be
>proxied or redirected to its remote location. As a node leaves the
>Cluster, all its Sessions are evacuated to other nodes via this store,
>so that they may continue to be actively maintained.
>The space in which Session ids are allocated is divided into a fixed
>number of Buckets. This number should be large enough such that
>management of the Buckets may be divided between all nodes in the
>Cluster roughly evenly. As nodes leave and join the Cluster, a single
>node, the Coordinator, is responsible for re-Bucketing the Cluster -
>i.e. reorganising who manages which Buckets and ensuring the safe
>transfer of the minimum number of Buckets to implement the new
>layout. The Coordinator is elected via a Pluggable policy. If the
>Coordinator leaves or fails, a new one is elected. If a node leaves or
>joins, buckets emigrate from it or immigrate into it, under the
>control of the Coordinator, to/from the rest of the Cluster.
>A Session may be efficiently mapped to a Bucket by simply %-ing its
>ID's hashcode() by the number of Buckets in the Cluster.
>A Bucket is a map of SessionID:Location, kept up to date with the
>Location of every Session in the Cluster, of which the id falls into
>its space. i.e. as Sessions are created, destroyed or moved around the
>Cluster notifications are sent to the node managing the relevant
>Bucket, informing it of the change.
>In this way, if a node receives a request for a Session which it does
>not own locally, it may pass a message to it, in a maximum of
>typically two hops, by sending the message to the Bucket owner, who
>then does a local lookup of the Sessions actual location and forwards
>the message to it. If Session and Bucket can be colocated, this can
>reduced to a single hop.
>Thus, WADI provides a fixed and scalable substrate over the more fluid
>arrangement that Cluster membership comprises, on top of which further
>Clustered services may be built.
>The above functionality exists in WADI CVS and I am currently working
>on hardening it to the point that I would consider it production
>strength. I will then consider the addition of some form of state
>replication, so that, even with the catastrophic failure of a member
>node, no state is lost from the Cluster.
>I plan to begin integrating WADI with Geronimo as soon as a certified
>1.0-based release starts to settle down. Certification is the most
>immediate goal and clustering is not really part of the spec, so I
>think it best to satisfy the first before starting on subsequent
>Further down the road we need to consider the unification of session
>id spaces used by e.g. the web and ejb tiers and introduction of an
>ApplicationSession abstraction - an object which encapsulates all
>e.g. web and ejb sessions associated with a particular client talking
>to a particular application. This will allow WADI to maintain the
>colocation of associated state, whilst moving and replicating it
>around the Cluster.
>If anyone would like to know more about WADI, please feel free to ask
>me questions here on geronimo-dev or on wadi-dev.
>Thanks for listening,

View raw message