geronimo-dev mailing list archives

From Jules Gosnell <ju...@coredevelopers.net>
Subject Re: Clustering
Date Sat, 15 Oct 2005 17:32:38 GMT
Dave Colasurdo wrote:

>
>
> Jules Gosnell wrote:
>
>> Jeff Genender wrote:
>>
>>> Now that we have achieved the coveted J2EE Certification, we need 
>>> to start thinking about some of the things we will need to have in 
>>> Geronimo in order to be mass adopted by the Enterprise.
>>>
>>> IMHO, I think one of the huge holes is clustering.  This is a heavy 
>>> need by many companies and I believe that until we get a powerful 
>>> clustering solution into G, it will not be taken as a serious J2EE 
>>> contender.
>>>
>>> So, with that said, I wanted to start a discussion thread on 
>>> clustering and what we need to do to get this into Geronimo.  I 
>>> personally would like to be involved in this (thus the reason for me 
>>> starting this thread) - yeah, since Tomcat is done, now I am bored ;-).
>>>
>>> I was going over the lists and emails and had some great discussion 
>>> with Jules on the WADI project he has built.  This seems compelling 
>>> to me.  I also noticed Active Cluster as a possibility.
>>>
>>> So lets start from the top.  Do we use an already available 
>>> clustering engine or do we roll our own?  Here is a small list of 
>>> choices I have reviewed and it is by no means complete...
>>>
>>> 1) WADI
>>> 2) Active Cluster
>>> 3) Leverage the Tomcat Clustering engine
>>>
>>> So here are some of my questions...
>>>
>>> How complete is WADI and Active Cluster?  Both look interesting to 
>>> me. My only concern with Active Cluster is it seems to be JMS based, 
>>> which I think may be slow for high performance clustering (am I 
>>> incorrect on this?).  How mature is WADI?
>>
>>
>>
>> Here is a status report on WADI.
>>
>> I'm developing it full time.
>>
>> A snapshot is available at wadi.codehaus.org - documentation is in 
>> the wiki - at the moment the documentation (rather minimalist) is 
>> more up to date than the snapshot, but I will try to get a fresh one 
>> out next week.
>>
>> WADI is a plugin HttpSession Manager replacement for Tomcat-5.0/5.5 
>> and Jetty-5.1/6.0 (it can actually migrate sessions between all four 
>> in the same cluster).
>> It comprises a vertical stack of pluggable caches/stores (memory, 
>> local disc, db etc) through which sessions are demoted as they age 
>> and promoted as and when required to service a request.
>
>
> Can you please clarify the purpose of promotion/demotion of 
> httpsessions? Is this a mechanism to age old entries out of the cache?

I envisage a typically configured stack to look like this : 
memory<->localDisc<->cluster<->db.

The db is only used to load sessions if you are the first cluster member 
to start, or to store them if you are the last cluster member to stop.
The cluster store gives you access to the sessions held on every other 
node in the cluster (more about this later).
The localDisc is where sessions are paged out to by a pluggable eviction 
strategy running in the memory store (currently based on inactivity, but 
could take into account number of sessions in mem).
Memory is where sessions and requests are combined in the rendering of 
pages.

A request enters the top of the stack and travels downwards towards the 
cluster store, until its session is found (or not), at which point the 
session is promoted into memory and the request rendered. The session 
will stay in memory until evicted downwards, explicitly invalidated, or 
(if the eviction strategy is e.g. NeverEvict) implicitly invalidated due 
to timeout.
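
To give a feel for the shape of this, here is a minimal sketch in Java - 
the names (Store, TieredSessionManager etc.) are invented for this mail 
and are not WADI's real API:

    import java.util.List;

    // Illustrative only - invented names, not WADI's actual classes.
    class Session {
        final String id;
        Session(String id) { this.id = id; }
    }

    interface Store {
        Session load(String id);     // null if this tier does not hold it
        void store(Session session); // accept a session demoted from above
    }

    class TieredSessionManager {
        // e.g. [memory, localDisc, cluster, db] - top to bottom
        private final List<Store> stack;

        TieredSessionManager(List<Store> stack) { this.stack = stack; }

        // A request enters the top of the stack and travels downwards
        // until its session is found; the session is then promoted back
        // into the top (memory) tier so the request can be rendered.
        Session promote(String id) {
            for (int tier = 0; tier < stack.size(); tier++) {
                Session s = stack.get(tier).load(id);
                if (s != null) {
                    if (tier > 0) stack.get(0).store(s); // promote to memory
                    return s;
                }
            }
            return null; // no such session anywhere in the stack
        }
    }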

> How does this relate to httpsession inactivity timeouts?

Orthogonally. I thought about pushing timed-out sessions into another 
store so that they could be data-mined, but figured that once a session 
had been destroyed (all sorts of listeners might have fired) it would be 
asking for trouble to try to serialise it. If the application wanted to 
keep an archived copy, it could do it via one of these listeners. The 
evicters are just there to spool stuff out onto e.g. disc, so that you 
can hold larger sessions for more clients.

> Is the cache size configurable?
>
It would be up to the evicter to use the number of local entries in its 
eviction algorithm. I don't have an evicter that does this currently, 
but I don't think it would be hard to write one.
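
An evicter that did this might look something like the following (again, 
the names are invented here purely to show the shape, not WADI's API):

    // Sketch of an evicter that considers both inactivity and the number
    // of sessions currently held in memory.
    class CountAndInactivityEvicter {
        private final int maxLocalSessions;
        private final long maxInactiveMillis;

        CountAndInactivityEvicter(int maxLocalSessions, long maxInactiveMillis) {
            this.maxLocalSessions = maxLocalSessions;
            this.maxInactiveMillis = maxInactiveMillis;
        }

        // Demote a session if it has been idle too long, or if the memory
        // tier holds more sessions than it has been configured to keep.
        boolean shouldEvict(long lastAccessedTime, int currentLocalSessions, long now) {
            boolean tooIdle = (now - lastAccessedTime) > maxInactiveMillis;
            boolean tooMany = currentLocalSessions > maxLocalSessions;
            return tooIdle || tooMany;
        }
    }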

>> This stack may be connected horizontally to a cluster by inserting a 
>> clustered store, which uses a distributed hash table (currently 
>> un-replicated, but I am working on it) to share state around the 
>> cluster's members in a scalable manner. WADI has a working mod_jk 
>> integration.
>
>
> Does this mean that each cluster member shares its httpsession data 
> with all of the other members (1-> all) or is there the notion of 
> limiting the httpsession replication to one (or a few) designated 
> partners?

This is the most interesting and challenging part of WADI. I learnt from 
my early experiences with httpsession distribution that 1->all 
replication is simply a no-go. The point of having a large number of 
nodes in a cluster is that state is only lost when every node holding it 
fails, so your unavailability shrinks towards 
average-node-unavailability^number-of-nodes; it is not that your 
architecture should force you to partition the cluster into n/2 
"micro-clusters", leaving you stuck at average-node-unavailability^2.
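
To put rough, purely illustrative numbers on that: if each node is down 
1% of the time, state spread across a 10-node cluster is only lost when 
all 10 nodes are down at once (0.01^10 - effectively never), whereas 
pairing nodes off into micro-clusters loses it whenever both partners of 
a pair are down (0.01^2 = 0.0001, i.e. somewhere under an hour a year).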

There are two distinct issues to deal with - location and replication.

I have solved the first issue (although the code is not ready for prime 
time yet). The cluster has a fixed number of 'buckets'. Responsibility 
for these buckets is divided between the cluster members, and redivided 
on membership changes. Each bucket contains a map of 
session-id:location. A Session's id is used to map it to a bucket. 
Sessions are free to live on any node in the cluster. If a session is 
created/destroyed/migrated, its bucket owner is notified. Requests are 
expected to fall 99/100 on the node holding the relevant session, so, in 
this case, everything will happen in-vm. Occasionally, due to node 
maintenance, load-balancer confusion etc, requests will fall elsewhere. 
In this case the receiving node can ask the bucket owner for the 
session's location and either redirect/proxy the request to the session, 
or migrate the session in underneath the incoming request. Since only 
one node needs to be informed of a session's location, migrating a 
session does not need to involve notification to every cluster member of 
the new location.
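
In sketch form (invented names, and only the location part, not the 
messaging), the bucket mapping is nothing more exotic than:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of the location service only - not WADI's real classes.
    // Each node would hold entries only for the buckets it currently owns.
    class BucketTable {
        private final int numBuckets; // fixed for the cluster's lifetime
        private final Map<String, String> sessionToNode = new ConcurrentHashMap<>();

        BucketTable(int numBuckets) { this.numBuckets = numBuckets; }

        // A session id maps deterministically to a bucket; responsibility
        // for buckets is divided between members and redivided on
        // membership changes.
        int bucketFor(String sessionId) {
            return (sessionId.hashCode() & 0x7fffffff) % numBuckets;
        }

        // The bucket owner is told when a session is created, destroyed
        // or migrated...
        void recordLocation(String sessionId, String nodeName) {
            sessionToNode.put(sessionId, nodeName);
        }

        // ...so a node receiving a 'stray' request can ask where the
        // session lives, then redirect/proxy to it or migrate it in.
        String locate(String sessionId) {
            return sessionToNode.get(sessionId);
        }
    }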

I am working on replication at the moment - here is what I envisage - 
every session will be implemented via a master/primary and a number (n) 
slaves/secondaries. I expect n to usually be 1-2. Slaves will be 
notified of changes to the master either synchronously or asynchronously 
at some point (another pluggable strategy) after they occur. The master 
and bucket owner will know the location of master and slaves. Death of 
the master will result in a slave being promoted to master and another 
slave being recruited. If a request should land on its session's slave, 
rather than migrate the session from its master and then find you have 
to recruit a new slave (to avoid having master and slave colocated), 
slave and master may just arrange with the bucket owner to swap roles.

This actually just describes in-vm replication, which I hope will be 
just one of a number of pluggable replication backends. Others may 
include e.g. backup to a db.
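
Purely to show the shape of it (invented names, and glossing over all 
the hard failure cases), the in-vm backend might look like:

    import java.util.ArrayList;
    import java.util.List;

    // Sketch only - not WADI's replication code.
    class ReplicatedSession {
        private final String id;
        private String masterNode;
        private final List<String> slaveNodes; // usually 1-2 of these

        ReplicatedSession(String id, String masterNode, List<String> slaveNodes) {
            this.id = id;
            this.masterNode = masterNode;
            this.slaveNodes = new ArrayList<>(slaveNodes);
        }

        // Pluggable strategy: push changes to the slaves synchronously,
        // or asynchronously at some point after they occur on the master.
        interface ReplicationStrategy {
            void replicate(ReplicatedSession session, byte[] delta);
        }

        // On the master's death, promote the first slave and recruit a
        // replacement so the configured number of copies is maintained.
        void onMasterFailed(String recruitedNode) {
            masterNode = slaveNodes.remove(0);
            slaveNodes.add(recruitedNode);
        }
    }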

>
>
>>
>> WADI currently sits on top of ActiveCluster, which it uses for 
>> membership notification and ActiveMQ which is used for transport by 
>> both layers. ActiveMQ has pluggable protocols, including a peer:// 
>> protocol which allows peers to talk directly to one another (this 
>> should put to bed fears of a JMS based solution not scaling - 
>> remember, JMS is just an API). So you do not need to choose between 
>> WADI and ActiveCluster - they are complementary. ActiveCluster can 
>> also (I believe) use JGroups as a transport - I haven't tried it.
>>
>> ActiveSpace is another technology in this area (distributed caching) 
>> and it looks as if WADI and ActiveSpace will become more closely 
>> aligned. So this may also be considered a complementary technology.
>>
>> Both Tomcat and Jetty currently have existing clustering solutions. I 
>> looked closely at the Tomcat solutions before starting out on WADI 
>> and knew all about the Jetty solution, because I wrote it :-). WADI 
>> is my answer to what I see as shortcomings in all of the existing 
>> open source approaches to this problem-space.
>>
> Can you provide a quick high level description of the advantages of 
> WADI over Tomcat and Jetty clustering solutions?

Jetty uses 1->all replication over jgroups, as I believe one Tomcat 
session manager does. I think the other Tomcat session manager also does 
1->all replication, but over its own protocol. Perhaps Jeff can confirm 
this. I think TC's 'PersistentManager' is also able to write changed 
sessions out to disc at the end of the request.

1->all, for the reasons given above, will not scale. The more nodes you 
add, the more notifications each will have to react to and the more 
sessions it will have to hold. You are simply deferring your problems 
for a little while. Your only way out is to partition the cluster and 
sacrifice your availability. When WADI's in-vm replication strategy is 
finished, I think this will make it a clear winner for anyone wishing to 
cluster more than 2-3 nodes.

WADI is also, to my knowledge, the only open source session manager to 
really resolve concurrency and serialisation issues within the 
httpsession properly. You cannot serialise a session safely until you 
are sure that no user request is running through it. You (probably) 
cannot migrate or replicate it without serialisation. WADI uses locking 
policies to ensure that container threads performing housekeeping and 
serialisation cannot collide with application/request threads modifying 
the same object. Jeff, are you aware of anything in TC which does the 
same thing? I think they may keep some count of the number of request 
threads active on a session, but last time I looked, I could not find 
code that looked like it was checking this before attempting 
serialisation or invalidation.
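
The shape of the policy (not WADI's actual code, just an illustration) 
is essentially a read/write lock per session:

    import java.util.concurrent.locks.ReentrantReadWriteLock;
    import java.util.function.Supplier;

    // Sketch of the locking policy only - not WADI's implementation.
    class GuardedSession {
        private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

        // Application/request threads take the shared lock, so any number
        // of requests may run through the session concurrently.
        void withRequest(Runnable request) {
            lock.readLock().lock();
            try { request.run(); } finally { lock.readLock().unlock(); }
        }

        // Container housekeeping (serialisation for passivation/migration,
        // or invalidation) takes the exclusive lock, so it cannot collide
        // with request threads still modifying the session.
        byte[] withSerialisation(Supplier<byte[]> serialiser) {
            lock.writeLock().lock();
            try { return serialiser.get(); } finally { lock.writeLock().unlock(); }
        }
    }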

>
>
>> Some parts of WADI should soon (December) be undergoing some serious 
>> testing. When they pass we will be able to consider them production 
>> ready. Others, notably the distributed hash table are still under 
>> development (although a fairly functional version is available in the 
>> SNAPSHOT).
>>
>> I think that, in the same way Tomcat clustering could be enabled 
>> easily in Geronimo, WADI could also be added by virtue of its 
>> integration with Tomcat/Jetty, but I have been concentrating on my 
>> distributed hash table too hard. If anyone is interested in talking 
>> further about WADI, perhaps trying to plug it into Geronimo (It is 
>> spring-wired and uses spring to register its components with JMX. I 
>> guess it should be simple to hook it into the Geronimo kernel in the 
>> same way, I just haven't had the time), or helping out in any way at 
>> all, I would be delighted to hear from them.
>>
>> I have broached the subject of a common session clustering framework 
>> with members of the OpenEJB team and we have discussed things such as 
>> the colocation of HttpSessions and SFSBs. I believe OpenEJB has been 
>> moving towards JCache to facilitate the plugging in of a clustering 
>> substrate. My distributed hash table is also moving in the same 
>> direction.
>>
> So, if I understand correctly, you are working towards some common 
> infrastructure with openejb... though WADI itself will not address 
> clustering beyond the Web Tier?

We've had preliminary discussions. I guess, depending on how much WADI 
infrastructure was of interest to OpenEJB, that I would look at 
genericising core pieces so that they could deal with SFSBs as well as 
HttpSessions. In fact, most of the code already deals with a more 
generic abstraction which corresponds roughly to a JCache CacheEntry, so 
this should not be hard. Many of the issues faced in the SFSB clustering 
world are mirrored in the HttpSession world, except that whilst an 
intelligent client-side proxy can solve a lot of location issues for 
your SFSB, HttpSessions have to rely on slightly less intelligent 
machinery, e.g. h/w load-balancers...

There are also interesting issues arising from the integration of 
clustered web and ejb tiers, such as the need to colocate httpsessions 
and SFSBs. I have been discussing the possibility of having an 
ApplicationSession object which can house a number of web (portlet spec 
complicates this) and ejb sessions, so that if one migrates, they all 
end up on the new node together. If we don't have something like this 
in place, your application components may end up scattered all over the 
cluster.
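
In rough outline (nothing more than a sketch of the idea, with invented 
names):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch of the ApplicationSession idea - invented names.
    // All of one client's state for one application, kept together so
    // that it migrates as a unit and web and ejb pieces stay colocated.
    class ApplicationSession {
        private final Map<String, Object> webSessions = new HashMap<>();   // keyed by context path
        private final Map<String, Object> sfsbInstances = new HashMap<>(); // keyed by ejb id

        void addWebSession(String contextPath, Object httpSession) {
            webSessions.put(contextPath, httpSession);
        }

        void addStatefulSessionBean(String ejbId, Object sfsb) {
            sfsbInstances.put(ejbId, sfsb);
        }

        // Migration would move this whole aggregate, not the individual
        // pieces, so application components do not end up scattered
        // across the cluster.
    }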

>
> Thanks for the update!
>
You're welcome,


Jules

>> I hope that gives you all a little more information to go on. If you 
>> have any questions, just fire away,
>>
>>
>> Jules
>>
>>
>>>
>>> Thoughts and opinions are welcomed.
>>>
>>> Jeff
>>
>>
>>
>>
>>


-- 
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
 *
 *    www.coredevelopers.net
 *
 * Open Source Training & Support.
 **********************************/

