geronimo-dev mailing list archives

From Jules Gosnell <ju...@coredevelopers.net>
Subject Re: [wadi-dev] Session/clustering API and the web tier
Date Thu, 13 Jul 2006 10:38:18 GMT
James Strachan wrote:
> On 7/12/06, Jules Gosnell <jules@coredevelopers.net> wrote:
> 
>> Greg Wilkins wrote:
>> > All,
>> >
>> >
>> > Here are my comments on the Session API that were promised after
>> > apachecon dublin. This is also CC'd to the wadi list and some of the
>> > points are relevant to them as well.
>> >
>> > My own reason for focusing on the Session API when I think about
>> > clustering, is that I like the idea of pluggable clustering
>> > implementations.  Clustering is not one size fits all and solutions
>> > can go from non-replicated nodes configured in a flat file to auto
>> > discovered, self healing, redundant hierarchies.
> 
> 
> Agreed. We should be focussed purely on what is the contract between a
> container and the session API and making that contract as simple and
> abstract as is possible while minimising leaky abstractions.
> 
> 
>> > I think the previous discussions we had on this were making good
>> > progress, but I think we ran out of steam before the API was improved
>> > etc.  So I think it worthwhile to re-read the original threads.
>> >
>> > But I will repeat my main unresolved concerns here:
>> > While I appreciate the keep-it-simple-stupid approach adopted by the
>> > proposed session API, I remain concerned that it may over simplify
>> > and may also mix concerns.
>> >
>> > However, I do think that the API is pitched at about the right level -
>> > namely that it is below the specific concerns of things such as HTTP.
>> > As the implementor of the web container, I would prefer to not
>> > delegate HttpSession management or request proxying to a pluggable
>> > session implementation (I doubt that a cluster impl wants to deal
>> > with non-blocking proxying of requests etc.)
>>
>> I think that our discussions about this have suffered from an ambiguity
>> around the word 'delegate'...
>>
>> In one sense of the word, given WADI's current implementation, Jetty
>> does delegate Session management and HTTP handling to WADI, in that WADI
>> passes the WebApp/Jetty an object on which it calls a method and the
>> work in question is done.
>>
>> However, in another sense, Jetty need not delegate this task, since the
>> object returned in these cases is managed by WADI, but created by a
>> Factory that is injected at startup time. This factory might be
>> generating instances of a class that has very Jetty-specific knowledge
>> or is even a part of the Jetty distro...
> 
> 
> That's certainly one approach. Another is for the container to just ask
> the policy API what to do (i.e. is the request going to be serviced
> locally or not) so that the container can take care of the rest.

This leaks clustering concerns into the container's space.

> 
> I understand the cleanliness from the session API implementor's
> perspective of using a factory and calling back the container when you
> see fit - however I also understand the container developer's
> requirement to understand at all times what each thread is doing, to
> tune things aggressively with full knowledge of threading models and
> to generally be master of its own domain, so I can understand why a
> container developer might prefer a non-callback related solution
> (which could introduce all kinds of nasty thread related bugs into the
> container).

Any clustering solution will use threads underneath its API. If this is 
a concern you should simply make explicit where they may be used.

> I don't see why both options can't be offered.
> 
> 
>> I would wholeheartedly agree that the code for Http request relocation
>> should be written by someone with expertise in that area - namely the
>> container writer. I would just rather see it injected into the clustered
>> manager, so that it can be called when required, without having to
>> burden Jetty with the added task of making this decision itself.
> 
> 
> I don't see that as mutually exclusive. Just have a way for Jetty to
> ask the clustering solution if a request can be satisfied locally, if
> not Jetty does the proxy/redirect thing.
> 
> 
>> > I see that the webcontainer needs to interact with the cluster
>> > implementation in 4 areas:
>> >
>> >
>> > 1) Policy
>> > ---------
>> >
>> > When a container receives a request, it needs to make a policy
>> > decision along the lines of:
>> >
>> >     1) The request can be handled locally.
>> >     2) The request can be handled locally, but only after some other
>> >        actions (eg session is moved to local)
>> >     3) request cannot be handled locally, but can be redirected to
>> >        another node
>> >     4) request cannot be handled locally, but can be proxied to
>> >        another node.
>> >
>> > This kind of corresponds to the Locator and SessionLocation APIs.
>> > However these APIs give the power to enact a policy decision, but
>> > give no support to make a policy decision.
>> >
>> > To implement a policy, you might want to use:  the size of the
>> > cluster, the total number of sessions, the number of sessions on the
>> > local node, the number of sessions collocated with a remote session,
>> > how many requests for the session have recently arrived on what
>> > nodes, etc. etc.
>> >
>> > The API does not give me this information and I think it would be
>> > difficult to provide all that might be used.  Potentially
>> > we could get by with a mechanism to store/access cluster wide meta-data
>> > attributes?
>> >
>> > However, it is very unlikely that one policy will fit all, so each
>> > consumer of this Location API will have to implement a pluggable
>> > policy framework of some sorts.
>> >
>> > But as the session API is already a pluggable framework, why don't we
>> > just delegate the policy decision to the API.  The web container
>> > should make the policy decision, but should call the session API to
>> > make the decision.  Something like:
>> >
>> >   SessionLocation executeAt = locator.getSessionExecutionLocation(clientId);
>> >   if (executeAt.isLocal())
>> >     // handle request
>> >   else
>> >     // proxy or redirect to executeAt location.
>> >
>> > (Note the need for something like this has been discussed before and
>> > generally agreed.  I have seen the proposed RemoteSessionStrategy,
>> > but I am not sure how you obtain a handle to one - nor do I think the
>> > policy should decide between redirect and proxy - which is HTTP
>> > business).
> 
> 
> Agreed. Just some way to ask the Session API if a request can be
> processed locally might do the trick, then if not Jetty can do its
> proxy/redirect thing. The trickier thing is what to pass into the
> strategy to help it decide...
> 
> 

By having Jetty make the decision:

- you leak clustering concerns into the web tier
- you have to duplicate similar code in every clustered tier
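
To make the injection alternative concrete, here is a minimal Java sketch.
All names (RequestRelocator, ClusteredSessionManager, handle) are
hypothetical and are not WADI or Jetty types. The only point it tries to
show is that the container author writes the relocation code, while the
clustered manager decides when to call it.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical names throughout; none of these are WADI or Jetty types.

/** Written by the container author: knows how to proxy or redirect HTTP. */
interface RequestRelocator {
    String relocate(String sessionId, String targetNode);
}

/** Clustered session manager: owns the decision about where a request runs. */
class ClusteredSessionManager {
    private final String localNode;
    private final RequestRelocator relocator; // injected at startup
    private final Map<String, String> sessionOwner = new ConcurrentHashMap<>();

    ClusteredSessionManager(String localNode, RequestRelocator relocator) {
        this.localNode = localNode;
        this.relocator = relocator;
    }

    /** Records which node currently owns a session. */
    void register(String sessionId, String node) {
        sessionOwner.put(sessionId, node);
    }

    /** The container calls this per request and never sees the policy. */
    String handle(String sessionId) {
        // Unknown sessions are assumed to be new and therefore local.
        String owner = sessionOwner.getOrDefault(sessionId, localNode);
        if (owner.equals(localNode)) {
            return "served locally on " + localNode;
        }
        // Container-written code runs, but only when the manager asks for it.
        return relocator.relocate(sessionId, owner);
    }
}
```

A web tier and an EJB tier could each inject their own relocator into the
same manager, so the location logic would not be duplicated per tier.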

>> By exposing the 'policy' api to the container and putting it in charge
>> of when it used, you are exposing clustering details to it.
> 
> 
> Also the container details may be required by this policy. e.g.
> details about the previous http requests received at the current node,
> their type and various metadata statistics and so forth which only the
> container is aware of.
> 
> 

Sophisticated policies need details from both the container and the 
clustered session manager in order to make an informed decision.

There are two places where you can abstract:

a) an API over the necessary details in the container
b) an API over the necessary details in the clustered session manager

If you go with (a), each session manager can provide a single 
policy which will run on any container.

If you go with (b), each container will have to implement its 
own policy code that will run on any session manager.

Since containers are always likely to outnumber clustered session 
manager implementations, (a) allows the most code reuse. WADI takes 
this approach to get something going with minimum code, whilst not 
shutting the door on plugging in policies which use native APIs on 
both sides, thus allowing maximum sophistication.
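
As a rough illustration of option (a), the Java sketch below puts a small
interface over container details and lets the session manager ship one
policy that works against any container implementing it. ContainerView,
MigrateWhenHotPolicy and the threshold rule are all assumptions made up
for this example; they are not WADI's API or its actual policy.

```java
// Hypothetical names; a sketch of option (a), not WADI's actual API.

/** The abstraction each container implements over its own details. */
interface ContainerView {
    String nodeName();

    /** Number of recent requests this node has seen for the session. */
    int recentRequestsFor(String sessionId);
}

/** Written once by the clustered session manager; portable across containers. */
class MigrateWhenHotPolicy {
    private final int threshold;

    MigrateWhenHotPolicy(int threshold) {
        this.threshold = threshold;
    }

    /** Returns the node that should own the session after this request. */
    String chooseOwner(String sessionId, String currentOwner, ContainerView local) {
        if (currentOwner.equals(local.nodeName())) {
            return currentOwner; // already local, nothing to decide
        }
        // Pull the session here only if this node keeps receiving its requests.
        return local.recentRequestsFor(sessionId) >= threshold
                ? local.nodeName()
                : currentOwner;
    }
}
```

Under option (b) the roles would be reversed: the interface would sit over
the session manager and every container would carry its own policy class.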

Taking the (b) route will allow different tiers to use different logic 
to decide where to locate their session. This is a bad idea because:

1) tier owners are not clustering architects - once again we have the 
leakage of concerns.

2) this opens up the possibility of different tiers making 
contradictory decisions, with session-groups (e.g. a web and an ejb 
session) being ping-ponged back and forth within the cluster because 
their containers (e.g. web and ejb) are using different logic to decide 
the best place to keep their session.
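
The ping-pong effect in (2) can be shown with a toy Java simulation. It
assumes, purely for illustration, that the tiers take turns handling
requests for one session-group and that each tier's policy is a function
from the current owner node to its preferred owner node. PingPongDemo and
countMoves are hypothetical names.

```java
import java.util.function.UnaryOperator;

// Toy model only: tiers alternate, and a policy maps current owner -> preferred owner.
class PingPongDemo {

    /** Counts how often the session-group moves when two tiers take turns deciding. */
    static int countMoves(String start, int rounds,
                          UnaryOperator<String> webPolicy,
                          UnaryOperator<String> ejbPolicy) {
        String owner = start;
        int moves = 0;
        for (int i = 0; i < rounds; i++) {
            // Even rounds are web requests, odd rounds are ejb requests.
            UnaryOperator<String> policy = (i % 2 == 0) ? webPolicy : ejbPolicy;
            String next = policy.apply(owner);
            if (!next.equals(owner)) {
                moves++; // a migration across the cluster
                owner = next;
            }
        }
        return moves;
    }

    public static void main(String[] args) {
        UnaryOperator<String> preferA = owner -> "nodeA";
        UnaryOperator<String> preferB = owner -> "nodeB";
        // Contradictory tier policies: the group migrates on almost every request.
        System.out.println(countMoves("nodeA", 6, preferA, preferB));
        // One shared policy: the group stays put.
        System.out.println(countMoves("nodeA", 6, preferA, preferA));
    }
}
```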


Ultimately you could abstract on both sides of the coin - but I think 
that you would over-constrain the policy's input, and I don't see much 
value in a policy that would port between different clustered session 
managers, as their implementations are likely to be very different.

>> WADI's approach is to completely shield the container from having to
>> know anything about clustering, whilst maintaining contracts with the
>> container encapsulating various pieces of tier/domain-specific
>> functionality that may be injected into the clustered session manager.
> 
> 
> The issue is though, how invisible can clustering ever be? Information
> from the container and from the clustering implementation will
> typically be required for the policy decision.
> 
> 
>> > 3) Life cycle
>> >
>> > Unfortunately the life and death of a session is not simple - specially
>> > when cross context dispatch is considered.  Session ID's may or may not
>> > be reused, their uniqueness might need to be guaranteed and the
>> > decision may depend on the existence of the same session ID in other
>> > contexts.
>> >
>> > I think this can be modeled with a structured name space - so perhaps
>> > this is not an issue anymore?
>> >
>> >
>> > 4) Configuration and Management
>> > It would generally be good to be able to know how many nodes are in
>> > the cluster (or to set what the nodes are). To be able to monitor node
>> > status and give commands to gracefully or brutally shutdown a node,
>> > move sessions etc.
>> >
>> > Clustering aware clients (JNDI stubs, EJB proxies or potentially fancy
>> > Ajax web clients) might need to be passed a list of known nodes - but
>> > it is not possible to obtain/set that from the API - thus every impl
>> > will need to implement its own cluster config/discovery even if that
>> > information is available in other implementations.
>> >
>> >
>>
>> This is the clustering API (in my mind) that was mooted in the meeting.
>> A number of clustering substrates (JGroups, ActiveCluster, Tribes,
>> etc...) have homesteaded this area (WADI maintains an abstraction layer
>> that can map on to any of these three). All provide an API which
>> provides membership notification/querying, 1->1 and 1->all messaging
>> functionality. These are the basic building blocks of clustering and
>> they will be required in every clustered service that is built for
>> Geronimo. This is a natural candidate for encapsulation and sharing.
>> Failing to do this will result in each different service having to build
>> its own concepts about clustering from the ground up, which would be a
>> disaster.
> 
> 
> Agreed.
> 
> Things in the Java world have changed greatly since the introduction
> of JGroups, JCluster, ActiveCluster, Tribes et al. Nowadays there is
> no reason why we can't have a really simple POJO based model to
> represent Nodes in a cluster with listeners to be notified when nodes
> come and go. (It's really the main point of ActiveCluster - but we
> could maybe refactor that API to be just a POJO model of a cluster
> with no dependencies on external APIs or technologies and with the
> ability maybe to cast a Node to some service interface to communicate
> with the nodes).
> 
> Then using things like Spring Remoting we can add the remoting
> technology as a deployment issue (rather than having lots of different
> middleware specific APIs). e.g. see how Lingo allows you to invisibly
> add JMS remoting to any POJO. (http://lingo.codehaus.org/)
> 

This is an interesting thought. I'll let it soak in for a while.
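
A POJO model of the kind James describes might look roughly like the Java
sketch below: plain objects for cluster membership, listeners for nodes
coming and going, and no dependency on any remoting technology. Cluster,
ClusterListener, join and leave are hypothetical names, not the
ActiveCluster API, and nodes are reduced to plain strings to keep it short.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical names; not the ActiveCluster, JGroups or Tribes API.

/** Notified when nodes come and go. */
interface ClusterListener {
    void nodeAdded(String node);
    void nodeRemoved(String node);
}

/** Plain in-memory membership model. Not thread-safe; a sketch only. */
class Cluster {
    private final Set<String> nodes = new LinkedHashSet<>();
    private final List<ClusterListener> listeners = new ArrayList<>();

    void addListener(ClusterListener listener) {
        listeners.add(listener);
    }

    /** Adds a node and notifies listeners, unless it is already a member. */
    void join(String node) {
        if (nodes.add(node)) {
            for (ClusterListener l : listeners) {
                l.nodeAdded(node);
            }
        }
    }

    /** Removes a node and notifies listeners, unless it was not a member. */
    void leave(String node) {
        if (nodes.remove(node)) {
            for (ClusterListener l : listeners) {
                l.nodeRemoved(node);
            }
        }
    }

    /** Read-only snapshot of the current members, in join order. */
    Set<String> members() {
        return Collections.unmodifiableSet(new LinkedHashSet<>(nodes));
    }
}
```

How join and leave get triggered across machines is left out on purpose;
in James's suggestion that would be wired in at deployment time.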


Jules


-- 
"Open Source is a self-assembling organism. You dangle a piece of
string into a super-saturated solution and a whole operating-system
crystallises out around it."

/**********************************
  * Jules Gosnell
  * Partner
  * Core Developers Network (Europe)
  *
  *    www.coredevelopers.net
  *
  * Open Source Training & Support.
  **********************************/
