geronimo-dev mailing list archives

From Jules Gosnell
Subject Re: sandbox/messaging - your feedbacks are welcome
Date Sat, 24 Jul 2004 00:37:42 GMT

Your pluggable topology stuff is duplicating work that is going on in 
WADI. WADI is being built top-down from 
Tomcat/Jetty, and we already have a full, albeit not yet replicating, 
session manager for both platforms. Discussion has also been taking 
place between WADI and OpenEJB as to how best to share code for common 
session management requirements.

Is there some way that we could put our heads together and come up with 
a workable solution that satisfies both our criteria and allows us to 
work together, rather than apart ? I would welcome input on WADI's 
topology management as I am sure James would on ActiveCluster...

Briefly :

WADI sits on top of ActiveCluster, which sits on top of ActiveMQ.

ActiveCluster is responsible for raising join/leave notifications 
concerning cluster membership.

WADI uses these notifications to recalculate the division of the set of 
peers into cells (subsets of peers) of configurable size.

I currently have Ring and NChooseK schemes up and running.
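
To make the two schemes a little more concrete, here is a minimal sketch. It is not taken from WADI's source: the class and method names (CellSchemes, ring, nChooseK) are made up for illustration, and it assumes that a Ring cell is a peer plus its next k-1 successors, and that NChooseK means every k-sized subset of the peers.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical helpers, not WADI's actual classes. */
final class CellSchemes {

    private CellSchemes() {}

    /** Ring (assumed): one cell per peer, holding that peer and its next k-1 successors. */
    static List<List<String>> ring(List<String> peers, int k) {
        int n = peers.size();
        int size = Math.min(k, n); // a cell cannot be larger than the cluster
        List<List<String>> cells = new ArrayList<>();
        if (size <= 0) {
            return cells;
        }
        for (int start = 0; start < n; start++) {
            List<String> cell = new ArrayList<>();
            for (int offset = 0; offset < size; offset++) {
                cell.add(peers.get((start + offset) % n)); // wrap around the ring
            }
            cells.add(cell);
        }
        return cells;
    }

    /** NChooseK (assumed): every k-sized subset of the peers, in index order. */
    static List<List<String>> nChooseK(List<String> peers, int k) {
        List<List<String>> cells = new ArrayList<>();
        if (k <= 0 || k > peers.size()) {
            return cells;
        }
        choose(peers, k, 0, new ArrayList<>(), cells);
        return cells;
    }

    private static void choose(List<String> peers, int k, int from,
                               List<String> current, List<List<String>> out) {
        if (current.size() == k) {
            out.add(new ArrayList<>(current)); // copy, since current is reused
            return;
        }
        for (int i = from; i < peers.size(); i++) {
            current.add(peers.get(i));
            choose(peers, k, i + 1, current, out);
            current.remove(current.size() - 1);
        }
    }
}
```

On a join/leave notification, the idea would be to rerun the chosen scheme over the new membership list and compare the result with the previous set of cells.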

Each peer will carry a stack of weighting factors, which will be 
combined in each cell.

Each client requiring the storage of state will address a facade for the 
cells, which will divide the state according to weighting factors, 
amongst them. Various algorithms for the movement of state on the 
occurrence of client or cell-member death have been figured out. We are 
looking at solutions for surviving network split.
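
As a rough illustration of the facade idea, the sketch below maps a piece of state to a cell in proportion to the cells' weights. Everything in it is an assumption made for the example: the names (WeightedCellFacade, addCell, cellFor), the choice to combine a cell's weighting factors by summing them, and the use of the key's hash code to pick a slot. The real algorithms, including the movement of state on member death, are not shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical facade, not WADI's API. */
final class WeightedCellFacade {

    private final List<String> cellNames = new ArrayList<>();
    private final List<Integer> cumulativeWeights = new ArrayList<>();
    private int totalWeight = 0;

    /** Registers a cell; its weight is assumed to be the sum of its peers' weights. */
    void addCell(String name, int... peerWeights) {
        int combined = 0;
        for (int w : peerWeights) {
            if (w < 0) {
                throw new IllegalArgumentException("negative weight");
            }
            combined += w;
        }
        if (combined == 0) {
            return; // a cell with no capacity never receives state
        }
        totalWeight += combined;
        cellNames.add(name);
        cumulativeWeights.add(totalWeight);
    }

    /** Returns the cell owning the given slot, where 0 <= slot < total weight. */
    String cellForSlot(int slot) {
        if (slot < 0 || slot >= totalWeight) {
            throw new IllegalArgumentException("slot out of range");
        }
        for (int i = 0; i < cellNames.size(); i++) {
            if (slot < cumulativeWeights.get(i)) {
                return cellNames.get(i);
            }
        }
        throw new AssertionError("unreachable: slot is below the total weight");
    }

    /** Maps a state key to a cell; heavier cells own more slots and so get more keys. */
    String cellFor(String key) {
        if (totalWeight == 0) {
            throw new IllegalStateException("no cells registered");
        }
        return cellForSlot(Math.floorMod(key.hashCode(), totalWeight));
    }
}
```

The distribution is only proportional to the weights to the extent that the keys' hash codes are spread evenly.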

There is a thread on TSS at the moment that may be of interest, and 
various documentation at the WADI site and in CVS (could be better).

WADI is an external project, since it is already useful to standalone 
Jetty and Tomcat users, but now that we have agreed to work with OpenEJB 
to share common code, it will probably be pushed back into the Geronimo 
repository in order to underpin both projects.

Please take a look at WADI, ActiveCluster, ActiveMQ etc and let us know 
what you think...


Jules wrote:

> For clustering, we've been working quite heavily for some time on this 
> abstraction...
> (Note that ActiveCluster is not Geronimo specific and so can be used 
> to build clusters of anything).
> The current implementation works on top of any JMS provider, such as 
> ActiveMQ, which can work over UDP, multicast, TCP, SSL, g-network, 
> JGroups, JXTA etc.
> Jules has been working hard on distributed session state, on handling 
> fail-over gracefully, and on cluster-wide topology organisation 
> protocols, such as for arranging buddies over subnets / DR zones and 
> the like, using WADI, which is built on ActiveCluster. Jules is 
> starting to put together various algorithms for choosing buddies, 
> pairs, sub-nets, controllers and the like.
> Notice the simpler API for ActiveCluster which just reuses a few 
> interfaces from JMS.
> It seems your new messaging.cluster API is pretty similar to 
> ActiveCluster. Any ideas why you didn't just use ActiveCluster? 
> (Especially as I mentioned it to you quite a while ago :)
> Also, as I said to you a while ago, I don't see why the messaging 
> package doesn't use the JMS API for things like Msg / MsgBody / 
> MsgConsumer / MsgProducer and so forth. Not only would this mean your 
> API would become more J2EE standard, it'd mean you could reuse heaps 
> of open source and commercial implementations.
> On 20 Jul 2004, at 05:07, Gianny Damour wrote:
>> Hello,
>> I am working on a prototype, sandbox/messaging, focused on providing 
>> the infrastructure for the implementation of clustered applications. 
>> This proto has reached a stage which is, in my opinion, "good 
>> enough" to be judged.
>> I will try to describe here the main features of this infrastructure; 
>> hence, this memo will be a little bit long.
>> Its core ideas are:
>> - to provide a mechanism to cluster/inter-connect N Geronimo servers. 
>> The way these servers are inter-connected should be at the same time 
>> manageable (e.g. I want this server to be connected to this one) and 
>> to some extent automatic (e.g. when a new server is detected, it 
>> should be added automatically to the cluster); and
>> - to provide a set of base services built on top of the above 
>> infrastructure to simplify the implementation of clustered 
>> applications (e.g. creation of proxies for services running on remote 
>> Geronimo server).
>> Let's talk in more detail about the way Geronimo servers are 
>> clustered. The implementation achieves this goal by organizing servers 
>> in a known and configurable topology, e.g. star, ring, hyper-cube, 
>> where edges of the associated graphs represent connections. At the 
>> very beginning, a server and two heartbeat services, namely heartbeat 
>> sender and heartbeat monitor, are started. The heartbeat sender 
>> periodically sends a heartbeat consisting of the meta-data (IP address, 
>> port and name) of its associated server to a multicast group. The 
>> heartbeat monitor monitors these heartbeats and detects the 
>> availability or failure of servers. When a new server is available or 
>> a failure is detected, a new topology is computed and cascaded to the 
>> servers of the current topology.
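
The monitoring half of this could be sketched roughly as follows. This is not the sandbox/messaging code: the class name, the method names and the timeout rule are assumptions made for the example, the multicast I/O is left out, and the clock is passed in so that the logic can be exercised without waiting.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Illustrative only: hypothetical monitor, not the sandbox/messaging classes. */
final class HeartbeatMonitorSketch {

    private final long timeoutMillis;
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitorSketch(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Records a heartbeat; returns true if the server was not known before. */
    boolean onHeartbeat(String serverName, long nowMillis) {
        return lastSeen.put(serverName, nowMillis) == null;
    }

    /** Forgets and returns the servers whose last heartbeat is older than the timeout. */
    List<String> expire(long nowMillis) {
        List<String> failed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastSeen.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (nowMillis - entry.getValue() > timeoutMillis) {
                failed.add(entry.getKey());
                it.remove(); // safe removal while iterating
            }
        }
        return failed;
    }
}
```

In this sketch, a true result from onHeartbeat or a non-empty result from expire is the point at which a new topology would be computed.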
>> Let's consider the following scenario:
>> Geronimo servers are organized in a ring topology; four servers are 
>> started and one server is killed.
>> 1. starts the first server, namely LeaderNode. As it is the first 
>> server, it is in a stand-alone mode;
>> 2. starts the second server, namely Node1. This server is detected by 
>> LeaderNode, which triggers a reconfiguration. The topology is 
>> LeaderNode -- Node1 -- LeaderNode;
>> 3. starts the third server, namely Node2. LeaderNode inserts Node2 
>> between itself and Node1. The topology is LeaderNode -- Node1 -- 
>> Node2 -- LeaderNode;
>> 4. starts a fourth server, namely Node3. Detected by LeaderNode, it 
>> inserts Node3 between itself and Node2. The topology is LeaderNode -- 
>> Node1 -- Node2 -- Node3 -- LeaderNode; and
>> 5. stops Node2. LeaderNode drops it from the ring. The topology is 
>> LeaderNode -- Node1 -- Node3 -- LeaderNode.
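
The five steps above can be mimicked with a small model. It is only a sketch under assumptions: RingTopologySketch and its methods are hypothetical names, the leader is simply kept at position 0 of a list, a new node is placed on the edge that closes the ring back to the leader, and failure of the leader itself is not handled.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: hypothetical ring model, not the sandbox/messaging classes. */
final class RingTopologySketch {

    // Ring order starting at the leader; the last node connects back to it.
    private final List<String> ring = new ArrayList<>();

    RingTopologySketch(String leader) {
        ring.add(leader);
    }

    /** Inserts the node on the edge that closes the ring back to the leader. */
    void join(String node) {
        if (!ring.contains(node)) {
            ring.add(node);
        }
    }

    /** Drops a non-leader node; its two neighbours become directly connected. */
    void leave(String node) {
        if (ring.indexOf(node) > 0) {
            ring.remove(node);
        }
    }

    /** Renders the ring in the "A -- B -- A" notation used in the scenario. */
    String describe() {
        if (ring.size() == 1) {
            return ring.get(0) + " (stand-alone)";
        }
        return String.join(" -- ", ring) + " -- " + ring.get(0);
    }
}
```

Joining Node1, Node2 and Node3 in turn and then removing Node2 reproduces the topologies listed in steps 2 to 5.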
>> As the proto supports the ring topology, it is possible to trial this 
>> scenario:
>> cd sandbox/messaging
>> maven (ClusterHBTest may fail, so ignore the test failures if required)
>> maven -patch
>> cd ../..
>> java -jar target/bin/server.jar org/apache/geronimo/LeaderCluster
>> java -jar target/bin/server-1101.jar org/apache/geronimo/Cluster8091
>> java -jar target/bin/server-1102.jar org/apache/geronimo/Cluster8092
>> java -jar target/bin/server-1103.jar org/apache/geronimo/Cluster8093
>> kill <the process java -jar target/bin/server-1102.jar 
>> org/apache/geronimo/Cluster8092>
>> As a conclusion, this prototype tries to federate Geronimo servers in 
>> specific topologies. As an aside, it is rather simple to support 
>> other kinds of topologies without significant effort. For instance, 
>> one of the JUnit tests (NodeImplTest) uses a bus topology.
>> Based on the knowledge of the enforced topology, it should be 
>> possible to implement "efficient" clustered applications. For 
>> instance, the replication of Web sessions could work as follows: 
>> replicate the sessions created on this server to all of its direct 
>> neighbours (neighbours can be easily retrieved via a topology). This 
>> way the load is evenly distributed as long as sessions are evenly 
>> created in the cluster.
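
The load claim can be checked on a small ring. The sketch below is an illustration only: the names (NeighbourReplicationSketch, neighbours, replicaLoad) are hypothetical, and it assumes that every session is copied once to each direct neighbour of the server that created it.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: hypothetical helpers, not the sandbox/messaging classes. */
final class NeighbourReplicationSketch {

    private NeighbourReplicationSketch() {}

    /** Direct neighbours of a node in a ring given in ring order (0, 1 or 2 nodes). */
    static Set<String> neighbours(List<String> ring, String node) {
        Set<String> result = new LinkedHashSet<>();
        int index = ring.indexOf(node);
        int n = ring.size();
        if (index < 0 || n < 2) {
            return result; // unknown node, or a stand-alone server
        }
        result.add(ring.get((index + n - 1) % n)); // predecessor
        result.add(ring.get((index + 1) % n));     // successor (same node when n == 2)
        return result;
    }

    /** Number of replica sessions each node would hold, given sessions created per node. */
    static Map<String, Integer> replicaLoad(List<String> ring, Map<String, Integer> created) {
        Map<String, Integer> load = new LinkedHashMap<>();
        for (String node : ring) {
            load.put(node, 0);
        }
        for (String node : ring) {
            int sessions = created.getOrDefault(node, 0);
            for (String neighbour : neighbours(ring, node)) {
                load.merge(neighbour, sessions, Integer::sum);
            }
        }
        return load;
    }
}
```

With four servers in a ring and ten sessions created on each, every server ends up holding twenty replicas, which matches the "evenly distributed" observation; uneven session creation skews the replica load accordingly.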
>> On top of this infrastructure, the proto implements a set of basic 
>> services, which could simplify the implementation of such clustered 
>> applications. These services are:
>> - customization of the marshalling/unmarshalling of Objects to be 
>> sent/received to/from a remote server: it is possible to replace 
>> specific objects;
>> - InputStream can be passed between servers: by leveraging the 
>> previous feature, InputStreams are replaced by a proxy which can be 
>> used to pull the content of an InputStream hosted on a remote server. 
>> This can be useful when dumping the content of a server to another 
>> server in order to initialize its state;
>> - primitive reference layer: Objects implementing a specific 
>> interface can be passed around even if not serializable. For 
>> instance, the current implementation can pass around an MBeanServer 
>> (this is a bad example as JSR 160 is intended for that). If you have 
>> a look at MBeanServerEndPointImpl, you will see that this is actually 
>> the ability to return an object by reference to the remote caller. As 
>> this caller can also provide parameters which implement this 
>> specific interface, one can achieve pass by reference for both the 
>> parameters and the result between two servers;
>> - proxy creation: it is the ability to acquire a proxy for a service 
>> running on a remote server:
>> // Defines the proxy meta-data.
>> EndPointProxyInfo proxyInfo = new EndPointProxyInfo(
>>     NodeEndPointView.NODE_ID, NodeEndPointView.class, nodeInfo);
>> // Builds the proxy.
>> NodeEndPointView topologyEndPoint =
>>     (NodeEndPointView) endPointProxyFactory.factory(proxyInfo);
>> // Transforms the Msgs which will be sent by this proxy.
>> ((EndPointProxy) topologyEndPoint).setTransformer(new MsgTransformer() {...});
>> // This call will actually invoke the service on the server nodeInfo.
>> topologyEndPoint.prepareTopology(aTopology);
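
For readers unfamiliar with the technique, a proxy of this kind can be built on java.lang.reflect.Proxy. The sketch below is a guess at the general shape, not the prototype's implementation: TopologyView, RecordingTransport and proxyFor are hypothetical stand-ins, and the "transport" only records what would be sent instead of talking to a remote server.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: hypothetical names, not the sandbox/messaging classes. */
final class EndPointProxySketch {

    private EndPointProxySketch() {}

    /** Hypothetical stand-in for a service interface such as NodeEndPointView. */
    interface TopologyView {
        String prepareTopology(String topology);
    }

    /** Hypothetical transport: records the calls instead of sending messages. */
    static final class RecordingTransport {
        final List<String> sent = new ArrayList<>();

        Object send(String node, String method, Object[] args) {
            sent.add(node + ":" + method + Arrays.toString(args));
            return "ack"; // pretend reply from the remote service
        }
    }

    /** Builds a proxy that turns each interface call into a message for the given node. */
    static <T> T proxyFor(Class<T> view, String node, RecordingTransport transport) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (method.getDeclaringClass() == Object.class) {
                // equals, hashCode and toString are answered locally.
                switch (method.getName()) {
                    case "equals":
                        return proxy == args[0];
                    case "hashCode":
                        return System.identityHashCode(proxy);
                    default:
                        return "proxy(" + view.getSimpleName() + "@" + node + ")";
                }
            }
            return transport.send(node, method.getName(), args);
        };
        Object instance = Proxy.newProxyInstance(
                view.getClassLoader(), new Class<?>[] {view}, handler);
        return view.cast(instance);
    }
}
```

A message transformer, as in the quoted snippet, would sit between the handler and the transport in such a design.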
>> As an aside, whatever the number of services communicating with other 
>> remote services, the number of connections stays low: it is the number 
>> of edges defined by the current topology.
>> This proto has some bugs (e.g. a memory leak in the reference layer) 
>> and some enhancements are required (e.g. a classloading strategy is 
>> to be added). Nevertheless, I would like to have your input on the 
>> general concept and the current state of the implementation before 
>> progressing any further.
>> Cheers,
>> Gianny
> James
> -------

 * Jules Gosnell
 * Partner
 * Core Developers Network (Europe)
