avalon-dev mailing list archives

From "Alex Karasulu" <aok...@bellsouth.net>
Subject RE: Question: scalability, stability and multiple JVMs with Merlin?
Date Tue, 27 Apr 2004 14:20:06 GMT

> -----Original Message-----
> From: Nader Aeinehchi [mailto:nader@aeinehchi.com]
> Sent: Monday, April 26, 2004 3:42 PM
> To: Avalon Developers List
> Subject: Re: Question: scalability, stability and multiple JVMs with
> Merlin?
> Hi Alex
> > For the time being I myself am experimenting with things like
> > JGroups for state propagation.
> I had never heard of JGroups before I read your email.  I will try to read
> about JGroups.
> I am by no means an expert in this field, but I can mention some issues
> and techniques.
> ============================================================
> 1. A clear definition of clustering should be made.  Is a cluster a group of
> servers (in Merlin containers) that cooperatively achieve a common goal?
> Or is a cluster a group of identical twin servers that can automatically
> take over from each other without loss of state?  There exist several very
> different definitions in the area.
> A clear definition of "cluster" establishes the ground for further
> analysis and work.

I would view a cluster as a group of components.

> ============================================================
> 2. Stateless versus stateful should be addressed.
> Usually it is easier to develop stateless clustering systems than stateful
> clustering.
> To my knowledge, stateful clustering is pretty complex.

Yeah stateless clustering is just remoting to me - but stateful will not be

> --------------------------------------------------------------------------
> a.  Stateless clustering
> I would actually call it loadbalancing rather than clustering.
> At best, such systems are not adequate for critical environments, like
> Bank & Finance, where no information must ever be lost if a server node
> fails.
> The most common technique is Round Robin, with several variants.
> Many of the systems I am aware of actually provide what is called
> "Loadbalancing with Session Binding" or "Loadbalancing with Sticky
> Session".
> E.g. I saw the proposal in Apache Geronimo where it was suggested.  Most
> of the EJB servers provide some sort of loadbalancing.

Right, we should be looking at how others are achieving this.
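
As a rough illustration of the "Round Robin with Sticky Session" idea quoted
above, here is a minimal Java sketch.  The class and method names are
hypothetical and are not taken from Merlin, Geronimo or any real load
balancer; node names are plain strings purely for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: round-robin node selection with session binding.
final class StickyRoundRobin {
    private final List<String> nodes;
    private final Map<String, String> sessionToNode = new HashMap<>();
    private int next = 0;

    StickyRoundRobin(List<String> nodes) {
        if (nodes.isEmpty()) {
            throw new IllegalArgumentException("at least one node is required");
        }
        this.nodes = new ArrayList<>(nodes);
    }

    // The first request of a session takes the next node in rotation;
    // every later request of that session is routed to the same node.
    synchronized String route(String sessionId) {
        return sessionToNode.computeIfAbsent(sessionId, id -> {
            String node = nodes.get(next);
            next = (next + 1) % nodes.size();
            return node;
        });
    }
}
```

Note that nothing here replicates session state: if the bound node dies, the
session is lost, which is exactly the limitation described in the quoted text.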

> Some systems move the burden of clustering over to the client code,
> called a Cluster Aware Proxy.  Here the client stores and caches enough
> information to continue the session with a second server if the first one
> dies.
> In WebLogic documentation, I read something that seemed to be in this
> category, but I am not sure.
> Some of the systems provide automatic discovery and recovery of a
> Service Provider.  However, there is no guarantee that the state is
> replicated among service providers of the same kind.
> I think Jini falls into this category.  To my limited knowledge, Grid
> Computing falls into this category too.
> --------------------------------------------------------------------------
> b. Stateful clustering
> The second group of clustering systems provides some sort of
> synchronization among different cluster nodes (servers).
> To my limited knowledge, the following techniques exist:
> I. Multicast replication among various nodes.
> E.g. in Oracle OC4J (Orion)

That's what JGroups can be used for and it is a multicast technology.

> II. Each node persists each state in a database and thus a shared state of
> the cluster always exists.
> There are several issues here.  Firstly, performance.  Secondly, who is
> going to write the persistence code, the container or the application
> developer?

Hopefully the container manages most of this, but hooks will be needed
for customization.  The level of work the component developer would
like to do should be up to them, just as EJB developers choose between
container-managed persistence and bean-managed persistence.

> Thirdly, the database itself could be a single point of failure, something
> that could be solved using clustering/replication of the database itself.
> E.g. I think WebSphere is using this technique.

We can use a simple distributed database based on some DBM-like technology
that itself is replicated.  For example, this is what I am using for Eve.

> III. Each node reads and writes from a shared memory (like Linda Spaces,
> IBM's TSpaces and Sun's JavaSpaces).  This looks like II, but uses shared
> memory.
> Here the shared memory itself could rely on a persistence manager like a
> database.  As in the above item, the shared memory could itself be a
> single point of failure and should be replicable to another shared memory.
> I heard that JBoss was using JavaSpaces.

I know nothing about JavaSpaces and would have to look at it in depth to
be able to respond.

> IV.  Mobile Code
> Here there are two categories: Weak Migration and Strong Migration.
> In weak migration, the mobile code itself is responsible for restoring its
> state when it arrives at another container.  In strong migration, the
> container is responsible for saving and restoring the state as the code
> migrates from one container to another.
> Mobile Code has been a hot field within Mobile Agents, and probably some
> of the techniques could be used for the purpose of clustering.

I don't think this is required.  This is more a mobile agent concept
that is not required for clustering, but in itself it is a very
interesting concept to pursue.

> V. There exist many other techniques like operating system clustering and
> hardware clustering.  Here there are many techniques and strategies.  Just
> to mention a few examples: Linux clustering, MS Windows clustering farms,
> Solaris, etc.

We should research these and make a matrix of sorts where we can weigh
the pros and cons of each technique, then either borrow or build upon an
existing mechanism.

> ============================================================
> 3. One other important aspect is to define the scope and granularity of
> the cluster.
> Consider that nodes in a magical way synchronize with each other.  The
> question is how often and where do they synchronize?  Should
> synchronization happen when an object is changed (a local variable
> changes), when a method call is finished, when a transaction finishes
> or ....?  Of course, the finer the granularity, the more stability and the
> greater the performance penalty.

These are very important semantics indeed, and a single email
cannot possibly cover them sufficiently.  However, you're asking all the
right questions that need to be answered.  Again, we need to
look at whether we're clustering at the level of a container
or at the level of a component.
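
To make the granularity trade-off concrete, here is a small Java sketch that
only counts how many synchronization rounds each choice would trigger.  The
three granularity levels come from the question quoted above; the enum, the
class and the idea of counting rounds are illustrative assumptions and do not
describe any existing container.

```java
// Hypothetical granularity levels, taken from the quoted question.
enum SyncGranularity { PER_CHANGE, PER_CALL, PER_TRANSACTION }

// Counts how many times nodes would have to synchronize under a
// given granularity.  No real replication happens here.
final class SyncCounter {
    private final SyncGranularity granularity;
    private int syncs = 0;

    SyncCounter(SyncGranularity granularity) {
        this.granularity = granularity;
    }

    void fieldChanged() {
        if (granularity == SyncGranularity.PER_CHANGE) syncs++;
    }

    void callFinished() {
        if (granularity == SyncGranularity.PER_CALL) syncs++;
    }

    void transactionCommitted() {
        if (granularity == SyncGranularity.PER_TRANSACTION) syncs++;
    }

    int syncs() {
        return syncs;
    }

    // Replays one transaction made of `calls` method calls, each of
    // which changes `changesPerCall` fields.
    static int simulate(SyncGranularity g, int calls, int changesPerCall) {
        SyncCounter counter = new SyncCounter(g);
        for (int i = 0; i < calls; i++) {
            for (int j = 0; j < changesPerCall; j++) {
                counter.fieldChanged();
            }
            counter.callFinished();
        }
        counter.transactionCommitted();
        return counter.syncs();
    }
}
```

For one transaction of 3 calls with 4 field changes each, this gives 12
rounds per change, 3 per call and 1 per transaction, at the cost of a larger
window in which a failure loses unsynchronized state.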

> ============================================================
> 4. Another important question is whether the cluster should handle any and
> every type of object or whether only certain types of objects should be
> supported.
> Should cluster-enabled objects be run in any and every container, or
> should only certain types of containers provide such functionality?  The
> last question is correlated with the aspect of security.
> Cluster-enabled containers embedded within general purpose containers
> might also be a way to go.  The benefit is that as long as the parent
> non-cluster container lives, a sort of state persistence and code
> migration (mobile code) strategy could be adopted.  I have not studied
> this area, so I leave it to further investigation.

The fact that a container is a component and can be embedded within 
another container leads to an explosion of questions that need to 
be answered.  I don't have any answers for you here.  All I can do or
say is we need to discuss each of these issues one by one.  

> ============================================================
> 5. The Life Style of components, as Merlin defines it, is also very
> important to address in this regard.  How should the cluster-enabled
> container handle pools and singletons?  How should threads and
> synchronized code be treated?

Ok, you have me totally depressed and thinking this is near impossible
now :-(.  There's a lot here.  However, the key is to divide and conquer
each issue: take the simplest case for each at first and work up to
more complex scenarios.  For the LifeStyle aspect, take a singleton for
example and cluster it so there "appears" to be one centralized singleton
when in fact several components in different containers and JVMs are
providing the service for that singleton.  The singleton concept means
that all replicas must be in sync to have the same state.  But you see
what I mean about taking a simple case and working it through.
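
A minimal single-JVM Java sketch of that singleton case might look like the
following.  The names are hypothetical, the replicas are plain objects rather
than components in separate containers, and the synchronous write-to-all loop
is only one assumed way of keeping them in sync; a real cluster would need a
group communication layer and failure handling.

```java
import java.util.ArrayList;
import java.util.List;

// One replica of the service.  In a real cluster each replica would
// live in a different container and JVM.
final class CounterReplica {
    private long value;

    void apply(long delta) {
        value += delta;
    }

    long read() {
        return value;
    }
}

// Facade that makes several replicas "appear" to be one singleton.
final class ClusteredCounter {
    private final List<CounterReplica> replicas = new ArrayList<>();
    private int nextReader = 0;

    ClusteredCounter(int replicaCount) {
        if (replicaCount < 1) {
            throw new IllegalArgumentException("need at least one replica");
        }
        for (int i = 0; i < replicaCount; i++) {
            replicas.add(new CounterReplica());
        }
    }

    // Every write is applied to all replicas so they stay in sync.
    synchronized void add(long delta) {
        for (CounterReplica replica : replicas) {
            replica.apply(delta);
        }
    }

    // Reads rotate across replicas; callers cannot tell which one
    // answered because all hold the same state.
    synchronized long get() {
        CounterReplica replica = replicas.get(nextReader);
        nextReader = (nextReader + 1) % replicas.size();
        return replica.read();
    }
}
```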

If you look at all this at one time you'll just want to go back to
bed thinking it's impossible.


