geronimo-dev mailing list archives

From Jeff Genender <jgenen...@apache.org>
Subject Re: gcache implementation ideas[long]
Date Wed, 13 Sep 2006 22:04:41 GMT


David Jencks wrote:
> I'm a complete beginner in clustering.... but I have some questions.
> 
> 
> I can't tell if the master communicates with all slaves or only slave 1,
> then slave 1 communicates with slave2, etc.  My questions are biased a
> bit towards a linked list/chain of slaves, which might well not be what
> you intend.
> 
> Can you describe what happens if:
> - the master goes down

If the master goes down, the first slave in the configured order becomes
the new master.
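A minimal sketch of that promotion order, assuming the nodes simply keep an ordered list of slaves (all class and method names here are invented for illustration, not from any actual GCache code):

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Illustrative only: slaves are kept in a fixed join order, and when the
// master is detected as down, the first slave is promoted.
public class PromotionOrder {
    private String master;
    private final Deque<String> slaves = new ArrayDeque<>();

    public PromotionOrder(String master, String... slaveNodes) {
        this.master = master;
        for (String s : slaveNodes) {
            slaves.addLast(s);
        }
    }

    /** Called when the current master is detected as down. */
    public String promoteNextMaster() {
        master = slaves.removeFirst(); // slave 1 becomes the new master
        return master;
    }

    public String currentMaster() {
        return master;
    }
}
```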

> - the former master comes back up

Good question.  Do we (A) have it return as the master, or (B) have it
come back in as slave-n?  Need some input here.

> - slave 1 (or any intermediate slave) goes down.
> - slave 1 (or any intermediate slave) comes back up
> - slave n (last slave) goes down.

All great questions.  I would like feedback here.

> 
>>
>> We then have client component(s) that "plugs in" and communicates with
>> the server.  The configuration for the client should be very light where
>> it will only really be concerned with the master/slave/slave/nth-slave.
>>  In other words, it communicates only with the master.  The master is
>> responsible for "pushing" anything it receives to its slaves and other
>> nodes in the cluster.  The slaves basically look like clients to the
>> master.
>>
> 
> Are you saying that every client maintains the entire shared state?  If
> not, how does a client determine if it needs to fetch some state and how
> much state?

This one I can answer.  The first pass is full replication (not
distributed)...so under this scenario, each node will maintain full
state.  We can handle this in two ways.  We can either have the client
get the full tomato when it joins the cluster...or we can have it
retrieve the state the first time it is asked for a session it does not
have (i.e. a server does not get a copy of a session unless an end user
asks that server for it).  I think the latter would probably be less
overhead and perform a bit better.
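A sketch of the second option (fetch on first request), with the socket call to the master stubbed out as a plain function; all names here are invented:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Illustrative only: the real lookup would be a socket call to the
// master; here it is stubbed with a Function.
public class LazySessionCache {
    private final Map<String, Object> local = new ConcurrentHashMap<>();
    private final Function<String, Object> masterLookup;

    public LazySessionCache(Function<String, Object> masterLookup) {
        this.masterLookup = masterLookup;
    }

    /** Returns the session, pulling it from the master only on a local miss. */
    public Object getSession(String id) {
        return local.computeIfAbsent(id, masterLookup);
    }
}
```

The point of the design choice: a node that is never asked for a given session never pays the cost of holding it up to date from scratch at join time.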

> It looks to me as if this relies on every client maintaining an
> identical list of master/slave servers.  How do they do that?  (This
> relates to my first set of questions I think)

Configuration read at startup on the client machines would contain that
information.  The clients need to be aware of who the master is, and who
the slaves are.
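As a sketch, that client startup configuration could be as simple as an ordered node list, master first (the property name and format here are invented for illustration):

```java
// Illustrative only: the client's startup configuration carries the
// ordered node list; position 0 is the master, the rest are slaves in
// failover order.
public class ClientConfig {

    /** Parses "key=host1,host2,..." into the ordered node list. */
    public static String[] parseNodes(String property) {
        return property.split("=", 2)[1].split(",");
    }

    public static void main(String[] args) {
        String[] nodes = parseNodes(
            "gcache.cluster.nodes=master1:4000,slave1:4000,slave2:4000");
        // nodes[0] is the master; the rest are slaves in failover order.
        System.out.println(nodes[0]);
    }
}
```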

> 
> What happens if
> 
> Client 1 maintains communication with master and all slaves throughout,
> and all servers remain running at all times
> Client 2 loses connectivity to master long enough to decide that master
> is dead, and gets it back in time to communicate with slave 1, thus
> telling slave 1 it is the new master?
> 

Great question once again.

A possible solution is that when a slave is asked for data (as if it
were the new master), it first checks with the current master itself to
see whether it is still alive.
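A sketch of that check, assuming a plain TCP connect is enough to probe the master (names are invented for illustration):

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Illustrative only: the slave probes the master directly before
// trusting a client's claim that the master is dead.
public class MasterProbe {

    /** Returns true if the master's socket accepts a connection in time. */
    public static boolean masterAlive(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;   // master answered: the client's report was stale
        } catch (IOException e) {
            return false;  // master really looks down: proceed with promotion
        }
    }
}
```

This keeps one client's flaky network link from tricking slave 1 into declaring itself the new master.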

>>
>> I think this is a fairly simple implementation, yet fairly robust.
>> Since we are not doing the heart beat and mcast, we cut down on a lot of
>> network traffic.
> 
> Can you describe briefly what they are for?
> 

Sure.  A heartbeat implementation is used with multicast (mcast).  Each
server that wants to join a cluster sends a ping every second or few
seconds that tells everyone else it is in the cluster.  So they join
each other, creating a channel between one another.  Each server in the
cluster holds n-1 connections to the other servers.  As you can see, in
a short amount of time you can have a very chatty network.  This type of
clustering solution is great for a small number of nodes.

In master/slave, I have a connection between the master and each client
(including the slaves, which act as clients), all without the need for a
heartbeat.  This cuts down on the network clutter significantly.
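The difference in link counts can be put in rough numbers (this arithmetic follows from the description above, not from any measurement):

```java
// Illustrative arithmetic only: total links in a full mesh
// (heartbeat/mcast style) vs. a master/slave star, for n nodes.
public class LinkCount {

    /** Every pair of servers holds a channel: n*(n-1)/2 links. */
    static int mesh(int n) {
        return n * (n - 1) / 2;
    }

    /** Only the master holds a channel to each other node: n-1 links. */
    static int star(int n) {
        return n - 1;
    }

    public static void main(String[] args) {
        System.out.println(mesh(10)); // 45 links for a 10-node mesh
        System.out.println(star(10)); // 9 links for a 10-node star
    }
}
```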

>>
>> Communication will be done by TCPIP sockets and would probably like to
>> use NIO.
>>
>> I would like to see this component be able to run on its own...i.e. no
>> Geronimo needed.  We can build a Geronimo gbean and deployer around it,
>> but I would like to see this component usable in many other areas,
>> including outside of Geronimo.  Open source needs more "free" clustering
>> implementations.  I would like this component to be broken down into 2
>> major categories...server and client.
>>
> 
> What is the difference in functionality between a server and a client?

The server is the piece that runs the central cache; it is literally a
socket server.  This would be your master or slave machines.  The client
is a simple SPI that communicates with the server.  I say SPI because
the API itself is contractual in nature, thus allowing us to plug in
other strategies (i.e. heartbeat, localized cache, distributed, etc.).
The idea here is that the client does not care about the implementation,
only about using the simple API.
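A sketch of what that contractual SPI could look like, with one trivial strategy plugged in behind it (all names are invented for illustration; a heartbeat or distributed strategy would simply be another implementation of the same interface):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative only: the small contractual API the client codes against.
interface CacheClient {
    void put(String key, Object value);
    Object get(String key);
}

// A trivial local strategy standing in for the master/slave socket client.
class LocalStrategy implements CacheClient {
    private final Map<String, Object> store = new ConcurrentHashMap<>();

    public void put(String key, Object value) {
        store.put(key, value);
    }

    public Object get(String key) {
        return store.get(key);
    }
}
```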

> 
>> After a successful implementation of master/slave, I would like to make
>> pluggable strategies, so we can provide for more of a distributed cache,
>> partitioning, and other types of joins, such as mcast/heart beat for
>> those who want it.
>>
>> Thoughts and additional ideas?
>>
> 
> Well, it made me think of lots of questions!
> 
> Are there any good references on "principles of clustering" for newbies
> like me?

Oh yes...how about the best of them all, Tangosol.  Cameron and friends
did about the best clustering writeup of any doc I have ever seen:

http://wiki.tangosol.com/display/COH32UG/Coherence+3.2+Home

> 
> thanks
> david jencks
> 
>> Thanks,
>>
>> Jeff
