geronimo-dev mailing list archives

From Rajith Attapattu <rajit...@gmail.com>
Subject Re: Replication using totem protocol
Date Wed, 18 Jan 2006 02:27:28 GMT
Thanks a lot for the info!

Regards,

Rajith


On 1/17/06, lichtner <lichtner@bway.net> wrote:
>
>
> By reading selected parts of this book you can get a background on various
> issues that you have asked about:
>
> http://citeseer.ist.psu.edu/birman96building.html
>
> On Tue, 17 Jan 2006, Rajith Attapattu wrote:
>
> > > Can you guys talk more about locking mechanisms, pros and cons,
> > > w.r.t. in-memory replication and storage-backed replication?
> >
> > > I don't know what you have in mind here by 'storage-backed'.
> >
> > Sorry if I was not clear on that. What I meant was in-memory vs. a
> > serialized form, either stored in a file, a database, or some other
> > mechanism.
> >
> > >> If you want to guarantee that the user's work is _never_ lost, just
> > >> send all session updates to yourself in a totem-protocol 'safe'
> > >> message
> >
> > Hmm, can we really make a guarantee here, even if your assumption
> > holds? (Assuming 4 nodes, and likely to survive up to 4 - R = 2 node
> > crashes.)
> >
> > Also, I didn't understand how you arrived at the 4 - R value. I guess
> > it's because I don't have much knowledge about totem. If there is a
> > short answer, and if it's not beyond the scope of the thread, can you
> > try one more time to explain the theory behind your assumption?
> >
> > Regards,
> >
> > Rajith.
> >
> > On 1/17/06, lichtner <lichtner@bway.net> wrote:
> > >
> > >
> > > On Tue, 17 Jan 2006, Rajith Attapattu wrote:
> > >
> > > > Can you guys talk more about locking mechanisms, pros and cons,
> > > > w.r.t. in-memory replication and storage-backed replication?
> > >
> > > I don't know what you have in mind here by 'storage-backed'.
> > >
> > > > Also, what if a node goes down while the lock is acquired? I
> > > > assume there is a timeout.
> > >
> > > Which architecture do you have in mind here? I think the question is
> > > relevant if you use a standalone lock server, but if you don't, then
> > > you just put the lock queue with the data item in question.
> > >
> > > > When it comes to a partition (either network/power failure or
> > > > virtual) or healing (the same or new nodes coming up as well?),
> > > > what are some of the algorithms and strategies that are widely
> > > > used to handle those situations? Any pointers will be great.
> > >
> > > I believe the best strategy depends on what type of state the
> > > application has. Clearly, if the state took zero time to transfer,
> > > you could compare version numbers, transfer the state to the nodes
> > > that happen to be out of date, and you are back in business. OTOH,
> > > if the state is 1 GB you will take a different approach. There is
> > > not much to look up here. Think about it carefully and you can come
> > > up with the best state transfer for your application.
> > >
> > > Session state is easier than others because it consists of myriad
> > > small, independent data items that do not support concurrent access.
> > >
> > > > So if you are in the middle of filling out a 10-page application
> > > > on the web, and the server goes down while you are on the 9th
> > > > page, if you can restart again at the 7th or 8th page (a
> > > > reasonable percentage of the data was preserved through
> > > > merge/split/change), I guess it would be tolerable, if not
> > > > excellent, on a very busy server.
> > >
> > > Since this is a question about availability, consider a cluster,
> > > say 4 nodes, with a minimum R=2, where all the sessions are
> > > replicated on _each_ node. If you want to guarantee that the user's
> > > work is _never_ lost, just send all session updates to yourself in
> > > a totem-protocol 'safe' message, which is delivered only after the
> > > message has been received (but not delivered) by all the nodes, and
> > > wait for your own message to arrive. This takes between 1 and 2
> > > token rotations, which on 4 nodes I guess would be 10-20
> > > milliseconds, which is not a lot as http request latencies go.
> > >
> > > As a result, after an http request returns, the work done is likely
> > > to survive up to 4 - R = 2 node crashes.
> > >
> > >
> >
>
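For anyone following the thread, here is a rough, hypothetical sketch (not
the actual totem implementation) of the 'safe' delivery idea described
above: a message is delivered to the application only once every node in
the ring holds a copy, so the sender can wait for its own message to come
back before acknowledging the client. The class and names are made up for
illustration.

```python
N_NODES = 4   # cluster size from the example above
R_MIN = 2     # minimum membership R from the example

class Ring:
    """Toy model of 'safe' delivery on a totem-style ring (illustrative only)."""

    def __init__(self, n):
        self.received = {i: [] for i in range(n)}    # buffered, not yet delivered
        self.delivered = {i: [] for i in range(n)}   # delivered to the application

    def broadcast_safe(self, sender, msg):
        # Phase 1: the rotating token carries the message; every node buffers it.
        for node in self.received:
            self.received[node].append(msg)
        # Phase 2: once the token confirms all nodes hold the message,
        # it is delivered everywhere, including back to the sender.
        for node in self.delivered:
            self.delivered[node].append(msg)
        # The sender waiting for its own message is the durability signal.
        return msg in self.delivered[sender]

ring = Ring(N_NODES)
assert ring.broadcast_safe(0, "session-update-42")

# With the update on all 4 nodes, losing N - R = 4 - 2 = 2 nodes still
# leaves R = 2 live replicas, which is where "survives 2 crashes" comes from.
print(N_NODES - R_MIN)  # 2
```

The two token rotations in the real protocol correspond to the two phases
modeled here, which is why the latency quoted is 1-2 rotations.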
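The point about putting "the lock queue with the data item" (instead of a
standalone lock server) can also be sketched. This is a minimal
illustration under my own assumptions, with invented names: each
replicated item carries its own queue of waiters, and a membership change
on node crash plays the role of the timeout.

```python
from collections import deque

class DataItem:
    """A replicated data item that carries its own lock queue (illustrative)."""

    def __init__(self, value):
        self.value = value
        self.holder = None        # node currently holding the lock
        self.waiters = deque()    # queued lock requests, FIFO

    def acquire(self, node):
        # Grant immediately if free, otherwise queue behind the holder.
        if self.holder is None:
            self.holder = node
            return True
        self.waiters.append(node)
        return False

    def release(self, node):
        assert self.holder == node
        self.holder = self.waiters.popleft() if self.waiters else None

    def on_member_left(self, dead_node):
        # A crash detected by the membership protocol frees the dead
        # node's lock and purges it from the queue -- no separate timeout.
        self.waiters = deque(n for n in self.waiters if n != dead_node)
        if self.holder == dead_node:
            self.holder = self.waiters.popleft() if self.waiters else None

item = DataItem("session-9")
item.acquire("node-a")
assert not item.acquire("node-b")   # node-b queues behind node-a
item.on_member_left("node-a")       # node-a crashes; lock passes on
assert item.holder == "node-b"
```

Because the queue is replicated alongside the data, a failover node
inherits the lock state with the item, which is what makes the standalone
lock server (and its failure modes) unnecessary in this design.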
