incubator-wadi-dev mailing list archives

From lichtner <>
Subject Re: Replication using totem protocol
Date Fri, 13 Jan 2006 18:02:00 GMT

> > If you cluster an entity bean on two nodes naively, you lose many of the
> > benefits of caching. This is because neither node, at the beginning of a
> > transaction, knows whether the other node has changed the bean's contents
> > since it was last loaded into cache, so the cache must be assumed
> > invalid. Thus, you find yourself going to the db much more frequently
> > than you would like, and the number of trips increases linearly with the
> > number of clients - i.e. you are no longer scalable.
> It depends on your transaction isolation level; i.e. do you want to do a
> dirty read or not. You should be able to enable dirty reads to get
> scalability & performance.

I like dirty reads from a theoretical standpoint: being able to do dirty
reads means you have a high-class message bus. However, I don't expect
people to ask for dirty reads unless you mean that the transactions are
going to roll back automatically. Example: inventory.

Non-transactional applications, however, could use dirty data and still
find it useful even if they don't roll back.

> The only way to really and truly know if the cache is up to date is to use a
> pessimistic read lock; but that's what databases are great for - so you might
> as well use the DB and not the cache in those circumstances. i.e. you always
> use caches for dirty reads

Major databases currently do not use read locks. Oracle and SQL Server use
MV2PL (multiversion two-phase locking); so does MySQL with the InnoDB
storage engine, and PostgreSQL. (Interestingly, the first article I know of
on this is Bernstein and Goodman, 1978 (!).) Sybase I don't know; I think
it may have fallen a bit behind.

When a tx starts you assign a cluster-wide unique id to it. That's its
'position' in time (known in Oracle as the SCN, system change number).
When the tx writes a data item it creates a new version, tagged with this
SCN. When a transaction wants to read a data item, it reads the last
version before _its_ SCN, so reads definitely don't need a lock. When you
write you can either take a lock (MV2PL) or proceed until you find a
conflict, in which case you roll back. The latter should be used in
workloads that have very little contention. You can also use it in
general, but then you need automatic retries, as with MDBs, and you should
really not send any data to the DBMS until you know for sure it will
commit, to cut down on the time required to roll back.
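The multiversion scheme above can be sketched in a few lines. This is a toy
in-memory store, not any real database's internals: `MVStore`, `begin`,
`read`, and `write` are all illustrative names, and commit handling is
elided (a write becomes visible at its SCN immediately).

```python
import itertools


class ConflictError(Exception):
    """Raised when an optimistic writer loses a conflict and must roll back."""


class MVStore:
    """Toy multiversion store: one version list per key, tagged by SCN."""

    def __init__(self):
        self._scn = itertools.count(1)
        self._versions = {}  # key -> list of (scn, value), ascending by scn

    def begin(self):
        # A cluster-wide unique id: the transaction's 'position' in time.
        return next(self._scn)

    def read(self, tx_scn, key):
        # Read the newest version at or before this tx's SCN; no lock needed.
        # (<= rather than < so a transaction also sees its own writes.)
        value = None
        for scn, v in self._versions.get(key, []):
            if scn <= tx_scn:
                value = v
        return value

    def write(self, tx_scn, key, value):
        # Optimistic variant: if a newer version already exists, this writer
        # has lost the conflict and must roll back (and usually retry).
        versions = self._versions.setdefault(key, [])
        if versions and versions[-1][0] > tx_scn:
            raise ConflictError(key)
        versions.append((tx_scn, value))
```

Note how the low-contention trade-off shows up directly: `read` never blocks,
and the only failure mode is `write` hitting a newer version, which is rare
exactly when writers seldom touch the same item.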

> * invalidation; as each entity changes it sends an invalidation message -
> which is really simple & doesn't need total ordering, just something
> lightweight & fast. (Actually pure multicast is fine for invalidation stuff
> since messages are tiny & reliability is not that big a deal, particularly
> if coupled with a cache timeout/reload policy).
> * broadcasting the new data to interested parties (say everyone else in the
> cluster). This typically requires either (i) a global publisher (maybe
> listening to the DB transaction log) or (ii) total ordering if each entity
> bean server sends its changes.

That's the one.
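The invalidation option quoted above is small enough to sketch. Here a plain
list of callbacks stands in for the multicast bus, and `InvalidatingCache`
and its methods are made-up names; write-through to the actual DB is omitted
for brevity.

```python
class InvalidatingCache:
    """Per-node cache: updates multicast a tiny invalidate(key) message;
    peers drop the entry and lazily reload from the DB on the next read."""

    def __init__(self, bus, loader):
        self._cache = {}
        self._loader = loader            # e.g. a DB read by primary key
        self._bus = bus                  # stand-in for the multicast group
        bus.append(self._on_invalidate)  # subscribe to invalidation messages

    def get(self, key):
        if key not in self._cache:
            self._cache[key] = self._loader(key)  # cache miss: go to the DB
        return self._cache[key]

    def put(self, key, value):
        # Update locally, then tell everyone else their copy is stale.
        self._cache[key] = value
        for subscriber in self._bus:
            if subscriber is not self._on_invalidate:
                subscriber(key)

    def _on_invalidate(self, key):
        self._cache.pop(key, None)       # drop stale entry; reload lazily
```

The message carries only the key, which is why ordering and even reliability
matter so little here: a lost invalidation is eventually papered over by a
cache timeout/reload policy.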

> The former is good for very high update rates or very sparse caches, the
> latter is good when everyone interested in the cluster needs to cache mostly
> the same stuff & the cache size is sufficient that most nodes have all the
> same data in their cache. The former is more lightweight and simpler & a
> good first step :)

You can also split up the data: keep, say, 4 replicas of each data item
instead of N, and just migrate them around. But for semi-constant reference
data, e.g. stock symbols or client data, you can keep copies everywhere.
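One way to place a fixed number of replicas per item, sketched below with
rendezvous (highest-random-weight) hashing; the function name and the choice
of 4 replicas are illustrative, not anything prescribed in the thread.

```python
import hashlib


def replica_nodes(key, nodes, k=4):
    """Pick k distinct owners for `key` by ranking nodes on a keyed hash,
    so every node computes the same placement without coordination and
    membership changes only migrate the affected items."""
    ranked = sorted(
        nodes,
        key=lambda n: hashlib.sha256(f"{key}:{n}".encode()).hexdigest(),
    )
    return ranked[:k]
```

Because the ranking is per-key, removing a node that does not own an item
leaves that item's 4 owners unchanged, which keeps migration traffic
proportional to the data the departed node actually held.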

