directory-dev mailing list archives

From "Martin Alderson" <Martin.Alder...@salfordsoftware.co.uk>
Subject Re: [Mitosis] Push or pull, plus some random thoughts
Date Thu, 15 Jan 2009 13:09:58 GMT
> In fact, when reconnecting, the replica should indicate what was the
> latest CSN it received, so the server can push back the modifications
> from this CSN up to the latest local CSN.

> There are two issues with this approach:
> - the deleted entries.
> - if the replica is connected to more than one other server, it will
> receive a huge number of modifications from all the connected servers
> at the same time.

I'm not sure why deleted entries are any different from other modifications here.  The delete
modification will be sent to the connecting replica, where it will be applied.

Having lots of replicas shouldn't be a problem when a downed replica comes back up.  The first
replica to start replicating to the newly revived replica will acquire a lock; all the other
replicas will wait until the next replication cycle.  This would also become much less of
a problem if we don't require a replica to be connected to all the other replicas.
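To make the idea concrete, here is a rough sketch of that catch-up flow: the replica presents the last CSN it saw, one provider takes a per-replica lock, and everyone else backs off until the next cycle.  The names (`CatchUpSketch`, `catchUp`, `LogEntry`, a CSN modelled as a plain long) are all hypothetical, not the actual Mitosis API.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of CSN-based catch-up guarded by a per-replica lock.
public class CatchUpSketch {

    // One lock per reviving replica, so only one provider pushes at a time.
    private static final Map<String, ReentrantLock> replicaLocks = new ConcurrentHashMap<>();

    /** A modification log entry: a CSN (simplified to a long) plus an opaque change.
     *  A delete is just another change here, which is why deleted entries need no
     *  special handling in the catch-up path itself. */
    record LogEntry(long csn, String change) {}

    /** Return every modification newer than the replica's last-seen CSN,
     *  or nothing if another provider already holds the lock. */
    static List<LogEntry> catchUp(String replicaId, long lastSeenCsn, List<LogEntry> log) {
        ReentrantLock lock = replicaLocks.computeIfAbsent(replicaId, id -> new ReentrantLock());
        if (!lock.tryLock()) {
            // Another provider is already replicating to this replica;
            // back off and retry on the next replication cycle.
            return List.of();
        }
        try {
            return log.stream()
                      .filter(e -> e.csn() > lastSeenCsn)
                      .toList();
        } finally {
            lock.unlock();
        }
    }
}
```

The `tryLock()` call is the point of the sketch: losers don't queue up behind the winner, they simply skip this cycle, so a freshly revived replica never gets flooded by every provider at once.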


> Right now, from the code I can read, the deleted entries are
> "tombstoned". Maybe we can get rid of this, as we also store the
> delete operation in the Derby store at this point.

Yeah, I should have phrased it as "We _have_ tombstone entries but don't _use_ them".


> I rethought about this and the problem is that we won't be able to
> resync a server that has been disconnected for too long a period,
> unless we simply erase its full base and ask for all the entries,
> which can be costly when you have millions of entries! However, in
> this very case (let's say you keep one week of modifications), if you
> get out of this window, the best would probably be to restore the base
> from a backup (way faster than reinjecting all the entries one by
> one!), and then resync using the modification log.
> 
> So the modification log should only contain a limited number of
> modifications, depending on the configured storage period.
> 
> In order to get this working, we have to implement a decent DRS too
> (Disaster Recovery System), which is on its way, as it's just a
> specific implementation of the current changelog interceptor (we have
> to store the modifications on disk, but not the reverse modifications
> as is done with the ChangeLog mechanism).

Exactly.  When a replica comes back up after more than a week's downtime, it would have
to be treated as a new replica, with its DIT replaced.
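That decision (replay the bounded modification log, or give up and restore from a backup) boils down to comparing the replica's last contact against the log's retention window.  A minimal sketch, with hypothetical names (`ResyncDecision`, `decide`) and the window expressed as a `Duration`:

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: choose between incremental resync and a full restore,
// based on a configured retention period for the modification log.
public class ResyncDecision {

    enum Action { REPLAY_LOG, RESTORE_FROM_BACKUP }

    /**
     * If the replica has been down longer than the retention period, its
     * last-seen state falls outside the window and the log alone can no
     * longer bring it up to date: restore from a backup first, then
     * replay the (short) remaining log.
     */
    static Action decide(Instant replicaLastSeen, Duration retention, Instant now) {
        Instant oldestRetained = now.minus(retention);
        return replicaLastSeen.isBefore(oldestRetained)
                ? Action.RESTORE_FROM_BACKUP
                : Action.REPLAY_LOG;
    }
}
```

With a one-week retention period, a replica down for ten days would be rebuilt from backup, while one down for two days would just replay the log.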

I think Alex mentioned a while back that it would be good to merge the changelog interceptor
with the replication system; it seems a waste to have two pieces of code both maintaining
modification logs.  I'm not sure how close they are, though.

It would also be good to have automatic backups as just another replica with certain options
(read only, sync'd periodically).


Martin


