directory-dev mailing list archives

From "Alex Karasulu" <akaras...@apache.org>
Subject Re: [ApacheDS][Mitosis] Replication data
Date Fri, 23 Nov 2007 03:42:29 GMT
Hey Martin,

I just saw this email, but I'm out for Thanksgiving (a holiday here).
I'll try to get back to you soon.

Cheers,
Alex

On Nov 22, 2007 6:19 PM, Martin Alderson <equim@planetquake.com> wrote:
> Hi all,
>
> I am currently looking into some of the replication issues, specifically
> DIRSERVER-894 ("Older concurrent changes are never replicated"),
> DIRSERVER-1097 ("Only send net changes during replication") and
> DIRSERVER-1101 ("New replicas may never receive some recent modifications").
>
> I think these issues will require changing the replication data format.
> Currently the replication logs are stored in a single database table
> with time, replica ID, sequence number and operation columns.  The first
> three make up the CSN and the last holds a serialised operation object.
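
To make the CSN structure concrete, here is a minimal Java sketch of a value type built from the three columns named above. The class name `Csn`, the field types, and the comparison order (time, then replica ID, then sequence number) are assumptions for illustration, not the actual Mitosis classes.

```java
import java.util.Comparator;
import java.util.Objects;

/** Hypothetical CSN value type; fields mirror the time, replica ID and sequence number columns. */
final class Csn implements Comparable<Csn> {
    // Assumed ordering: time first, then replica ID, then sequence number.
    private static final Comparator<Csn> ORDER =
            Comparator.comparingLong((Csn c) -> c.time)
                    .thenComparing(c -> c.replicaId)
                    .thenComparingInt(c -> c.operationSequence);

    final long time;             // timestamp of the change
    final String replicaId;      // replica that originated the change
    final int operationSequence; // distinguishes changes sharing a timestamp

    Csn(long time, String replicaId, int operationSequence) {
        this.time = time;
        this.replicaId = replicaId;
        this.operationSequence = operationSequence;
    }

    @Override
    public int compareTo(Csn other) {
        return ORDER.compare(this, other);
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof Csn && compareTo((Csn) o) == 0;
    }

    @Override
    public int hashCode() {
        return Objects.hash(time, replicaId, operationSequence);
    }
}
```

With a total order like this, "which change is newer" becomes a single `compareTo` call, whatever storage ends up holding the three parts.
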
>
> DIRSERVER-894 needs a way to work out the CSN at the point a specific
> attribute was last modified.  DIRSERVER-1097 needs a way to find
> previous log entries based on entryUUID, modification type and attribute
> ID.  We are also planning on moving the replication data to the DIT.
> Given all this I am thinking of removing the serialised operation blob
> and replacing it with extra table(s) for each operation type storing the
> operation's data across multiple columns.  This will allow us to
> efficiently query the replication logs based on the operation data.
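
The lookup that DIRSERVER-1097 needs can be sketched in memory as below. This only illustrates querying by entryUUID, modification type and attribute ID once the operation data is no longer an opaque blob; `LogRow`, `ExplodedLog`, the string-typed CSN and the modification type values are all hypothetical, and a real implementation would sit on the database tables or the DIT.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical exploded log row: operation data in separate fields instead of a blob. */
final class LogRow {
    final String csn;         // kept opaque in this sketch
    final String entryUuid;
    final String modType;     // illustrative values such as "add" or "replace"
    final String attributeId;

    LogRow(String csn, String entryUuid, String modType, String attributeId) {
        this.csn = csn;
        this.entryUuid = entryUuid;
        this.modType = modType;
        this.attributeId = attributeId;
    }
}

/** In-memory stand-in for the per-operation table(s). */
final class ExplodedLog {
    // Composite key: (entryUUID, modification type, attribute ID).
    private final Map<List<String>, List<LogRow>> index = new HashMap<>();

    void append(LogRow row) {
        index.computeIfAbsent(
                Arrays.asList(row.entryUuid, row.modType, row.attributeId),
                k -> new ArrayList<>()).add(row);
    }

    /** Returns matching rows in append order, or an empty list if there are none. */
    List<LogRow> find(String entryUuid, String modType, String attributeId) {
        List<LogRow> hits = index.get(Arrays.asList(entryUuid, modType, attributeId));
        return hits == null
                ? Collections.<LogRow>emptyList()
                : Collections.unmodifiableList(hits);
    }
}
```
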
>
> Perhaps this would be a good time to make the jump to storing the
> replication data in the DIT.  It seems that that would be well suited to
> storing the operations in an "exploded" format.  I am thinking of the
> following kind of format:
>
> ou=logs/
>    cn=<csn>/
>        objectClass: ... (indicates operation type)
>        time: ...
>        replicaID: ...
>        operationSequence: ...
>        entryUUID: ...
>        attributeID: <attributeName> (for attribute modifications)
>        cn=attributes/
>          <attributeName>: <attributeValues>
>
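
As a rough illustration of the layout above, the sketch below builds the DN of one log entry and of its `cn=attributes` child using the JDK's `javax.naming.ldap` classes. The helper class `LogDns` is hypothetical, the DNs are relative to whatever suffix would hold `ou=logs` (not specified here), and the string form of a CSN is assumed to be usable as an RDN value.

```java
import javax.naming.InvalidNameException;
import javax.naming.ldap.LdapName;
import javax.naming.ldap.Rdn;

/** Hypothetical helper for DNs under the proposed ou=logs layout. */
final class LogDns {
    private LogDns() {
    }

    /** DN of the entry for one operation: cn=<csn>,ou=logs. */
    static LdapName operationDn(String csn) {
        try {
            LdapName dn = new LdapName("ou=logs");
            dn.add(new Rdn("cn", csn)); // becomes the leftmost (most specific) RDN
            return dn;
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("CSN not usable as an RDN value: " + csn, e);
        }
    }

    /** DN of the child entry holding the operation's attributes. */
    static LdapName attributesDn(String csn) {
        try {
            LdapName dn = operationDn(csn);
            dn.add(new Rdn("cn", "attributes"));
            return dn;
        } catch (InvalidNameException e) {
            throw new IllegalArgumentException("could not build attributes DN for: " + csn, e);
        }
    }
}
```
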
> The biggest concern I have for this is the inflexibility of LDAP
> searches.  Do we have a sort control in ApacheDS?  Also, if we have the
> attributes for the operation in a child entry, how can we find an
> operation in the logs based on those attributes?
>
> At the same time I am thinking about a couple of things in the
> replication system that don't seem to be necessary.
>
> Firstly, once DIRSERVER-894 is fixed, I don't think we will need the
> entryCSN attribute.  I believe that it is only used to check whether an
> operation should be applied to an entry or not (i.e. is it a new
> modification), but this is broken and we need to check the CSN per
> attribute by using the logs instead.
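
A minimal sketch of the per-attribute check, assuming "newest CSN wins per attribute" is the intended rule: a change is applied only if it is newer than the last change recorded for that same attribute of that entry. The class `AttributeCsnIndex` is hypothetical, and the CSN is reduced to a `long` purely to keep the example short.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical tracker of the last CSN applied per (entry, attribute) pair. */
final class AttributeCsnIndex {
    private final Map<String, Long> lastModified = new HashMap<>();

    private static String key(String entryUuid, String attributeId) {
        // Attribute IDs are treated as case-insensitive here (an assumption).
        return entryUuid + "/" + attributeId.toLowerCase(Locale.ROOT);
    }

    /**
     * Records the change and returns true if it is newer than the last change
     * seen for this attribute; returns false for older or duplicate changes.
     */
    boolean applyIfNewer(String entryUuid, String attributeId, long csn) {
        String k = key(entryUuid, attributeId);
        Long last = lastModified.get(k);
        if (last != null && last >= csn) {
            return false;
        }
        lastModified.put(k, csn);
        return true;
    }
}
```

Note that an older change to a different attribute of the same entry is still accepted, which a single entry-wide CSN cannot express.
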
>
> Secondly, I don't really see the point of "tombstoning" entries (marking
> them as deleted instead of really deleting them).  The only time I can
> see it having any kind of effect is when a replica receives a
> modification for an entry it thinks has been deleted - then it will
> resurrect it.  This seems like a very bad idea to me.  I would expect
> this to be a fatal replication error as something has gone seriously wrong.
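
The behaviour argued for above, with no tombstones and a modification to a missing entry treated as fatal, could look roughly like this. `ReplicaStore` is a hypothetical in-memory stand-in for a replica, and using `IllegalStateException` to signal the fatal error is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical replica that really deletes entries instead of tombstoning them. */
final class ReplicaStore {
    // entryUUID -> (attribute ID -> value); single-valued for brevity.
    private final Map<String, Map<String, String>> entries = new HashMap<>();

    void add(String entryUuid) {
        entries.put(entryUuid, new HashMap<>());
    }

    void delete(String entryUuid) {
        entries.remove(entryUuid); // real delete, nothing is left behind
    }

    boolean contains(String entryUuid) {
        return entries.containsKey(entryUuid);
    }

    /** Applies a replicated modification; a missing entry is treated as a fatal error. */
    void modify(String entryUuid, String attributeId, String value) {
        Map<String, String> entry = entries.get(entryUuid);
        if (entry == null) {
            throw new IllegalStateException(
                    "replication error: modification for missing entry " + entryUuid);
        }
        entry.put(attributeId, value);
    }
}
```
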
>
> Sorry for the long email... if anyone has managed to read this far, any
> comments would be much appreciated.
>
> Thanks,
>
> Martin
>
