Hi Emmanuel,

On Sat, Jan 17, 2009 at 7:15 PM, Emmanuel Lecharny <elecharny@gmail.com> wrote:
Last, not least : the triggers. If some modification can triggers some other (because of integrity constraints being activated), then it should be logged in the change log. When replicating, the triggers _must_ be disabled, as the merged operations will contain all the triggered operations.

This is one way to handle it but it could be very expensive.  If the trigger firing impacts many entries or results in a cascade of firings, then the cost of replicating the changes could be very large.

Triggers are modeled as entries.  As entries they will themselves be replicated.  It would be nice if the trigger on a consumer could fire and do all the work so we could avoid unnecessary network traffic.  This is all nice but it gets really complicated really fast.

Before going on to talk about triggers, let's stop for a second and talk about how replication events must be handled by a consumer.  The consumer must make sure that whatever change is applied to the DIT (except for delete operations) carries the proper operational attributes.  More specifically, the following basic operational attributes need the proper values:

createTimestamp
creatorsName
modifyTimestamp
modifiersName

So the replication event should contain who made the change and the time at which the operation actually occurred on the supplier, rather than, for example, the current time on the consumer.  Hence replication event processing must perform operations against the DIT with the identity of the client that made the change on the supplier.  So unlike a regular operation, an operation that applies replication deltas must use different values for these attributes.  In a way this kind of operation is not a direct operation against the consumer, but an indirect one.
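To make that concrete, here is a rough sketch of what I mean.  None of these names exist in the server: OperationalAttributeStamper, ChangeOrigin, stampAdd and stampModify are made up for illustration, and an entry is reduced to a plain map.  The only point is that the stamping code never reads the clock or the session itself; the caller hands it the who and the when.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not an existing server class.
final class OperationalAttributeStamper {

    // LDAP generalized time in UTC, e.g. 20090117191500Z
    private static final DateTimeFormatter GENERALIZED_TIME =
        DateTimeFormatter.ofPattern("yyyyMMddHHmmss'Z'").withZone(ZoneOffset.UTC);

    /** Who made the change and when it happened (assumed shape). */
    static final class ChangeOrigin {
        final String principalDn;
        final Instant occurredAt;

        ChangeOrigin(String principalDn, Instant occurredAt) {
            this.principalDn = principalDn;
            this.occurredAt = occurredAt;
        }
    }

    /** Returns a copy of the entry stamped with creation attributes. */
    static Map<String, String> stampAdd(Map<String, String> entry, ChangeOrigin origin) {
        Map<String, String> stamped = new HashMap<>(entry);
        stamped.put("creatorsName", origin.principalDn);
        stamped.put("createTimestamp", GENERALIZED_TIME.format(origin.occurredAt));
        return stamped;
    }

    /** Returns a copy of the entry stamped with modification attributes. */
    static Map<String, String> stampModify(Map<String, String> entry, ChangeOrigin origin) {
        Map<String, String> stamped = new HashMap<>(entry);
        stamped.put("modifiersName", origin.principalDn);
        stamped.put("modifyTimestamp", GENERALIZED_TIME.format(origin.occurredAt));
        return stamped;
    }
}
```

A direct client operation would pass a ChangeOrigin built from the bound principal and Instant.now(), while the replication handler on the consumer would build it from the values carried in the replication event.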

Direct operations by clients may raise triggers, which may perform additional operations against the DIT.  These triggered operations can themselves raise triggers that cause more changes.  A cascade may result, although it should be constrained through various means.  The server is designed to track the fact that a triggered change is occurring because of another change.  This is tracked through a linked list where at the head you'll find the operation that started it all.  All the triggered operations are treated as indirect operations caused by the operation at the head.
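Roughly, the chain looks like the sketch below.  This is a simplified illustration and not the server's actual classes: CausedOperation, triggered(), head() and the MAX_CASCADE_DEPTH limit are names and values I am assuming here, and the depth limit is just one possible way to constrain a cascade.

```java
// Hypothetical model of an operation that knows which operation caused it.
final class CausedOperation {

    // Arbitrary illustrative limit on how deep a trigger cascade may go.
    static final int MAX_CASCADE_DEPTH = 8;

    final String description;
    final CausedOperation cause; // null for a direct operation

    private CausedOperation(String description, CausedOperation cause) {
        this.description = description;
        this.cause = cause;
    }

    /** A direct operation issued by a client; it becomes the head of the list. */
    static CausedOperation direct(String description) {
        return new CausedOperation(description, null);
    }

    /** Creates an operation fired by a trigger on this one. */
    CausedOperation triggered(String description) {
        if (depth() + 1 > MAX_CASCADE_DEPTH) {
            throw new IllegalStateException("trigger cascade too deep: " + description);
        }
        return new CausedOperation(description, this);
    }

    boolean isIndirect() {
        return cause != null;
    }

    /** Number of links between this operation and the head (0 for the head). */
    int depth() {
        int links = 0;
        for (CausedOperation op = this; op.cause != null; op = op.cause) {
            links++;
        }
        return links;
    }

    /** Walks back to the operation that started it all. */
    CausedOperation head() {
        CausedOperation op = this;
        while (op.cause != null) {
            op = op.cause;
        }
        return op;
    }
}
```

Every triggered operation, however deep, resolves to the same head, which is what lets the whole cascade be treated as one set of indirect changes.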

The point I want to make is that we already have some machinery here for tracking direct and indirect operations.  Presently triggers don't work, and the tracking mechanism lacks a way to put the same timestamp on all changed entries as if they happened at the same time; it should have this.  The server must treat replication operations at the consumer in a similar fashion and apply timestamps properly.  It can also do the same for changes due to triggers, whether or not the operation in question is replicated.
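One way this could work, purely as a sketch: only the head of the chain carries a timestamp, and every triggered operation resolves its timestamp by walking back to the head.  A replicated operation is then just a head whose timestamp comes from the supplier instead of the local clock.  TimestampedOperation and its methods are hypothetical names, and this is not how the current code behaves.

```java
import java.time.Instant;

// Hypothetical sketch: one timestamp per chain, owned by the head.
final class TimestampedOperation {

    final String description;
    final TimestampedOperation cause; // null for the head
    private final Instant startedAt;  // set only on the head

    private TimestampedOperation(String description, TimestampedOperation cause, Instant startedAt) {
        this.description = description;
        this.cause = cause;
        this.startedAt = startedAt;
    }

    /** Head created by a local client; the timestamp is taken once, here. */
    static TimestampedOperation direct(String description) {
        return new TimestampedOperation(description, null, Instant.now());
    }

    /** Head created on a consumer; the timestamp is the supplier's, not ours. */
    static TimestampedOperation replicated(String description, Instant supplierTime) {
        return new TimestampedOperation(description, null, supplierTime);
    }

    /** Operation fired by a trigger; it carries no timestamp of its own. */
    TimestampedOperation triggered(String description) {
        return new TimestampedOperation(description, this, null);
    }

    /** The timestamp to write on any entry changed by this operation. */
    Instant effectiveTimestamp() {
        TimestampedOperation op = this;
        while (op.cause != null) {
            op = op.cause;
        }
        return op.startedAt;
    }
}
```

With this shape, a trigger firing on the consumer as a result of a replicated change would stamp its entries with the supplier's time, the same as the change that caused it.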

This is the main worry with triggers, and if we can properly solve this problem in a simple and easy-to-maintain way then we're golden.

Alex