directory-dev mailing list archives

From Kiran Ayyagari <>
Subject Re: Replication heads up
Date Mon, 08 Aug 2011 09:35:31 GMT
On Mon, Aug 8, 2011 at 2:26 PM, Emmanuel Lecharny <> wrote:
> Hi guys,
> so we found the reason why the replication tests are failing randomly. Let
> me explain :
> - the consumer is connected to the provider until it gets disconnected. It
> can last for days or weeks.
> - the producer pushes modifications to the consumer directly if the consumer
> is connected
> - if the consumer is disconnected, the modifications are stored in a queue
> until the consumer reconnects, at which point the content of this queue is
> sent to it
> That being said, we have one corner case: the provider 'thinks' that the
> consumer is connected when it no longer is. The message is sent to the
> disconnected client, and we don't push it to the queue, so it is lost.
> A better idea is to push *all* the modifications to the queue, no matter
> what. A thread will then process this queue and send its contents to the
> client, unless the client isn't connected. In any case, we *don't* delete
> messages from the queue. Never.
> That raises a question: what do we do in the long term? The queue will grow
> and never shrink. In fact this is quite simple: we truncate the queue after
> a defined period of time (say once a day, or once a week). Every modification
> older than the interval is simply deleted from the queue.
> What if a consumer is not able to reconnect within this period of time ?
> Simple :
> - the consumer sends the lastEntryCSN it received, and if it's older than
> what's in the queue, then we do a full replication.
> It may seem costly, but it's unlikely that a consumer gets disconnected for
> a long period of time. All in all, it's as if we had just added a brand new
> consumer, with nothing in it.
> One option would be to ask the consumer to send a periodic message to the
> producer informing it that it's up to date. It could be a daily unbind/bind
> for instance. The unbind will kill the pending persistent search we
> established between the producer and consumer, to establish a new one. As we
> will send a new request, with the lastEntryCSN, we will be able to truncate
> the provider queue, so it won't grow forever.
This case is already handled (in my recent commit): when a consumer
reconnects, we remove all the entries from the log that are older than
the CSN value present in the cookie.
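
To make the behaviour concrete, here is a minimal sketch of pruning on
reconnect. It is not the code from the commit: the class and method names
are hypothetical, it assumes one log per consumer, and it models CSNs as
plain strings that sort in change order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical stand-in for a per-consumer replication log (not the ApacheDS class).
// Assumption: CSNs are strings whose natural ordering matches change order.
// Single-threaded sketch; a real log would need synchronization and persistence.
class ReplicaLogSketch {
    private final TreeMap<String, String> log = new TreeMap<>();

    // Every modification is logged, whether or not the consumer looks connected.
    void append(String csn, String modification) {
        log.put(csn, modification);
    }

    // True when the cookie CSN is missing or older than the oldest entry we
    // still hold, i.e. some modifications may already have been truncated.
    boolean needsFullReplication(String cookieCsn) {
        return cookieCsn == null
            || (!log.isEmpty() && cookieCsn.compareTo(log.firstKey()) < 0);
    }

    // On reconnect: drop entries strictly older than the cookie CSN, then
    // return the modifications the consumer has not seen yet.
    List<String> onReconnect(String cookieCsn) {
        log.headMap(cookieCsn, false).clear();
        return new ArrayList<>(log.tailMap(cookieCsn, false).values());
    }

    int size() {
        return log.size();
    }
}
```

The entry matching the cookie CSN is kept on purpose, so that a consumer
reconnecting twice with the same cookie is not mistaken for one that has
fallen behind the log.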

Restarting the consumer at periodic intervals is an interesting idea.
It solves many cases of 'how to prune/truncate the log', except for a
consumer that never reconnects, in which case we need a time-based
policy.
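
A time-based policy could look roughly like the sketch below. The names
and the retention value are assumptions for illustration, not existing
ApacheDS code; entries are assumed to be appended in time order.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical time-based truncation of a replication log.
// Assumption: entries are appended in chronological order.
class TimeBasedLogPruner {
    // Hypothetical log entry: a CSN plus the time it was recorded.
    static final class Entry {
        final String csn;
        final Instant recordedAt;

        Entry(String csn, Instant recordedAt) {
            this.csn = csn;
            this.recordedAt = recordedAt;
        }
    }

    private final Deque<Entry> log = new ArrayDeque<>();
    private final Duration retention;

    TimeBasedLogPruner(Duration retention) {
        this.retention = retention;
    }

    void append(String csn, Instant recordedAt) {
        log.addLast(new Entry(csn, recordedAt));
    }

    // Removes every entry recorded before (now - retention) and returns
    // how many entries were dropped.
    int truncate(Instant now) {
        Instant cutoff = now.minus(retention);
        int removed = 0;
        while (!log.isEmpty() && log.peekFirst().recordedAt.isBefore(cutoff)) {
            log.removeFirst();
            removed++;
        }
        return removed;
    }

    int size() {
        return log.size();
    }
}
```

A scheduled task would call truncate() once per interval; a consumer
whose cookie CSN predates the oldest remaining entry then falls back to
a full replication.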

> We will probably work around this idea with Kiran this week. I'm positive
> that it can work well by the end of this week, or even earlier.
> Stay tuned !
Thanks for putting these in ink, Emmanuel.
> --
> Regards,
> Cordialement,
> Emmanuel Lécharny

Kiran Ayyagari
