couchdb-user mailing list archives

From Robert Newson <rnew...@apache.org>
Subject Re: Replication and checkpoints - what to expect?
Date Fri, 13 Jul 2012 15:20:17 GMT
https://issues.apache.org/jira/browse/CouchDB

B.

On 13 Jul 2012, at 07:55, Mathias Leppich wrote:

> Oh, now I'm getting your point!
> 
> The issue is that a filtered replication doesn't periodically checkpoint if no changes
match the filter. This looks like a gap in the continuous replication "protocol". The filtered
_changes feed that is used by the continuous replication should emit "empty" changes (only
seq) when used with the heartbeat parameter.
> 
> Your idea of a "synchronization" document that passes all filters seems like a pretty
straightforward workaround.
> 
> - mathias
> 
> On Jul 13, 2012, at 2:33 , Andreas Kemkes wrote:
> 
>> Mathias:
>> 
>> I had planned to not allow new entries into the source to let the continuous replications
catch up, but I don't see how your approach changes the conundrum that "source_seq" - "checkpointed_source_seq"
will only be 0 for the exceptional case that the last entry into the source gets through the
filter.
>> 
>> Given full coverage, there will be at least one replication globally where this value
is indeed zero.  Maybe a document that doesn't get filtered by any of the replications is
the workaround (do-it-yourself boundary synchronization).
>> 
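A minimal sketch of how the "synchronization document" workaround above might be checked, assuming integer sequence numbers (as in CouchDB 1.x) and an already-parsed /_active_tasks response. The function name `replications_past_marker` and the `marker_seq` parameter are hypothetical; `marker_seq` is assumed to be the source sequence at which the marker document (one that passes every filter) was written, read while writes to the source are paused.

```python
def replications_past_marker(active_tasks, marker_seq):
    """Report, per replication, whether its checkpoint has reached the marker.

    active_tasks: parsed JSON list from /_active_tasks (assumed shape).
    marker_seq:   hypothetical source sequence of the marker document.
    """
    return {
        task["replication_id"]: task["checkpointed_source_seq"] >= marker_seq
        for task in active_tasks
        if task.get("type") == "replication"
    }


# Illustrative data only; the IDs and numbers are made up for the example.
sample_tasks = [
    {"type": "replication", "replication_id": "rep-a",
     "source_seq": 166254, "checkpointed_source_seq": 166254},
    {"type": "replication", "replication_id": "rep-b",
     "source_seq": 166254, "checkpointed_source_seq": 165850},
]
status = replications_past_marker(sample_tasks, marker_seq=166254)
print(status)
```

A replication that reports True has checkpointed at or beyond the marker, so every earlier change that passes its filter should already be on the target.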
>> I like your idea of graphing the "lag of changes".  That may come in handy in other
replication patterns.
>> 
>> Thanks,
>> 
>> Andreas
>> 
>> From: Mathias Leppich <mleppich@muhqu.de>
>> To: user@couchdb.apache.org; Andreas Kemkes <a5sk4s@yahoo.com> 
>> Sent: Thursday, July 12, 2012 12:13 AM
>> Subject: Re: Replication and checkpoints - what to expect?
>> 
>> Hi Andreas,
>> 
>> with continuous replications and an ever-changing dataset there is no point where
you can tell your replication is "up-to-date" in terms of "100% replicated", as replication
always happens after the data has been written to the source database (which is a good thing).
>> 
>> You need to change the way you measure the up-to-date-ness. Instead of measuring
the percentage of completion, you should measure the lag of changes, e.g. targetDB is
N changes behind sourceDB.
>> 
>> With CouchDB 1.2 you get this number with a single request to /_active_tasks by calculating
a replication's "source_seq" - "checkpointed_source_seq". Prior to 1.2 you can get this number
too, but it's more difficult because you have to know the replication's _local ID and check
the "source_last_seq" field in the replication's session document…
>> 
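A minimal sketch of the lag calculation described above, assuming the /_active_tasks response has already been fetched and parsed into a list of dicts, and that sequence numbers are integers (as in CouchDB 1.x). The function name `replication_lag` is hypothetical; the field names "source_seq" and "checkpointed_source_seq" come from the message, while keying by "replication_id" is an assumption about the task entries.

```python
def replication_lag(active_tasks):
    """Return {replication_id: source_seq - checkpointed_source_seq}.

    Non-replication tasks (indexers, compactions, ...) are skipped.
    """
    lag = {}
    for task in active_tasks:
        if task.get("type") != "replication":
            continue
        lag[task["replication_id"]] = (
            task["source_seq"] - task["checkpointed_source_seq"]
        )
    return lag


# Sample entries using the sequence numbers quoted from Futon in this thread;
# the IDs are made up for the example.
sample_tasks = [
    {"type": "replication", "replication_id": "rep-a",
     "source_seq": 166253, "checkpointed_source_seq": 165850},
    {"type": "indexer", "database": "some_db"},
]
lag = replication_lag(sample_tasks)
print(lag)  # {'rep-a': 403}
```

With the Futon numbers from this thread, the target is 403 changes behind the source. For a filtered replication that number only drops to 0 when the latest change on the source passes the filter.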
>> Once you have the "lag of changes" for your continuous replications, it's a good idea
to graph it with a monitoring tool to get the big picture of how replication performance
varies through the day.
>> 
>> - mathias
>> 
>> On Jul 12, 2012, at 1:31 , Andreas Kemkes wrote:
>> 
>>> I wanted to follow up on this thread as I'm still experiencing difficulties using
the feature and would like some advice on how best to deal with the situation.
>>> 
>>> The goal is to break up a monolithic database into multiple databases, which was achieved
after a lot of trial and error.  Now the quest is to keep them in sync for a while by using
filtered, continuous replications.  Yet the replication gets stuck on the last sequence number
that passes the filter.  In the Futon UI, I see:
>>> 
>>> Checkpointed source sequence 165850, current source sequence 166253, progress
99%
>>> 
>>> If I start a non-continuous replication with the exact same parameters, it returns:
>>> 
>>> 
>>> {
>>>   "ok": true,
>>>   "no_changes": true,
>>>   "session_id": ...,
>>>   "source_last_seq": 165850,
>>>   "replication_id_version": 2,
>>> ...
>>> }
>>> It apparently knows that there are no changes and it knows the current source
sequence.  Why could it not move the checkpointed source sequence forward to match the current
source sequence?  What am I missing?
>>> 
>>> Unless there is an exact match between checkpointed and current source sequence,
how would one ever know if a replication is up-to-date?
>>> 
>>> -- Andreas
>>> 
>>> 
>>> ________________________________
>>> From: Filipe David Manana <fdmanana@apache.org>
>>> To: user@couchdb.apache.org; Andreas Kemkes <a5sk4s@yahoo.com> 
>>> Sent: Thursday, June 21, 2012 12:40 PM
>>> Subject: Re: Replication and checkpoints - what to expect?
>>> 
>>>> The same should be true for filtered replications if there is no applicable
document between the current source sequence and the last checkpoint.  Otherwise you would
be always wondering if it has been replicated entirely.
>>> 
>>> That's harder. With filtered replication, we only know about sequence
>>> numbers of changes that pass the filter.
>> 
>> 
>> 
> 

