couchdb-user mailing list archives

From James Dingwall <>
Subject Re: Dynamic Filtered Replication
Date Tue, 28 Aug 2018 13:12:26 GMT
On 28/08/18 13:51, Andrea Brancatelli wrote:
> Hello everybody.
> I have a pretty academic question about CouchDB replication…
> I’m working on creating a setup to test my idea but I thought asking could save me
some headaches.
> This is the scenario.
> Let’s suppose I have two databases, Master and Slave, with a Filtered Replication that
uses some information stored in the _users database to determine whether to replicate each
document or not …
> Now we put 100 docs in the Master DB, so Master Update SEQ reaches, let’s say, 100.
> The filter has filtered the replicated documents and on the Slave DB we have, let’s
say 40 documents.
> The condition in the _users database changes and the new condition would match 60 documents
of the Master.
> Is there an “easy” way to refresh the synchronization between the two databases and
have the 20 missing docs synced?
> Maybe resetting the replication would automagically fix this (like the replication skipping
the docs that are already in the Slave db?)
> How would you handle such a situation? Especially when the amount of docs is not 100...
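
For reference, a setup like the one described might look something like the sketch below. All names (databases, filter, query parameter) are hypothetical, and note that a filter function only sees the document and the request, so a condition derived from the _users database would typically be fed in via query_params rather than read directly:

```python
# Sketch of the documents involved in a filtered replication.
# Database names, filter name and the owner condition are hypothetical.

# Design document on the source ("Master") database; the filter body is
# JavaScript, stored as a string, as CouchDB requires.
design_doc = {
    "_id": "_design/repl",
    "filters": {
        # replicate only documents whose owner matches the query parameter
        "by_owner": "function (doc, req) { return doc.owner === req.query.owner; }"
    },
}

# Document in the _replicator database driving the replication.
replication_doc = {
    "_id": "master-to-slave",
    "source": "http://localhost:5984/master",
    "target": "http://localhost:5984/slave",
    "filter": "repl/by_owner",
    "query_params": {"owner": "alice"},
    "continuous": True,
}
```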

My experience with 1.x is that this can be done by one of:

1. recreate the replication with a new _id
2. recreate the replication with the same _id but add some query
parameter to make it unique from the previous definition
3. recreate the replication with the same _id but first remove the _local
checkpoint document relating to the replication from both the master and
slave dbs.  (The document name is _local/<replication_id>, where the
replication_id can be found from the related task.)
4. if there is no unique content in the slave and you can afford to miss
it, just delete the slave database; it will be re-created if your
replication is allowed to create_target.
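
Options 1 and 2 both work by changing the replication JSON so that it no longer matches the previous definition. A rough sketch, with hypothetical names:

```python
# Options 1 and 2: any change to the replication JSON gives the job a
# different identity, so it restarts from sequence 0 instead of resuming
# from its old checkpoint. Names and URLs are hypothetical.
import copy

old = {
    "_id": "master-to-slave",
    "source": "http://localhost:5984/master",
    "target": "http://localhost:5984/slave",
    "filter": "repl/by_owner",
    "query_params": {"owner": "alice"},
}

# Option 1: same body, new _id.
opt1 = copy.deepcopy(old)
opt1["_id"] = "master-to-slave-v2"

# Option 2: same _id, but an extra query parameter (which the filter
# simply ignores) makes the JSON differ from the previous definition.
opt2 = copy.deepcopy(old)
opt2["query_params"]["generation"] = "2"
```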

By observation, the status is maintained in a _local document keyed by a
checksum of the replication JSON (perhaps the same calculation used to
generate the _rev), so unless you change the JSON defining the
replication it will resume from the last sequence recorded in the _local
document.
My usual approach is 3, although poking at the _local document probably
isn't supported. The sequence I use is:
  - delete the replication definition (by writing _deleted: true, not
with an HTTP DELETE, otherwise replication processes may not be
correctly terminated)
  - remove the _local documents from master and slave (they may not be
present, depending on the status of the existing replication)
  - re-create the replication with the same JSON content as before
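
Expressed as the HTTP requests involved, the sequence above looks roughly like this (server URL, database names and the checkpoint id are hypothetical placeholders, and the actual replication_id checksum must be read from the running task, not guessed):

```python
# Sketch of the reset sequence from the steps above, as the sequence of
# HTTP requests involved. Placeholders only; not a drop-in script.
import json

COUCH = "http://localhost:5984"


def reset_steps(repl_doc, repl_rev, checkpoint_id):
    """Return the (method, url, body) sequence for resetting a replication."""
    steps = []
    # 1. delete the replication definition by writing _deleted: true
    #    (not an HTTP DELETE), so the replicator stops the job cleanly
    steps.append(("PUT", f"{COUCH}/_replicator/{repl_doc['_id']}",
                  json.dumps({**repl_doc, "_rev": repl_rev, "_deleted": True})))
    # 2. remove the _local checkpoint from both source and target
    for db in ("master", "slave"):
        steps.append(("DELETE", f"{COUCH}/{db}/_local/{checkpoint_id}", None))
    # 3. re-create the replication with the same JSON content as before
    steps.append(("PUT", f"{COUCH}/_replicator/{repl_doc['_id']}",
                  json.dumps(repl_doc)))
    return steps
```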

The main issue arises when your condition for replication changes such
that documents already present in the slave would no longer be
replicated under the new criteria; in that case, 4 is the only solution.

> I hope my question is clear enough.
> Thanks a lot.
> -------
> Andrea Brancatelli

Zynstra is a private limited company registered in England and Wales (registered number 07864369).
Our registered office and Headquarters are at The Innovation Centre, Broad Quay, Bath, BA1
1UD. This email, its contents and any attachments are confidential. If you have received this
message in error please delete it from your system and advise the sender immediately.
