From: Paul Okstad
Subject: Re: The state of filtered replication
Date: Thu, 26 May 2016 07:04:19 -0700
To: user@couchdb.apache.org

I'll double-check my situation, since I have not thoroughly verified it.
This particular issue occurs across restarts of the server, where I make
no changes to the continuous replications in the _replicator DB, but it
may also be related to the issue of too many continuous replications
causing replications to stall out from lack of resources. It's possible
that I assumed they were starting over from seq 1 when in fact they were
never able to complete a full replication in the first place.

--
Paul Okstad

> On May 26, 2016, at 2:51 AM, Robert Newson wrote:
>
> There must be something else wrong. Filtered replications definitely
> make and resume from checkpoints, same as unfiltered.
>
> We mix the filter code and parameters into the replication checkpoint
> id to ensure we start from 0 for a potentially different filter.
> Perhaps you are changing those? Or maybe supplying since_seq as well
> (which overrides the checkpoint)?
>
> Sent from my iPhone
>
>> On 25 May 2016, at 16:39, Paul Okstad wrote:
>>
>> This isn't just a problem of filtered replication; it's a major issue
>> in the database-per-user strategy (at least in the v1.6.1 I'm using).
>> I'm also using a database-per-user design with thousands of users and
>> a single global database. If a small fraction of the users (hundreds)
>> has continuously ongoing replications from the user DB to the global
>> DB, it will cause extremely high CPU utilization. This is without any
>> JavaScript replication filter function at all.
>>
>> Another huge issue with filtered replications is that they lose their
>> place when replications are restarted. In other words, they don't keep
>> track of the sequence ID across server restarts or when the same
>> replication is stopped and started. So, for example, if I want to
>> perform a filtered replication of public documents from the global DB
>> to the public DB, and I have a ton of documents in the global DB, then
>> each time I restart the filtered replication it will begin from
>> sequence #1. I'm guessing this is because CouchDB does not know
>> whether the filter function has been modified between replications,
>> but this behavior is still very disappointing.
>>
>> --
>> Paul Okstad
>> http://pokstad.com
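For concreteness, here is a minimal sketch of the two pieces Robert
mentions, assuming CouchDB 1.6.x; the design document name ("app"), the
filter name ("public"), the database names, and the "visibility" field
are all illustrative, not taken from the thread. The filter source and
the query parameters are what get mixed into the checkpoint id.

    {
      "_id": "_design/app",
      "filters": {
        "public": "function(doc, req) { return doc.visibility === req.query.visibility; }"
      }
    }

A replication document in the _replicator database that uses this filter
could then look like:

    {
      "_id": "global_to_public",
      "source": "http://localhost:5984/global",
      "target": "http://localhost:5984/public",
      "continuous": true,
      "filter": "app/public",
      "query_params": { "visibility": "public" }
    }

If the filter source and query_params stay identical, restarting this
replication should resume from its stored checkpoint; changing either of
them gives the replication a new checkpoint id and it starts over from 0,
and supplying a "since_seq" field makes it start from that sequence
instead of the stored checkpoint, per Robert's explanation above.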
>>> On May 25, 2016, at 4:25 AM, Stefan Klein wrote:
>>>
>>> 2016-05-25 12:48 GMT+02:00 Stefan du Fresne:
>>>
>>>> So to be clear, this is effectively replacing replication (where the
>>>> client negotiates with the server for a collection of changes to
>>>> download) with a daemon that builds up a collection of documents that
>>>> each client should get (and also presumably delete), which clients
>>>> can then query for when they're able?
>>>
>>> Sorry, I didn't describe it well enough.
>>>
>>> On the server side we have one big database containing all documents
>>> and one DB for each user.
>>> The clients always replicate to and from their individual user DB,
>>> unfiltered. So the DB for a user is a 1:1 copy of their pouchdb/... on
>>> their client.
>>>
>>> Initially we set up a filtered replication for each user from the
>>> server's main database to the server-side copy of the user's database.
>>> With this we ran into performance problems, and sooner or later we
>>> probably would have run into issues with open file descriptors.
>>>
>>> So what we do instead is listen to the changes feed of the main
>>> database and distribute the documents to the server-side user DBs,
>>> which are then synced with the clients.
>>>
>>> Note: this is only for documents the users actually work with (as in,
>>> possibly modify); for queries on the data we query views on the main
>>> database.
>>>
>>> For the way back, we listen to _dbchanges, so we get an event for
>>> changes on the user DBs, fetch that change from the user's DB and
>>> determine what to do with it.
>>> We do not replicate users' changes back to the main database, but
>>> rather have an internal API to evaluate all kinds of constraints on
>>> user input.
>>> If you do not have to check user input, you could certainly listen to
>>> _dbchanges and "blindly" one-shot replicate from the changed DB to
>>> your main DB.
>>>
>>> --
>>> Stefan
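To make the fan-out approach Stefan describes concrete, here is a minimal
sketch, assuming CouchDB at http://localhost:5984, a main database named
"main", per-user databases named "userdb-<name>", and Node.js 18+ for its
global fetch; routeDoc() and the "owners" field it inspects are
placeholders for whatever routing rules your documents actually carry.
Note that this copies only the current winning revision of each document
(it is not a real replication, so revision history and conflicts are not
preserved), and deletions are skipped for brevity.

    const COUCH = 'http://localhost:5984';

    // Decide which user databases should receive a given document.
    // Purely illustrative -- real routing depends on your document schema.
    function routeDoc(doc) {
      return (doc.owners || []).map(function (name) { return 'userdb-' + name; });
    }

    async function fanOut(since) {
      since = since || 'now';
      for (;;) {
        // Long-poll the main database's _changes feed from the last seq seen.
        const res = await fetch(COUCH + '/main/_changes?feed=longpoll&include_docs=true&since='
          + encodeURIComponent(since));
        const body = await res.json();
        for (const change of body.results) {
          if (!change.doc || change.deleted) continue; // deletions not handled here
          for (const db of routeDoc(change.doc)) {
            const url = COUCH + '/' + db + '/' + encodeURIComponent(change.doc._id);
            // Carry over the target's current _rev (if the doc already exists
            // there) so the PUT does not fail with a 409 conflict.
            const existing = await fetch(url);
            const doc = Object.assign({}, change.doc);
            delete doc._rev;
            if (existing.ok) doc._rev = (await existing.json())._rev;
            await fetch(url, {
              method: 'PUT',
              headers: { 'Content-Type': 'application/json' },
              body: JSON.stringify(doc),
            });
          }
        }
        since = body.last_seq; // checkpoint for the next long poll
      }
    }

    fanOut().catch(console.error);

The reverse direction could follow the same loop shape against each user
DB's _changes feed (or a global updates feed such as /_db_updates), either
feeding an internal validation API as Stefan describes or triggering a
one-shot POST to /_replicate.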