From: Robert Newson
To: user@couchdb.apache.org
Subject: Re: The state of filtered replication
Date: Thu, 26 May 2016 09:51:09 +0000

There must be something else wrong. Filtered replications definitely make and resume from checkpoints, same as unfiltered.

We mix the filter code and parameters into the replication checkpoint id to ensure we start from 0 for a potentially different filtering. Perhaps you are changing those?
Or maybe supplying since_seq as well (which overrides the checkpoint)?

Sent from my iPhone

> On 25 May 2016, at 16:39, Paul Okstad wrote:
>
> This isn’t just a problem of filtered replication, it’s a major issue in the database-per-user strategy (at least in the v1.6.1 I’m using). I’m also using a database-per-user design with thousands of users and a single global database. If a small fraction of the users (hundreds) has continuously ongoing replications from the user DB to the global DB, it will cause extremely high CPU utilization. This is without any replication filtered javascript function.
>
> Another huge issue with filtered replications is that they lose their place when replications are restarted. In other words, they don’t keep track of sequence ID between restarts of the server or stopping and starting the same replication. So for example, if I want to perform filtered replication of public documents from the global DB to the public DB, and I have a ton of documents in global, then each time I restart the filtered replication it will begin from sequence #1. I’m guessing this is due to the fact that CouchDB does not know if the filter function has been modified between replications, but this behavior is still very disappointing.
>
> --
> Paul Okstad
> http://pokstad.com
>
>> On May 25, 2016, at 4:25 AM, Stefan Klein wrote:
>>
>> 2016-05-25 12:48 GMT+02:00 Stefan du Fresne:
>>
>>> So to be clear, this is effectively replacing replication -- where the
>>> client negotiates with the server for a collection of changes to download --
>>> with a daemon that builds up a collection of documents that each client
>>> should get (and also presumably delete), which clients can then query for
>>> when they’re able?
>>
>> Sorry, didn't describe well enough.
>>
>> On the server side we have one big database containing all documents and one db
>> for each user.
>> The clients always replicate to and from their individual userdb,
>> unfiltered. So the db for a user is a 1:1 copy of their pouchdb/... on
>> their client.
>>
>> Initially we set up a filtered replication for each user from the server's main
>> database to the server copy of the user's database.
>> With this we ran into performance problems and sooner or later we probably
>> would have run into issues with open file descriptors.
>>
>> So what we do instead is listen to the changes of the main database and
>> distribute the documents to the server's userdbs, which are then synced with
>> the clients.
>>
>> Note: this is only for documents the users actually work with (as in
>> possibly modify), for queries on the data we query views on the main
>> database.
>>
>> For the way back, we listen to the _dbchanges, so we get an event for
>> changes on the users' dbs, get that change from the user's db and determine
>> what to do with it.
>> We do not replicate users' changes back to the main database but rather have
>> an internal API to evaluate all kinds of constraints on users' input.
>> If you do not have to check users' input, you could certainly listen to
>> _dbchanges and "blindly" one-shot replicate from the changed DB to your
>> main DB.
>>
>> --
>> Stefan
>
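
To illustrate the checkpoint point at the top of this reply, here is a minimal Python sketch of the idea. It is not CouchDB's actual algorithm. The function names (checkpoint_id, start_seq), the JSON-plus-SHA-256 hashing, and the in-memory checkpoints dict are all assumptions made up for this example. It only shows why changing the filter code or its parameters yields a new id (so replication starts from 0), and why an explicit since_seq takes precedence over a stored checkpoint.

```python
import hashlib
import json


def checkpoint_id(source, target, filter_code=None, query_params=None):
    """Hypothetical id built from everything that defines a replication."""
    # Sorted keys keep the serialisation stable, so identical settings
    # always produce the same id.
    payload = json.dumps(
        {
            "source": source,
            "target": target,
            "filter_code": filter_code,
            "query_params": query_params or {},
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def start_seq(checkpoints, rep_id, since_seq=None):
    """Pick the sequence to resume from (assumed in-memory checkpoint store)."""
    # An explicit since_seq overrides whatever checkpoint is stored.
    if since_seq is not None:
        return since_seq
    # An unknown id (new replication, or changed filter/params) starts from 0.
    return checkpoints.get(rep_id, 0)


public_filter = "function(doc, req) { return doc.public === true; }"
first_run = checkpoint_id("global", "public", public_filter, {"type": "post"})
checkpoints = {first_run: 1200}  # checkpoint recorded by an earlier run

same = checkpoint_id("global", "public", public_filter, {"type": "post"})
changed = checkpoint_id("global", "public", public_filter, {"type": "page"})

print(start_seq(checkpoints, same))                # 1200: resumes
print(start_seq(checkpoints, changed))             # 0: parameters changed
print(start_seq(checkpoints, same, since_seq=50))  # 50: since_seq wins
```

If a restarted filtered replication keeps beginning from the first sequence, the sketch suggests two things to check: whether the filter source or query parameters differ between runs, and whether since_seq is being passed.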